NFDI4DS | UHH-SEMS - Publication Details

Xavier Carreras

ORCID: 0000-0001-7432-4540

About

Contact & Profiles

A5020877978

Research Areas

Natural Language Processing Techniques
Topic Modeling
Machine Learning and Algorithms
Semantic Web and Ontologies
Text and Document Classification Technologies
Algorithms and Data Compression
Domain Adaptation and Few-Shot Learning
Text Readability and Simplification
Speech and dialogue systems
Blind Source Separation Techniques
Machine Learning and Data Classification
Sparse and Compressive Sensing Techniques
Neural Networks and Applications
Information Retrieval and Search Behavior
Control Systems and Identification
semigroups and automata theory
Face and Expression Recognition
Gaussian Processes and Bayesian Inference
Data Quality and Management
Advanced Graph Neural Networks
Multimodal Machine Learning Applications
Network Packet Processing and Optimization
Biomedical Text Mining and Ontologies
Spam and Phishing Detection
Digital Filter Design and Implementation

Universitat Politècnica de Catalunya
2004-2023

Bar-Ilan University
2021

University of Helsinki
2021

Tel Aviv University
2021

Technical University of Darmstadt
2021

University of Copenhagen
2021

Edinburgh Napier University
2021

Universitat Pompeu Fabra
2021

University of Amsterdam
2021

University of Antwerp
2021

Introduction to the CoNLL-2005 shared task

OPENALEX - Publications

Xavier Carreras, Lluís Màrquez

In this paper we describe the CoNLL-2005 shared task on Semantic Role Labeling. We introduce the specification and goals of the task, describe the data sets and evaluation methods, and present a general overview of the 19 systems that have contributed to the task, providing a comparative description and results.

10.3115/1706543.1706571 article EN 2005-01-01

Boosting Trees for Anti-Spam Email Filtering

Xavier Carreras, Lluís Màrquez

This paper describes a set of comparative experiments for the problem of automatically filtering unwanted electronic mail messages. Several variants of the AdaBoost algorithm with confidence-rated predictions [Schapire & Singer, 99] have been applied, which differ in the complexity of the base learners considered. Two main conclusions can be drawn from our experiments: a) The boosting-based methods clearly outperform the baseline learning algorithms (Naive Bayes and Induction of Decision Trees) on the PU1 corpus,...

10.48550/arxiv.cs/0109015 preprint EN other-oa arXiv (Cornell University) 2001-01-01

Semantic Role Labeling: An Introduction to the Special Issue

Lluís Màrquez, Xavier Carreras, Kenneth C. Litkowski, Suzanne Stevenson

Semantic role labeling, the computational identification and labeling of arguments in text, has become a leading task in computational linguistics today. Although the issues for this task have been studied for decades, the availability of large resources and the development of statistical machine learning methods have heightened the amount of effort in this field. This special issue presents selected and representative work in the field. This overview describes the linguistic background of the problem, the movement from linguistic theories to computational practice, the major resources that are being used, an overview of steps taken in computational systems,...

10.1162/coli.2008.34.2.145 article EN cc-by-nc-nd Computational Linguistics 2008-06-01

Named Entity Extraction using AdaBoost

Xavier Carreras, Lluís Màrquez, Lluís Padró

This paper presents a Named Entity Extraction (NEE) system for the CoNLL 2002 competition. The two main sub-tasks of the problem, recognition (NER) and classification (NEC), are performed sequentially and independently with separate modules. Both modules are machine learning based systems, which make use of binary AdaBoost classifiers.

10.3115/1118853.1118857 article EN 2002-01-01

Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks

Michael Collins, Amir Globerson, Terry Koo, Xavier Carreras, Peter L. Bartlett

Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, where EG updates are applied to the convex dual of either the log-linear or max-margin objective function; the dual in both cases corresponds to minimizing...

10.5555/1390681.1442791 article EN Journal of Machine Learning Research 2008-06-01

TAG, dynamic programming, and the perceptron for efficient, feature-rich parsing

Xavier Carreras, Michael J. Collins, Terry Koo

We describe a parsing approach that makes use of the perceptron algorithm, in conjunction with dynamic programming methods, to recover full constituent-based parse trees. The formalism allows a rich set of parse-tree features, including PCFG-based features, bigram and trigram dependency features, and surface features. A severe challenge in applying such an approach to full syntactic parsing is the efficiency of the parsing algorithms involved. We show that efficient training is feasible, using a Tree Adjoining Grammar (TAG) based parsing formalism. A lower-order dependency parsing model is used to restrict the search...

10.3115/1596324.1596327 article EN 2008-01-01

An efficient projection for l1,∞ regularization

Ariadna Quattoni, Xavier Carreras, Michael Collins, Trevor Darrell

In recent years the l1,∞ norm has been proposed for joint regularization. In essence, this type of regularization aims at extending the l1 framework for learning sparse models to a setting where the goal is to learn a set of jointly sparse models. In this paper we derive a simple and effective projected gradient method for optimization of l1,∞ regularized problems. The main challenge in developing such a method resides on being able to compute efficient projections to the l1,∞ ball. We present an algorithm that works in O(n log n) time and O(n) memory, where n is the number of parameters....

10.1145/1553374.1553484 article EN 2009-06-14

Spectral learning of weighted automata

Borja Balle, Xavier Carreras, Franco M. Luque, Ariadna Quattoni

10.1007/s10994-013-5416-x article EN Machine Learning 2013-10-02

Combination Strategies for Semantic Role Labeling

Mihai Surdeanu, Luis Alejandro Márquez–Martínez, Xavier Carreras, Pere R. Comas

This paper introduces and analyzes a battery of inference models for the problem of semantic role labeling: one based on constraint satisfaction, and several strategies that model the inference as a meta-learning problem using discriminative classifiers. These classifiers are developed with a rich set of novel features that encode proposition and sentence-level information. To our knowledge, this is the first work that: (a) performs a thorough analysis of learning-based inference models for semantic role labeling, and (b) compares several inference strategies in this context. We evaluate the proposed inference strategies in the framework...

10.1613/jair.2088 article EN cc-by Journal of Artificial Intelligence Research 2007-06-14

An empirical study of semi-supervised structured conditional models for dependency parsing

Jun Suzuki, Hideki Isozaki, Xavier Carreras, Michael Collins

This paper describes an empirical study of high-performance dependency parsers based on a semi-supervised learning approach. We describe an extension of semi-supervised structured conditional models (SS-SCMs) to the dependency parsing problem, whose framework is originally proposed in (Suzuki and Isozaki, 2008). Moreover, we introduce two extensions related to dependency parsing: the first extension is to combine SS-SCMs with another semi-supervised approach, described in (Koo et al., 2008). The second extension is to apply the approach to second-order parsing models, such as those described in (Carreras, 2007), using...

10.3115/1699571.1699585 article EN 2009-01-01

A simple named entity extractor using AdaBoost

Xavier Carreras, Lluís Màrquez, Lluís Padró

CONLL '03: Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, Volume 4, May 2003, pages 152–155. Published online 31 May 2003.

10.3115/1119176.1119197 article EN 2003-01-01

Exponentiated gradient algorithms for log-linear structured prediction

Amir Globerson, Terry Koo, Xavier Carreras, Michael Collins

Conditional log-linear models are a commonly used method for structured prediction. Efficient learning of parameters in these models is therefore an important problem. This paper describes an exponentiated gradient (EG) algorithm for training such models. EG is applied to the convex dual of the maximum likelihood objective; this results in both sequential and parallel update algorithms, where in the sequential algorithm parameters are updated in an online fashion. We provide a convergence proof for both algorithms. Our analysis also simplifies previous results on max-margin models,...

10.1145/1273496.1273535 article EN 2007-06-20

Joint Arc-factored Parsing of Syntactic and Semantic Dependencies

Xavier Lluís, Xavier Carreras, Lluís Màrquez

In this paper we introduce a joint arc-factored model for syntactic and semantic dependency parsing. The semantic role labeler predicts the full syntactic paths that connect predicates with their arguments. This process is framed as a linear assignment task, which allows us to control some well-formedness constraints. For the syntactic part, we define a standard arc-factored dependency model that predicts the full syntactic tree. Finally, we employ dual decomposition techniques to produce consistent syntactic and predicate-argument structures while searching over a large space of syntactic configurations. In experiments on...

10.1162/tacl_a_00222 article EN cc-by Transactions of the Association for Computational Linguistics 2013-12-01

Boosting trees for clause splitting

Xavier Carreras, Lluís Màrquez

We present a system for the CoNLL-2001 shared task: the clause splitting problem. Our approach consists in decomposing the clause splitting problem into a combination of binary "simple" decisions, which we solve with the AdaBoost learning algorithm. The whole problem is decomposed in two levels, with two chained decisions per level. The first level corresponds to parts 1 and 2 presented in the introductory document for the task. The second level corresponds to part 3, which we decompose in two decisions and a combination procedure.

10.3115/1117822.1117839 article EN 2001-01-01

Named entity recognition with document-specific KB tag gazetteers

Will Radford, Xavier Carreras, James Henderson

We consider a novel setting for Named Entity Recognition (NER) where we have access to document-specific knowledge base tags. These tags consist of a canonical name from a knowledge base (KB) and entity type, but are not aligned to the text. We explore how to use KB tags to create document-specific gazetteers at inference time to improve NER. We find that this kind of supervision helps recognise organisations more than standard wide-coverage gazetteers. Moreover, augmenting document-specific gazetteers with KB information lets users specify fewer tags for the same performance, reducing cost.

10.18653/v1/d15-1058 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01
