- Natural Language Processing Techniques
- Topic Modeling
- Semantic Web and Ontologies
- Text Readability and Simplification
- linguistics and terminology studies
- Lexicography and Language Studies
- Tracheal and airway disorders
- Second Language Acquisition and Learning
- Speech and dialogue systems
- Spanish Linguistics and Language Studies
- Sentiment Analysis and Opinion Mining
- Cancer Immunotherapy and Biomarkers
- Translation Studies and Practices
- Cancer Genomics and Diagnostics
- Galician and Iberian cultural studies
- Basque language and culture studies
- Linguistic Studies and Language Acquisition
- Language, Metaphor, and Cognition
- Language and cultural evolution
- Linguistic Variation and Morphology
- Sports and Physical Education Studies
- Authorship Attribution and Profiling
- Web Data Mining and Analysis
- Mathematics, Computing, and Information Processing
- Interpreting and Communication in Healthcare
Universidade de Santiago de Compostela
2012-2024
Center for Research in Molecular Medicine and Chronic Diseases
2015-2024
University of Alicante
2022
Universidad Rey Juan Carlos
2017-2021
Universidade Federal do Rio Grande do Sul
2021
University of Sheffield
2021
Universidade da Coruña
2016-2020
Secretaria da Educação do Estado da Bahia
2020
San Antonio College
2020
Universitat Politècnica de Catalunya
2006-2019
This article describes a strategy based on a naive-bayes classifier for detecting the polarity of English tweets. The experiments have shown that the best performance is achieved by using a binary classifier between just two sharp categories: positive and negative. In addition, in order to detect tweets with and without polarity, the system makes use of a very basic rule that searches for polarity words within the analysed tweets/texts. When the classifier is provided with a polarity lexicon and multiwords, it achieves 63% F-score.
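The two-step idea in this abstract (a lexicon rule to filter out tweets without polarity, then a binary positive/negative naive-bayes decision) can be sketched as follows. This is a minimal illustration, not the authors' system: the lexicon `POLARITY_WORDS`, the training set `TRAIN`, the set-based word features, and the Laplace smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy lexicon and training data, for illustration only.
POLARITY_WORDS = {"good", "great", "love", "bad", "awful", "hate"}
TRAIN = [
    ("i love this great phone", "positive"),
    ("what a good day", "positive"),
    ("i hate this awful traffic", "negative"),
    ("bad service again", "negative"),
]

def train_naive_bayes(examples):
    """Count documents per class and, per class, documents containing each word."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in set(text.lower().split()):  # word presence, not frequency
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, model):
    """Return 'neutral' if no lexicon word occurs, else the best-scoring class."""
    class_counts, word_counts, vocab = model
    words = set(text.lower().split())
    # Basic rule (assumed form): no polarity word found means no polarity.
    if not words & POLARITY_WORDS:
        return "neutral"
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total)  # log prior
        for word in words & vocab:
            # Laplace-smoothed estimate of P(word present | class)
            p = (word_counts[label][word] + 1) / (n_docs + 2)
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(classify("i love it", model))
print(classify("meeting at noon", model))
```

Keeping the neutral decision outside the classifier means the statistical model only has to separate two sharp categories, which is the configuration the abstract reports as performing best.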
Harish Tayyar Madabushi, Edward Gow-Smith, Marcos Garcia, Carolina Scarton, Marco Idiart, Aline Villavicencio. Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). 2022.
Marcos Garcia, Tiago Kramer Vieira, Carolina Scarton, Marco Idiart, Aline Villavicencio. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. 2021.
This paper presents an exploration of different statistical association measures to automatically identify collocations from corpora in English, Portuguese, and Spanish. To evaluate the impact of the metrics, we manually annotated corpora with collocations of three syntactic patterns (adjective-noun, verb-object, and nominal compounds). We took advantage of the PARSEME 1.1 Shared Task corpora by selecting a subset of 155k tokens in the three referred languages, in which 1,526 collocations were annotated with their corresponding Lexical Functions according to the Meaning-Text Theory. Using resulting...
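The abstract does not name the association measures it compares. As one common example of such a measure, pointwise mutual information (PMI) over dependency pairs can be sketched as below. The function name `pmi_scores` and the toy pair counts are hypothetical and only illustrate the calculation.

```python
import math
from collections import Counter

def pmi_scores(pairs):
    """Compute PMI (base 2) for every distinct (head, dependent) pair.

    PMI(h, d) = log2( P(h, d) / (P(h) * P(d)) ), with all probabilities
    estimated from relative frequencies in `pairs`.
    """
    pair_counts = Counter(pairs)
    head_counts = Counter(head for head, _ in pairs)
    dep_counts = Counter(dep for _, dep in pairs)
    n = len(pairs)
    return {
        (head, dep): math.log2((count * n) / (head_counts[head] * dep_counts[dep]))
        for (head, dep), count in pair_counts.items()
    }

# Hypothetical verb-object observations written as (noun, verb) pairs.
PAIRS = (
    [("decision", "make")] * 3 + [("decision", "take")]
    + [("photo", "take")] * 3 + [("photo", "make")]
)

for pair, score in sorted(pmi_scores(PAIRS).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Pairs that co-occur more often than their individual frequencies predict get a positive score, so ranking candidates by the measure puts likely collocations first.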
The Valle de Uco area, in the Province of Mendoza, comprises three departments: Tupungato, San Carlos, and Tunuyán. The aim of this article is to analyze the access to justice available to the sectors we call “vulnerable”, such as the elderly, young people, sexual minorities, people with scarce economic resources, informal workers, and women, among many others. In the last case, given the advances in Argentine legislation, we believe that the gender perspective entails, in itself, a...
This paper presents LinguaKit, a multilingual suite of tools for analysis, extraction, annotation and linguistic correction, as well as its integration into a Big Data infrastructure. LinguaKit allows the user to perform different tasks such as PoS-tagging, syntactic parsing, and coreference resolution (among others), including applications for relation extraction, sentiment analysis, summarization, extraction of multiword expressions, or entity linking to DBpedia. Most modules work in four languages: Portuguese, Spanish, English,...
This article presents LinguaKit, a multilingual suite of tools for linguistic analysis, extraction, annotation, and correction. LinguaKit makes it possible to perform tasks as diverse as lemmatization, morphosyntactic tagging, or syntactic parsing (among others), and also includes applications for sentiment analysis (or opinion mining), extraction of multiword terms, and conceptual annotation and linking to encyclopedic resources such as DBpedia. Most of the modules work in four linguistic varieties: Portuguese,...
Marcos Garcia, Tiago Kramer Vieira, Carolina Scarton, Marco Idiart, Aline Villavicencio. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
Idiomatic expressions are an integral part of human languages, often used to express complex ideas in compressed or conventional ways (e.g. eager beaver as a keen and enthusiastic person). However, their interpretations may not be straightforwardly linked to the meanings of their individual components in isolation, and this may have an impact for compositional approaches. In this paper, we investigate to what extent word representation models are able to go beyond compositional word combinations and capture multiword expression idiomaticity and some...
This paper presents a new strategy for multilingual collocation extraction which takes advantage of parallel corpora to learn bilingual word-embeddings. Monolingual collocation candidates are retrieved using Universal Dependencies, while the distributional models are then applied to search for equivalents of the elements of each collocation in the target languages. The proposed method extracts not only collocation equivalents with direct translation between languages, but also other cases where the collocations in the two languages are not literal translations of each other. Several...
This paper addresses the feasibility of cross-lingual parsing with Universal Dependencies (UD) between Romance languages, analyzing its performance when compared to the use of manually annotated resources of the target languages. Several experiments take into account factors such as the lexical distance between the source and target varieties, the impact of delexicalization, the combination of different source treebanks or the adaptation of resources to the target language, among others. The results of these evaluations show that the direct application of a parser from one...
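Delexicalization, one of the factors listed in this abstract, is commonly done by blanking the word-specific columns of a treebank so that a parser trains only on language-independent features such as UPOS tags. The abstract does not describe its exact procedure, so the following is a minimal sketch under that assumption. The function name `delexicalize` and the sample sentence are hypothetical.

```python
def delexicalize(conllu_text):
    """Replace FORM and LEMMA (columns 2 and 3) with '_' in CoNLL-U token lines.

    Comment lines (starting with '#') and blank sentence separators are
    returned unchanged.
    """
    out = []
    for line in conllu_text.splitlines():
        if line and not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) == 10:  # CoNLL-U token lines have ten tab-separated fields
                cols[1] = "_"    # FORM
                cols[2] = "_"    # LEMMA
                line = "\t".join(cols)
        out.append(line)
    return "\n".join(out)

# Hypothetical three-token sentence in CoNLL-U format.
SAMPLE = "\n".join([
    "# text = O gato dorme",
    "1\tO\to\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tgato\tgato\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tdorme\tdormir\tVERB\t_\t_\t0\troot\t_\t_",
])

print(delexicalize(SAMPLE))
```

A parser trained on the delexicalized source treebank can be applied to a target language that shares the UD tagset without needing any overlap in vocabulary.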