NFDI4DS | UHH-SEMS - Publication Details

Montse Cuadros

ORCID: 0000-0002-3620-1053

About

Contact & Profiles

A5073920793

Research Areas

Natural Language Processing Techniques
Topic Modeling
Sentiment Analysis and Opinion Mining
Semantic Web and Ontologies
Biomedical Text Mining and Ontologies
Advanced Text Analysis Techniques
Text Readability and Simplification
Data Quality and Management
Speech and Dialogue Systems
Hate Speech and Cyberbullying Detection
Internet Traffic Analysis and Secure E-voting
Spam and Phishing Detection
Text and Document Classification Technologies
Web Data Mining and Analysis
Linguistic Studies and Language Acquisition
Misinformation and Its Impacts
Spanish Linguistics and Language Studies
Linguistics and Terminology Studies
Cybercrime and Law Enforcement Studies
Wikis in Education and Collaboration
Digital Marketing and Social Media
Personal Information Management and User Behavior
Software Engineering Research
Government, Law, and Information Management
Machine Learning in Healthcare

Vicomtech
2014-2025

Tencent (China)
2021

University of the Basque Country
2006-2020

Universitat Politècnica de Catalunya
2007-2009

Hate Speech Dataset from a White Supremacy Forum

OPENALEX - Publications

Ona De Gibert, Naiara Pérez, Aitor García Pablos, Montse Cuadros

Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic. Due to the massive rise of user-generated web content on social media, the amount of hate speech is also steadily increasing. Over the past years, interest in online hate speech detection and, particularly, the automation of this task has continuously grown, along with the societal impact of the phenomenon. This paper...

10.18653/v1/w18-5102 article EN cc-by 2018-01-01

W2VLDA: Almost unsupervised system for Aspect Based Sentiment Analysis

OPENALEX - Publications

Aitor García Pablos, Montse Cuadros, Germán Rigau

10.1016/j.eswa.2017.08.049 article EN Expert Systems with Applications 2017-09-01

Sentiment Analysis on Social Media

OPENALEX - Publications

Federico Neri, Carlo Aliprandi, Federico Capeci, Montse Cuadros, Taylor By

The Web is a huge virtual space where to express and share individual opinions, influencing any aspect of life, with implications for marketing and communication alike. Social Media are influencing consumers' preferences by shaping their attitudes and behaviors. Monitoring the Social Media activities is a good way to measure customers' loyalty, keeping track on their sentiment towards brands or products. Social Media are the next logical marketing arena. Currently, Facebook dominates the digital marketing space, followed closely by Twitter. This paper describes a Sentiment Analysis...

10.1109/asonam.2012.164 article EN 2012-08-01

Automatic analysis of textual hotel reviews

OPENALEX - Publications

Aitor García Pablos, Montse Cuadros, María Teresa Linaza

10.1007/s40558-015-0047-7 article EN Information Technology & Tourism 2015-12-22

Next Generation XR Systems-Large Language Models Meet Augmented and Virtual Reality

OPENALEX - Publications

Muhammad Zeshan Afzal, Sk Aziz Ali, Didier Stricker, Peter Eisert, Anna Hilsmann, and 10 more

Extended Reality (XR) is evolving rapidly, offering new paradigms for human-computer interaction. This position paper argues that integrating Large Language Models (LLMs) with XR systems represents a fundamental shift toward more intelligent, context-aware, and adaptive mixed-reality experiences. We propose a structured framework built on three key pillars: (1) Perception and Situational Awareness, (2) Knowledge Modeling and Reasoning, and (3) Visualization and Interaction. We believe leveraging LLMs within...

10.1109/mcg.2025.3548554 article EN cc-by IEEE Computer Graphics and Applications 2025-01-01

KnowNet

OPENALEX - Publications

Montse Cuadros, Germán Rigau

This paper presents a new fully automatic method for building highly dense and accurate knowledge bases from existing semantic resources. Basically, the method uses a wide-coverage knowledge-based Word Sense Disambiguation algorithm to assign the most appropriate senses to large sets of topically related words acquired from the web. KnowNet, the resulting knowledge base, which connects large sets of semantically related concepts, is a major step towards the autonomous acquisition of knowledge from raw corpora. In fact, KnowNet is several times larger than any...

10.3115/1599081.1599102 article EN 2008-01-01

V3: Unsupervised Aspect Based Sentiment Analysis for SemEval2015 Task 12

OPENALEX - Publications

Aitor García Pablos, Montse Cuadros, Germán Rigau

This paper presents our participation in SemEval-2015 task 12 (Aspect Based Sentiment Analysis). We participated employing only unsupervised or weakly-supervised approaches. Our attempt is based on requiring the minimum annotated or hand-crafted content, and avoids training a model using the provided training set. We use continuous word representations (Word2Vec) to leverage in-domain semantic similarities of words for many of the involved subtasks.

10.18653/v1/s15-2121 article EN cc-by Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) 2015-01-01

Quality assessment of large scale knowledge resources

OPENALEX - Publications

Montse Cuadros, Germán Rigau

This paper presents an empirical evaluation of the quality of publicly available large-scale knowledge resources. The study includes a wide range of manually and automatically derived large-scale knowledge resources. In order to establish a fair and neutral comparison, the quality of each knowledge resource is indirectly evaluated using the same method on a Word Sense Disambiguation task. The evaluation framework selected has been the Senseval-3 English Lexical Sample Task. The study empirically demonstrates that automatically acquired knowledge resources surpass, both in terms of precision and recall, the knowledge resources derived manually, and that the combination...

10.3115/1610075.1610149 article EN 2006-01-01

NUBES: A Corpus of Negation and Uncertainty in Spanish Clinical Texts

OPENALEX - Publications

Salvador Lima, Naiara Pérez, Montse Cuadros, Germán Rigau

This paper introduces the first version of the NUBes corpus (Negation and Uncertainty annotations in Biomedical texts in Spanish). The corpus is part of an on-going research and currently consists of 29,682 sentences obtained from anonymised health records annotated with negation and uncertainty. The article includes an exhaustive comparison with similar corpora in Spanish, and presents the main annotation and design decisions. Additionally, we perform preliminary experiments using deep learning algorithms to validate the annotated dataset. As far as we know,...

10.48550/arxiv.2004.01092 preprint EN cc-by-nc-sa arXiv (Cornell University) 2020-01-01

V3: Unsupervised Generation of Domain Aspect Terms for Aspect Based Sentiment Analysis

OPENALEX - Publications

Aitor García Pablos, Montse Cuadros, Germán Rigau

This paper presents V3, an unsupervised system for aspect-based Sentiment Analysis when evaluated on the SemEval 2014 Task 4. V3 focuses on generating a list of aspect terms for a new domain using a collection of raw texts from the domain. We also implement a very basic approach to classify the aspect terms into categories and assign polarities to them.

10.3115/v1/s14-2148 article EN Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) 2014-01-01

Sensitive Data Detection and Classification in Spanish Clinical Text: Experiments with BERT

OPENALEX - Publications

Aitor García Pablos, Naiara Pérez, Montse Cuadros

Massive digital data processing provides a wide range of opportunities and benefits, but at the cost of endangering personal data privacy. Anonymisation consists in removing or replacing sensitive information from data, enabling its exploitation for different purposes while preserving the privacy of individuals. Over the years, a lot of automatic anonymisation systems have been proposed; however, depending on the type of data, the target language or the availability of training documents, the task remains challenging still. The emergence of novel...

10.48550/arxiv.2003.03106 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Multilingual CALL Framework for Automatic Language Exercise Generation from Free Text

OPENALEX - Publications

Naiara Pérez, Montse Cuadros

This paper describes a web-based application to design and answer exercises for language learning. It is available in Basque, Spanish, English, and French. Based on open-source Natural Language Processing (NLP) technology such as word embedding models and word sense disambiguation, the application enables users to automatically create, easily and in real time, three types of exercises, namely, Fill-in-the-Gaps, Multiple Choice, and Shuffled Sentences questionnaires. These are generated from texts of the users' own choice, so they can train...

10.18653/v1/e17-3013 article EN cc-by 2017-01-01

Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English

OPENALEX - Publications

Naiara Pérez, Pablo Accuosto, Álex Bravo, Montse Cuadros, Eva Martínez-Garcia, and 2 more

Motivation: Biomedical literature is one of the most relevant sources of information for knowledge mining in the field of Bioinformatics. In spite of English being the most widely addressed language in the field, in recent years there has been a growing interest from the natural language processing community in dealing with languages other than English. However, the availability of language resources and tools for appropriate treatment of non-English texts is lagging behind. Our research is concerned with the semantic annotation of biomedical texts in the Spanish language, which can...

10.1093/bioinformatics/btz853 article EN Bioinformatics 2019-11-13

SemEval-2007 task 16

OPENALEX - Publications

Montse Cuadros, Germán Rigau

This task tries to establish the relative quality of available semantic resources (derived by manual or automatic means). The quality of each large-scale knowledge resource is indirectly evaluated on a Word Sense Disambiguation task. In particular, we use Senseval-3 and SemEval-2007 English Lexical Sample tasks as evaluation benchmarks to evaluate the relative quality of each resource. Furthermore, trying to be as neutral as possible with respect to the knowledge bases studied, we apply systematically the same disambiguation method to all the resources. A completely...

10.3115/1621474.1621489 article EN 2007-01-01
