NFDI4DS | UHH-SEMS - Publication Details

Alejandro Moreo

ORCID: 0000-0002-0377-1025

About

Contact & Profiles

A5086354805

Research Areas

Topic Modeling
Natural Language Processing Techniques
Text and Document Classification Technologies
Machine Learning and Data Classification
Authorship Attribution and Profiling
Sentiment Analysis and Opinion Mining
Advanced Text Analysis Techniques
Imbalanced Data Classification Techniques
Machine Learning and Algorithms
Neural Networks and Applications
Scientific Computing and Data Management
Computational and Text Analysis Methods
Domain Adaptation and Few-Shot Learning
Semantic Web and Ontologies
Artificial Intelligence in Healthcare
Web Data Mining and Analysis
Data Analysis with R
Anomaly Detection Techniques and Applications
Hate Speech and Cyberbullying Detection
Image Enhancement Techniques
Advanced Statistical Methods and Models
Image and Video Quality Assessment
Ethics and Social Impacts of AI
Names, Identity, and Discrimination Research
Biomedical Text Mining and Ontologies

Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo"
2015-2024

Consorzio Roma Ricerche
2020-2024

Consorzio Pisa Ricerche
2016-2024

National Research Council
2024

Universitas Gunung Rinjani
2023

Institute of Scientific and Technical Information of China
2020-2022

Universidad de Granada
2012-2017

Hamad bin Khalifa University
2016

Lexicon-based Comments-oriented News Sentiment Analyzer system

OPENALEX - Publications

Alejandro Moreo Margarita Sánchez Romero Juan Luis Castro J.M. Zurita

10.1016/j.eswa.2012.02.057 article EN Expert Systems with Applications 2012-02-22

Distributional Random Oversampling for Imbalanced Text Classification

Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

The accuracy of many classification algorithms is known to suffer when the data are imbalanced (i.e., when the distribution of the examples across the classes is severely skewed). Many applications of binary text classification are of this type, with the positive examples of the class of interest far outnumbered by the negative examples. Oversampling (i.e., generating synthetic training examples of the minority class) is an often used strategy to counter this problem. We present a new oversampling method specifically designed for classifying data (such as text) for which the distributional hypothesis holds,...

10.1145/2911451.2914722 article EN Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval 2016-07-07

Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach

Alessandro Fabris Andrea Esuli Alejandro Moreo Fabrizio Sebastiani

Algorithms and models are increasingly deployed to inform decisions about people, inevitably affecting their lives. As a consequence, those in charge of developing these models must carefully evaluate their impact on different groups of people and favour group fairness, that is, ensure that groups determined by sensitive demographic attributes, such as race or sex, are not treated unjustly. To achieve this goal, the availability (awareness) of these demographic attributes to those evaluating the impact of these models is fundamental. Unfortunately, collecting and storing these attributes is often in conflict...

10.1613/jair.1.14033 article EN cc-by Journal of Artificial Intelligence Research 2023-04-22

Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification

Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a "target" domain when the only available training data belongs to a different "source" domain. In this paper we present the Distributional Correspondence Indexing (DCI) method for domain adaptation in sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves...

10.1613/jair.4762 article EN cc-by Journal of Artificial Intelligence Research 2016-01-20

Word-class embeddings for multiclass text classification

Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

10.1007/s10618-020-00735-3 article EN Data Mining and Knowledge Discovery 2021-02-19

Tweet sentiment quantification: An experimental re-evaluation

Alejandro Moreo Fabrizio Sebastiani

Sentiment quantification is the task of training, by means of supervised learning, estimators of the relative frequency (also called “prevalence”) of sentiment-related classes (such as Positive, Neutral, Negative) in a sample of unlabelled texts. This task is especially important when these texts are tweets, since the final goal of most sentiment classification efforts carried out on Twitter data is actually quantification (and not the classification of individual tweets). It is well-known that solving quantification by means of “classify and count” (i.e., by classifying all unlabelled items by means of a standard...

10.1371/journal.pone.0263449 article EN cc-by PLoS ONE 2022-09-16

The Questio de aqua et terra: A Computational Authorship Verification Study

Martina Leocata Alejandro Moreo Fabrizio Sebastiani

The Questio de aqua et terra is a cosmological treatise traditionally attributed to Dante Alighieri. However, the authenticity of this text is controversial, due to discrepancies with Dante's established works and to the absence of contemporary references. This study investigates the authenticity of the Questio via computational authorship verification (AV), a class of techniques which combine supervised machine learning and stylometry. We build a family of AV systems and assemble a corpus of 330 13th- and 14th-century Latin texts, which we use to comparatively evaluate...

10.48550/arxiv.2501.05480 preprint EN arXiv (Cornell University) 2025-01-07

Misspellings in Natural Language Processing: A survey

Gianluca Sperduti Alejandro Moreo

This survey provides an overview of the challenges of misspellings in natural language processing (NLP). While often unintentional, misspellings have become ubiquitous in digital communication, especially with the proliferation of Web 2.0, user-generated content, and informal text mediums such as social media, blogs, and forums. Even if humans can generally interpret misspelled text, NLP models frequently struggle to handle it: this causes a decline in performance in common tasks like text classification and machine translation. In...

10.48550/arxiv.2501.16836 preprint EN arXiv (Cornell University) 2025-01-28

Kernel density estimation for multiclass quantification

Alejandro Moreo Pablo González Juan José del Coz

10.1007/s10994-024-06726-5 article Machine Learning 2025-02-19

Cross-Lingual Sentiment Quantification

Andrea Esuli Alejandro Moreo Fabrizio Sebastiani

Sentiment Quantification (i.e., the task of estimating the relative frequency of sentiment-related classes, such as Positive and Negative, in a set of unlabelled documents) is an important topic in sentiment analysis, as the study of sentiment-related quantities and trends across a population is often of higher interest than the analysis of individual instances. In this work we propose a method for Cross-Lingual Sentiment Quantification, the task of performing sentiment quantification when training documents are available for a source language...

10.1109/mis.2020.2979203 article EN IEEE Intelligent Systems 2020-05-01

Picture it in your mind: generating high level visual representations from textual descriptions

Fabio Carrara Andrea Esuli Tiziano Fagni Fabrizio Falchi Alejandro Moreo

10.1007/s10791-017-9318-6 article EN Information Retrieval 2017-10-14

Learning to Weight for Text Classification

Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document. In tasks characterized by the presence of training data (such as text classification) it seems logical that the term weighting function should take into account the distribution (as estimated from training data) of the term across the classes of interest. Although "supervised term weighting" approaches that use this intuition have been described before, they have failed to show consistent...

10.1109/tkde.2018.2883446 article EN IEEE Transactions on Knowledge and Data Engineering 2018-11-28

Funnelling

Andrea Esuli Alejandro Moreo Fabrizio Sebastiani

Cross-lingual Text Classification (CLC) consists of automatically classifying, according to a common set C of classes, documents each written in one of a set of languages L, and doing so more accurately than when “naïvely” classifying each document via its corresponding language-specific classifier. To obtain an increase in the classification accuracy for a given language, the system thus needs to also leverage the training examples written in the other languages. We tackle “multilabel” CLC via funnelling, a new ensemble learning method that we...

10.1145/3326065 article EN ACM Transactions on Information Systems 2019-05-31

Binary quantification and dataset shift: an experimental investigation

Pablo González Alejandro Moreo Fabrizio Sebastiani

Quantification is the supervised learning task that consists of training predictors of the class prevalence values of sets of unlabelled data, and is of special interest when the labelled data on which the predictor has been trained and the unlabelled data are not IID, i.e., suffer from dataset shift. To date, quantification methods have mostly been tested only on a special case of dataset shift, i.e., prior probability shift; the relationship between quantification and other types of dataset shift remains, by and large, unexplored. In this work we carry out an experimental analysis of how current quantification algorithms behave...

10.1007/s10618-024-01014-1 article EN cc-by Data Mining and Knowledge Discovery 2024-03-18

Efficient Evaluation of Image Quality via Deep-Learning Approximation of Perceptual Metrics

Alessandro Artusi Francesco Banterle Fabio Carrara Alejandro Moreo

Image metrics based on the Human Visual System (HVS) play a remarkable role in the evaluation of complex image processing algorithms. However, mimicking the HVS is known to be complex and computationally expensive (both in terms of time and memory), and its usage is thus limited to a few applications and to small input data. All of this makes such metrics not fully attractive in real-world scenarios. To address these issues, we propose the Deep Image Quality Metric (DIQM), a deep-learning approach to learn the global image quality feature (mean-opinion-score). DIQM can...

10.1109/tip.2019.2944079 article EN IEEE Transactions on Image Processing 2019-10-07

QuaPy: A Python-Based Framework for Quantification

Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

QuaPy is an open-source framework for performing quantification (a.k.a. supervised prevalence estimation), written in Python. Quantification is the task of training quantifiers via supervised learning, where a quantifier is a predictor that estimates the relative frequencies (a.k.a. prevalence values) of the classes of interest in a sample of unlabelled data. While quantification can be trivially performed by applying a standard classifier to each unlabelled data item and counting how many data items have been assigned to each class, it has been shown that this "classify and count" method is outperformed...

10.1145/3459637.3482015 article EN 2021-10-26
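
The "classify and count" baseline mentioned in the abstract above can be sketched in a few lines of plain Python. This is only an illustrative sketch and does not use QuaPy's API: the function `classify_and_count`, the keyword-based `toy_classifier`, and the sample texts are all hypothetical names and data introduced here for illustration.

```python
from collections import Counter


def classify_and_count(classifier, items, classes):
    """Estimate class prevalence by labelling every item and counting the labels.

    `classifier` is any callable mapping an item to one of `classes`
    (a hypothetical stand-in for a trained classifier).
    """
    items = list(items)
    if not items:
        raise ValueError("need at least one unlabelled item")
    counts = Counter(classifier(x) for x in items)
    # Relative frequency of each class among the predicted labels.
    return {c: counts.get(c, 0) / len(items) for c in classes}


def toy_classifier(text):
    # Hypothetical rule-based classifier, used only to make the sketch runnable.
    return "Positive" if "good" in text.lower() else "Negative"


sample = ["Good film", "bad plot", "really good", "meh"]
prevalence = classify_and_count(toy_classifier, sample, ["Positive", "Negative"])
print(prevalence)
```

The estimate is only as good as the underlying classifier: any systematic bias in its predictions carries over directly into the prevalence values, which is the weakness the abstracts above attribute to this baseline.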

A high-performance FAQ retrieval method using minimal differentiator expressions

Alejandro Moreo María G. Navarro Juan Luis Castro J.M. Zurita

10.1016/j.knosys.2012.05.015 article EN Knowledge-Based Systems 2012-06-05

Learning regular expressions to template-based FAQ retrieval systems

Alejandro Moreo Eduardo M. Eisman Juan Luis Castro J.M. Zurita

10.1016/j.knosys.2013.08.018 article EN Knowledge-Based Systems 2013-08-26

Lost in Transduction: Transductive Transfer Learning in Text Classification

Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

Obtaining high-quality labelled data for training a classifier in a new application domain is often costly. Transfer Learning (a.k.a. “Inductive Transfer”) tries to alleviate these costs by transferring, to the “target” domain of interest, knowledge available from a different “source” domain. In transfer learning the lack of labelled information from the target domain is compensated by the availability at training time of a set of unlabelled examples from the target distribution. Transductive Transfer Learning denotes the transfer learning setting in which the only set of target documents that we are interested in classifying is known and available at training time....

10.1145/3453146 article EN ACM Transactions on Knowledge Discovery from Data 2021-07-20

Kernel Density Estimation for Multiclass Quantification

Alejandro Moreo Pablo González Juan José del Coz

Several disciplines, like the social sciences, epidemiology, sentiment analysis, or market research, are interested in knowing the distribution of the classes in a population rather than the individual labels of the members thereof. Quantification is the supervised machine learning task concerned with obtaining accurate predictors of class prevalence, and to do so particularly in the presence of label shift. The distribution-matching (DM) approaches represent one of the most important families among the quantification methods that have been...

10.48550/arxiv.2401.00490 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Report on the 3rd International Workshop on Learning to Quantify (LQ 2023)

Mirko Bunse Pablo González Alejandro Moreo Fabrizio Sebastiani

The 3rd International Workshop on Learning to Quantify (LQ 2023) took place on September 18, 2023 in Torino, IT, where it was organised as a satellite event of the 34th European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023). Like the main program of the conference, the workshop employed a hybrid format, with all presentations given in presence and attendees participating in presence or online. This report presents a summary of the workshop, briefly summarising the individual works presented,...

10.1145/3655103.3655108 article EN ACM SIGKDD Explorations Newsletter 2024-03-26

A Recurrent Neural Network for Sentiment Quantification

Andrea Esuli Alejandro Moreo Fabrizio Sebastiani

Quantification is a supervised learning task that consists in predicting, given a set of classes C and a set D of unlabelled items, the prevalence (or relative frequency) p_c(D) of each class c ∈ C in D. Quantification can in principle be solved by classifying all the unlabelled items and counting how many of them have been attributed to each class. However, this "classify and count" approach has been shown to yield suboptimal quantification accuracy; this has established quantification as a task of its own, and given rise to a number of methods specifically devised for it. We propose a recurrent neural...

10.1145/3269206.3269287 preprint EN 2018-10-17

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining

Salud María Jiménez-Zafra Giacomo Berardi Andrea Esuli Diego Marcheggiani María Teresa Martín Valdivia Alejandro Moreo Fernández

Salud M. Jiménez Zafra, Giacomo Berardi, Andrea Esuli, Diego Marcheggiani, María Teresa Martín-Valdivia, Alejandro Moreo Fernández. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015.

10.18653/v1/d15-1302 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01