NFDI4DS | UHH-SEMS - Publication Details

Niccolò Campolungo

ORCID: 0000-0002-2389-8242

About

Contact & Profiles

OpenAlex ID: A5030725591

Research Areas

Topic Modeling
Natural Language Processing Techniques
Text Readability and Simplification
Semantic Web and Ontologies
Multimodal Machine Learning Applications
Wikis in Education and Collaboration
Information Retrieval and Search Behavior
Speech and dialogue systems

Sapienza University of Rome
2020-2023

WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER

OPENALEX - Publications

Simone Tedeschi, Valentino Maiorca, Niccolò Campolungo, Francesco Cecconi, Roberto Navigli

Multilingual Named Entity Recognition (NER) is a key intermediate task which is needed in many areas of NLP. In this paper, we address the well-known issue of data scarcity in NER, especially relevant when moving to a multilingual scenario, and go beyond current approaches to the creation of multilingual silver data for the task. We exploit the texts of Wikipedia and introduce a new methodology based on the effective combination of knowledge-based approaches and neural models, together with a novel domain adaptation technique, to produce high-quality training corpora for NER...

10.18653/v1/2021.findings-emnlp.215 article EN cc-by 2021-01-01

MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation

Edoardo Barba, Luigi Procopio, Niccolò Campolungo, Tommaso Pasini, Roberto Navigli

The knowledge acquisition bottleneck strongly affects the creation of multilingual sense-annotated data, hence limiting the power of supervised systems when applied to multilingual Word Sense Disambiguation. In this paper, we propose a semi-supervised approach based upon a novel label propagation scheme, which, by jointly leveraging contextualized word embeddings and the information enclosed in a knowledge base, projects sense labels from a high-resource language, i.e., English, to lower-resourced ones. Backed by several experiments,...

10.24963/ijcai.2020/531 article EN 2020-07-01

DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation

Niccolò Campolungo, Federico Martelli, Francesco Saina, Roberto Navigli

Lexical ambiguity poses one of the greatest challenges in the field of Machine Translation. Over the last few decades, multiple efforts have been undertaken to investigate incorrect translations caused by the polysemous nature of words. Within this body of research, some studies have posited that models pick up semantic biases existing in the training data, thus producing translation errors. In this paper, we present DiBiMT, the first entirely manually-curated evaluation benchmark which enables an extensive study of semantic biases in Machine Translation of nominal...

10.18653/v1/2022.acl-long.298 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

IR like a SIR: Sense-enhanced Information Retrieval for Multiple Languages

Rexhina Blloshmi, Tommaso Pasini, Niccolò Campolungo, Somnath Banerjee, Roberto Navigli, and 1 more

With the advent of contextualized embeddings, attention towards neural ranking approaches for Information Retrieval has increased considerably. However, two aspects have remained largely neglected: i) queries usually consist of a few keywords only, which increases ambiguity and makes their contextualization harder, and ii) performing neural ranking on non-English documents is still cumbersome due to the shortage of labeled datasets. In this paper we present SIR (Sense-enhanced Information Retrieval) to mitigate both problems by leveraging...

10.18653/v1/2021.emnlp-main.79 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01

Reducing Disambiguation Biases in NMT by Leveraging Explicit Word Sense Information

Niccolò Campolungo, Tommaso Pasini, Denis Emelin, Roberto Navigli

Niccolò Campolungo, Tommaso Pasini, Denis Emelin, Roberto Navigli. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.355 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

DiBiMT: A Gold Evaluation Benchmark for Studying Lexical Ambiguity in Machine Translation

Federico Martelli, Stefano Parrella, Niccolò Campolungo, Tina Munda, Svetla Koeva, and 2 more

Despite the remarkable progress made in the field of Machine Translation (MT), current systems still struggle when translating ambiguous words, especially when these express infrequent meanings. In order to investigate and analyze the impact of lexical ambiguity on automatic translations, several tasks and evaluation benchmarks have been proposed over the course of the last few years. However, works in this research direction suffer from critical shortcomings. Indeed, existing evaluation datasets are not entirely manually...

10.1162/coli_a_00541 article EN cc-by-nc-nd Computational Linguistics 2024-09-27

DMLM: Descriptive Masked Language Modeling

Edoardo Barba, Niccolò Campolungo, Roberto Navigli

Over the last few years, Masked Language Modeling (MLM) pre-training has resulted in remarkable advancements in many Natural Language Understanding (NLU) tasks, which sparked an interest in researching alternatives and extensions to the MLM objective. In this paper, we tackle the absence of explicit semantic grounding in MLM and propose Descriptive Masked Language Modeling (DMLM), a knowledge-enhanced reading comprehension objective, where the model is required to predict the most likely word in a context, being provided with the word’s definition. For instance, given...

10.18653/v1/2023.findings-acl.808 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01
