Niccolò Campolungo

ORCID: 0000-0002-2389-8242
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Text Readability and Simplification
  • Semantic Web and Ontologies
  • Multimodal Machine Learning Applications
  • Wikis in Education and Collaboration
  • Information Retrieval and Search Behavior
  • Speech and dialogue systems

Sapienza University of Rome
2020-2023

Multilingual Named Entity Recognition (NER) is a key intermediate task which is needed in many areas of NLP. In this paper, we address the well-known issue of data scarcity in NER, especially relevant when moving to a multilingual scenario, and go beyond current approaches to the creation of multilingual silver data for the task. We exploit the texts of Wikipedia and introduce a new methodology based on the effective combination of knowledge-based approaches and neural models, together with a novel domain adaptation technique, to produce high-quality training corpora for NER....

10.18653/v1/2021.findings-emnlp.215 article EN cc-by 2021-01-01

The knowledge acquisition bottleneck strongly affects the creation of multilingual sense-annotated data, hence limiting the power of supervised systems when applied to Word Sense Disambiguation. In this paper, we propose a semi-supervised approach based upon a novel label propagation scheme, which, by jointly leveraging contextualized word embeddings and the information enclosed in a knowledge base, projects sense labels from a high-resource language, i.e., English, to lower-resourced ones. Backed by several experiments,...

10.24963/ijcai.2020/531 article EN 2020-07-01

Lexical ambiguity poses one of the greatest challenges in the field of Machine Translation. Over the last few decades, multiple efforts have been undertaken to investigate incorrect translations caused by the polysemous nature of words. Within this body of research, some studies posited that models pick up semantic biases existing in the training data, thus producing translation errors. In this paper, we present DiBiMT, the first entirely manually-curated evaluation benchmark which enables an extensive study of semantic biases in Machine Translation of nominal...

10.18653/v1/2022.acl-long.298 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

With the advent of contextualized embeddings, attention towards neural ranking approaches for Information Retrieval increased considerably. However, two aspects have remained largely neglected: i) queries usually consist of few keywords only, which increases ambiguity and makes their contextualization harder, and ii) performing neural ranking on non-English documents is still cumbersome due to the shortage of labeled datasets. In this paper we present SIR (Sense-enhanced Information Retrieval) to mitigate both problems by leveraging...

10.18653/v1/2021.emnlp-main.79 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01

Niccolò Campolungo, Tommaso Pasini, Denis Emelin, Roberto Navigli. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.355 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

Despite the remarkable progress made in the field of Machine Translation (MT), current systems still struggle when translating ambiguous words, especially when these express infrequent meanings. In order to investigate and analyze the impact of lexical ambiguity on automatic translations, several tasks and evaluation benchmarks have been proposed over the course of the last few years. However, works in this research direction suffer from critical shortcomings. Indeed, existing datasets are not entirely manually...

10.1162/coli_a_00541 article EN cc-by-nc-nd Computational Linguistics 2024-09-27

Over the last few years, Masked Language Modeling (MLM) pre-training has resulted in remarkable advancements in many Natural Language Understanding (NLU) tasks, which sparked an interest in researching alternatives and extensions to the MLM objective. In this paper, we tackle the absence of explicit semantic grounding and propose Descriptive Masked Language Modeling (DMLM), a knowledge-enhanced reading comprehension objective, where the model is required to predict the most likely word in a context, being provided with the word’s definition. For instance, given...

10.18653/v1/2023.findings-acl.808 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01