MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER
Named Entity Recognition
Entity linking
DOI: 10.18653/v1/2022.acl-long.160
Publication Date: 2022-06-03T01:34:53Z
AUTHORS (7)
ABSTRACT
Data augmentation is an effective solution to data scarcity in low-resource scenarios. However, when applied to token-level tasks such as NER, data augmentation methods often suffer from token-label misalignment, which leads to unsatisfactory performance. In this work, we propose Masked Entity Language Modeling (MELM) as a novel data augmentation framework for low-resource NER. To alleviate the token-label misalignment issue, we explicitly inject NER labels into the sentence context, and thus the fine-tuned MELM is able to predict masked entity tokens by explicitly conditioning on their labels. Thereby, MELM generates high-quality augmented data with novel entities, which provides rich entity regularity knowledge and boosts NER performance. When training data from multiple languages are available, we also integrate MELM with code-mixing for further improvement. We demonstrate the effectiveness of MELM on monolingual, cross-lingual and multilingual NER across various low-resource levels. Experimental results show that our MELM consistently outperforms the baseline methods.
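The label-injection and masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the marker format (`<B-PER>` ... `</B-PER>`), the `<mask>` token, the function name `linearize_and_mask` and the `mask_rate` parameter are all assumptions made for illustration. The sketch wraps each entity token in markers derived from its BIO tag and replaces the token itself with a mask, so that a masked language model fine-tuned on such sequences would see the label next to every position it has to fill.

```python
import random
from typing import List, Optional

MASK = "<mask>"  # assumed mask token; the real one depends on the tokenizer in use


def linearize_and_mask(
    tokens: List[str],
    tags: List[str],
    mask_rate: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Wrap entity tokens in label markers and mask them with probability mask_rate.

    Hypothetical helper: the marker format is an illustrative assumption.
    Tokens tagged "O" are copied through unchanged and are never masked.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    rng = rng or random.Random()
    out: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "O":
            out.append(token)
            continue
        # rng.random() is in [0, 1), so mask_rate=1.0 masks every entity token
        # and mask_rate=0.0 masks none.
        piece = MASK if rng.random() < mask_rate else token
        out.extend([f"<{tag}>", piece, f"</{tag}>"])
    return out


example_tokens = ["Alice", "flew", "to", "New", "York"]
example_tags = ["B-PER", "O", "O", "B-LOC", "I-LOC"]
masked = linearize_and_mask(example_tokens, example_tags, mask_rate=1.0)
print(" ".join(masked))
```

Because the labels stay in the sequence while only the entity tokens are hidden, any token predicted for a masked slot inherits the label of that slot, which is the property the abstract relies on to avoid token-label misalignment. Lowering `mask_rate` keeps some original entity tokens visible as extra context.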
CITATIONS (42)