LTACL: long-tail awareness contrastive learning for distantly supervised relation extraction

Distantly supervised learning Information extraction Electronic computers. Computer science 0202 electrical engineering, electronic engineering, information engineering Relation extraction Contrastive learning QA75.5-76.95 Information technology 02 engineering and technology T58.5-58.64
DOI: 10.1007/s40747-023-01226-w Publication Date: 2023-09-28T02:01:30Z
ABSTRACT
AbstractDistantly supervised relation extraction is an automatically annotating method for large corpora by classifying a bound of sentences with two same entities and the relation. Recent works exploit sound performance by adopting contrastive learning to efficiently obtain instance representations under the multi-instance learning framework. Though these methods weaken the impact of noisy labels, it ignores the long-tail distribution problem in distantly supervised sets and fails to capture the mutual information of different parts. We are thus motivated to tackle these issues and establishing a long-tail awareness contrastive learning method for efficiently utilizing the long-tail data. Our model treats major and tail parts differently by adopting hyper-augmentation strategies. Moreover, the model provides various views by constructing novel positive and negative pairs in contrastive learning for gaining a better representation between different parts. The experimental results on the NYT10 dataset demonstrate our model surpasses the existing SOTA by more than 2.61% AUC score on relation extraction. In manual evaluation datasets including NYT10m and Wiki20m, our method obtains competitive results by achieving 59.42% and 79.19% AUC scores on relation extraction, respectively. Extensive discussions further confirm the effectiveness of our approach.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (47)
CITATIONS (6)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....