Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction
DOI:
10.48550/arxiv.2205.05889
Publication Date:
2022-01-01
AUTHORS (9)
ABSTRACT
Entity matching (EM) is the most critical step for entity resolution (ER). While current deep learning-based methods achieve very impressive performance on standard EM benchmarks, their real-world application performance is frustrating. In this paper, we highlight that this gap between reality and ideality stems from the unreasonable benchmark construction process, which is inconsistent with the nature of entity matching and therefore leads to biased evaluations of current EM approaches. To this end, we build a new EM corpus and re-construct EM benchmarks to challenge critical assumptions implicit in the previous benchmark construction process by step-wisely changing the restricted entities, balanced labels, and single-modal records in previous benchmarks into open entities, imbalanced labels, and multimodal records in an open environment. Experimental results demonstrate that the assumptions made in the previous benchmark construction process are not consistent with the open environment; they conceal the main challenges of the task and therefore significantly overestimate the current progress of entity matching. The constructed benchmarks and code are publicly released.
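The effect of moving from balanced to imbalanced labels can be illustrated with a small arithmetic sketch. It is not taken from the paper: the function names and the true-positive and false-positive rates below are hypothetical, chosen only to show how a matcher whose per-pair behaviour stays fixed can look far weaker once true matches become rare among candidate pairs.

```python
def precision_at_prevalence(tpr: float, fpr: float, positive_rate: float) -> float:
    """Expected precision of a matcher with fixed true-positive rate (tpr)
    and false-positive rate (fpr) when `positive_rate` of the candidate
    pairs are true matches."""
    true_pos = tpr * positive_rate            # matches correctly predicted
    false_pos = fpr * (1.0 - positive_rate)   # non-matches wrongly predicted
    return true_pos / (true_pos + false_pos)


def f1_at_prevalence(tpr: float, fpr: float, positive_rate: float) -> float:
    """Expected F1 under the same assumptions; recall equals tpr."""
    precision = precision_at_prevalence(tpr, fpr, positive_rate)
    recall = tpr
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical matcher: finds 90% of matches, flags 5% of non-matches.
    TPR, FPR = 0.90, 0.05
    # Balanced benchmark vs. increasingly imbalanced candidate sets.
    for rate in (0.5, 0.1, 0.01):
        p = precision_at_prevalence(TPR, FPR, rate)
        f1 = f1_at_prevalence(TPR, FPR, rate)
        print(f"positive rate {rate:>5}: precision={p:.3f}  F1={f1:.3f}")
```

Under these assumed rates, precision and F1 fall steadily as the share of true matches shrinks, which is one way a balanced benchmark could overstate how well a method would do in an open environment.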