TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection

DOI: 10.48550/arxiv.2501.11960

Publication Date: 2025-01-21

AUTHORS (8)

Yang Cao

Sikun Yang

Chen Li

Haolong Xiang

Lianyong Qi

Bo Liu

Rongsheng Li

Ming Liu

ABSTRACT

Text anomaly detection is crucial for identifying spam, misinformation, and offensive language in natural language processing tasks. Despite the growing adoption of embedding-based methods, their effectiveness and generalizability across diverse application scenarios remain under-explored. To address this, we present TAD-Bench, a comprehensive benchmark designed to systematically evaluate embedding-based approaches for text anomaly detection. TAD-Bench integrates multiple datasets spanning different domains, combining state-of-the-art embeddings from large language models with a variety of anomaly detection algorithms. Through extensive experiments, we analyze the interplay between embeddings and detection methods, uncovering their strengths, weaknesses, and applicability to different tasks. These findings offer new perspectives on building more robust, efficient, and generalizable anomaly detection systems for real-world applications.
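The general workflow the abstract describes (embed each text, score the embeddings with an anomaly detector, then evaluate the resulting ranking) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from TAD-Bench itself: the hashed bag-of-words `embed` function is a toy stand-in for a large language model embedding, the k-nearest-neighbour distance score is just one common detector, AUROC is one possible metric, and the example sentences are invented.

```python
import hashlib
import math


def embed(text, dim=64):
    """Toy stand-in for an LLM embedding (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Map each token to a fixed bucket so the embedding is deterministic.
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def knn_scores(train, test, k=2):
    """Anomaly score: mean Euclidean distance to the k nearest training vectors."""
    scores = []
    for x in test:
        dists = sorted(math.dist(x, t) for t in train)
        scores.append(sum(dists[:k]) / k)
    return scores


def auroc(labels, scores):
    """Probability that a random anomaly outscores a random normal sample (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: the detector is fitted on normal texts only.
train_texts = [
    "the meeting is moved to monday morning",
    "please review the report before the meeting",
    "the report is due on monday",
    "thanks for sending the meeting notes",
]
test_texts = [
    "please send the report before monday",     # normal
    "the meeting notes are due in the morning", # normal
    "win free cash now click this link",        # anomaly (spam-like)
    "claim your free prize click now",          # anomaly (spam-like)
]
test_labels = [0, 0, 1, 1]

train_vecs = [embed(t) for t in train_texts]
test_vecs = [embed(t) for t in test_texts]
scores = knn_scores(train_vecs, test_vecs, k=2)
auc = auroc(test_labels, scores)
print(f"AUROC on the toy data: {auc:.3f}")
```

In a benchmark of this kind, `embed` and `knn_scores` would be swapped for each embedding model and each detection algorithm under study, so that every combination is scored on the same datasets with the same metric.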
