TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection
DOI: 10.48550/arxiv.2501.11960
Publication Date: 2025-01-21
AUTHORS (8)
ABSTRACT
Text anomaly detection is crucial for identifying spam, misinformation, and offensive language in natural language processing tasks. Despite the growing adoption of embedding-based methods, their effectiveness and generalizability across diverse application scenarios remain under-explored. To address this, we present TAD-Bench, a comprehensive benchmark designed to systematically evaluate embedding-based approaches for text anomaly detection. TAD-Bench integrates multiple datasets spanning different domains, combining state-of-the-art embeddings from large language models with a variety of anomaly detection algorithms. Through extensive experiments, we analyze the interplay between embeddings and detection methods, uncovering their strengths, weaknesses, and applicability to different tasks. These findings offer new perspectives on building more robust, efficient, and generalizable anomaly detection systems for real-world applications.
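The general shape of an embedding-based text anomaly detection pipeline, as described in the abstract, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from TAD-Bench: `embed` is a hypothetical hashed bag-of-words stand-in for a large language model embedding, `knn_score` is one simple distance-based detector chosen as an example, and the texts and labels are toy data.

```python
import hashlib
import math


def embed(text, dim=64):
    """Hypothetical stand-in for an LLM embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid division by zero
    return [v / norm for v in vec]


def knn_score(query, reference, k=2):
    """Anomaly score: mean Euclidean distance to the k nearest reference embeddings."""
    dists = sorted(math.dist(query, r) for r in reference)
    return sum(dists[:k]) / k


def auroc(scores, labels):
    """AUROC by pairwise comparison; labels use 1 for anomaly and 0 for normal."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): the detector only sees "normal" texts as reference.
train_texts = [
    "the meeting is at noon",
    "please review the report",
    "the report is due friday",
    "see you at the meeting",
]
test_texts = ["please review the meeting notes", "win free cash now click here"]
test_labels = [0, 1]

reference = [embed(t) for t in train_texts]
scores = [knn_score(embed(t), reference) for t in test_texts]
print("scores:", scores)
print("AUROC:", auroc(scores, test_labels))
```

Swapping `embed` for a real embedding model, or `knn_score` for another detector, while holding the evaluation fixed is the kind of embedding-versus-detector comparison the abstract describes.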