Bridging the Evaluation Gap: Leveraging Large Language Models for Topic Model Evaluation

FOS: Computer and information sciences. Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Digital Libraries (cs.DL)
DOI: 10.48550/arxiv.2502.07352 Publication Date: 2025-02-11
ABSTRACT
This study presents a framework for the automated evaluation of dynamically evolving topic taxonomies in scientific literature using Large Language Models (LLMs). In digital library systems, topic modeling plays a crucial role in efficiently organizing and retrieving scholarly content, guiding researchers through complex knowledge landscapes. As research domains proliferate and shift, traditional human-centric and static evaluation methods struggle to maintain relevance. The proposed approach harnesses LLMs to measure key quality dimensions, such as coherence, repetitiveness, diversity, and topic-document alignment, without heavy reliance on expert annotators or narrow statistical metrics. Tailored prompts guide the LLM assessments, ensuring consistent and interpretable evaluations across various datasets and modeling techniques. Experiments on benchmark corpora demonstrate the method's robustness, scalability, and adaptability, underscoring its value as a more holistic and dynamic alternative to conventional evaluation strategies.
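To make the prompt-based evaluation idea concrete, the sketch below shows how one quality dimension (topic coherence) could be scored with an LLM. This is an illustration only, not the authors' implementation: the prompt wording, the 1-5 rating scale, and the call_llm helper (a stand-in for whatever chat-completion API is used) are assumptions.

# Minimal sketch of LLM-based topic coherence scoring (illustrative only).
# call_llm is a hypothetical wrapper around any chat-completion API; the
# prompt template and 1-5 scale are assumptions, not the paper's exact setup.

from typing import List

COHERENCE_PROMPT = (
    "You will be shown the top words of a topic produced by a topic model.\n"
    "Rate how semantically coherent the word set is on a scale from 1 "
    "(incoherent) to 5 (highly coherent). Reply with a single integer.\n\n"
    "Topic words: {words}"
)

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM chat-completion call (plug in a real provider)."""
    raise NotImplementedError("Connect this to an LLM API of your choice.")

def score_topic_coherence(top_words: List[str]) -> int:
    """Ask the LLM to rate one topic's coherence and parse the integer reply."""
    prompt = COHERENCE_PROMPT.format(words=", ".join(top_words))
    reply = call_llm(prompt).strip()
    return int(reply)  # assumes the model follows the single-integer instruction

def score_topics(topics: List[List[str]]) -> List[int]:
    """Score every topic; averaging these gives a corpus-level coherence signal."""
    return [score_topic_coherence(words) for words in topics]

if __name__ == "__main__":
    example_topics = [
        ["neural", "network", "training", "gradient", "loss"],
        ["apple", "bridge", "tuesday", "photon", "violin"],
    ]
    # print(score_topics(example_topics))  # requires a real call_llm implementation

The other dimensions described in the abstract (repetitiveness, diversity, topic-document alignment) would follow the same pattern with their own tailored prompts.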