RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models

DOI: 10.48550/arxiv.2405.14486 Publication Date: 2024-05-23
ABSTRACT
Large Language Models (LLMs) have shown impressive capabilities but also a concerning tendency to hallucinate. This paper presents RefChecker, a framework that introduces claim-triplets to represent claims in LLM responses, aiming to detect fine-grained hallucinations. In RefChecker, an extractor generates claim-triplets from a response, which are then evaluated by a checker against a reference. We delineate three task settings: Zero, Noisy and Accurate Context, to reflect various real-world use cases. We curated a benchmark spanning various NLP tasks and annotated 11k claim-triplets from 2.1k responses by seven LLMs. RefChecker supports both proprietary and open-source models as the extractor and checker. Experiments demonstrate that claim-triplets enable superior hallucination detection, compared to coarser granularities such as sentence- and sub-sentence-level claims. RefChecker outperforms prior methods by 6.8 to 26.1 points on our benchmark, and its checking results are strongly aligned with human judgments. This work is open sourced at https://github.com/amazon-science/RefChecker
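As a rough illustration of the extract-then-check pipeline the abstract describes, below is a minimal Python sketch of a claim-triplet representation and the per-triplet verification loop. All names here (ClaimTriplet, Label, check_response, and the extract/check callables standing in for the extractor and checker LLMs) are hypothetical, not the API of the released library; consult the linked repository for the actual interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Label(str, Enum):
    """Possible verdicts for a single claim-triplet (illustrative)."""
    ENTAILMENT = "Entailment"        # supported by the reference
    NEUTRAL = "Neutral"              # cannot be verified from the reference
    CONTRADICTION = "Contradiction"  # conflicts with the reference


@dataclass
class ClaimTriplet:
    """A fine-grained claim as a (subject, predicate, object) triple."""
    subject: str
    predicate: str
    object: str


def check_response(
    response: str,
    reference: Optional[str],
    extract: Callable[[str], list[ClaimTriplet]],
    check: Callable[[ClaimTriplet, Optional[str]], Label],
) -> list[tuple[ClaimTriplet, Label]]:
    """Decompose a response into claim-triplets and verify each one.

    `extract` and `check` stand in for the extractor and checker models.
    In the Zero Context setting `reference` would first have to be
    retrieved; in the Noisy and Accurate Context settings it is supplied
    with the input (e.g. retrieved passages or a source document).
    """
    triplets = extract(response)
    return [(t, check(t, reference)) for t in triplets]
```

Decomposing a response into subject-predicate-object triples lets each atomic fact be judged independently against the reference, which is the property the abstract credits for finer-grained detection than sentence- or sub-sentence-level checking.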