Measuring the Inconsistency of Large Language Models in Preferential Ranking

FOS: Computer and information sciences — Computation and Language (cs.CL)
DOI: 10.18653/v1/2024.knowllm-1.14 Publication Date: 2024-09-20T19:33:08Z
ABSTRACT
Despite large language models' (LLMs) recent advancements, their bias and hallucination issues persist, and their ability to offer consistent preferential rankings remains underexplored. This study investigates the capacity of LLMs to provide consistent ordinal preferences, a crucial aspect in scenarios with dense decision spaces or lacking absolute answers. We introduce a formalization of consistency based on order theory, outlining criteria such as transitivity, asymmetry, reversibility, and independence from irrelevant alternatives. Our diagnostic experiments on selected state-of-the-art LLMs reveal their inability to meet these criteria, indicating a strong positional bias and poor transitivity, with preferences easily swayed by irrelevant alternatives. These findings highlight significant inconsistency in LLM-generated ordinal rankings, underscoring the need for further research to address these limitations.
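The order-theoretic criteria named in the abstract (transitivity and asymmetry in particular) can be checked mechanically once a model's pairwise preferences are collected. The sketch below is illustrative only and is not the paper's evaluation code: the item names and the cyclic preference set are hypothetical, standing in for judgments elicited from a model.

```python
from itertools import permutations

def check_asymmetry(prefs):
    """Asymmetry: if a is preferred to b, b must not also be preferred to a."""
    return all((b, a) not in prefs for a, b in prefs)

def check_transitivity(prefs, items):
    """Transitivity: a > b and b > c must imply a > c."""
    for a, b, c in permutations(items, 3):
        if (a, b) in prefs and (b, c) in prefs and (a, c) not in prefs:
            return False
    return True

# Hypothetical pairwise judgments over three items; the cycle A > B > C > A
# satisfies asymmetry but violates transitivity.
items = ["A", "B", "C"]
prefs = {("A", "B"), ("B", "C"), ("C", "A")}

print(check_asymmetry(prefs))         # True
print(check_transitivity(prefs, items))  # False
```

A positional-bias probe would extend this by re-eliciting each pair in reversed presentation order and testing reversibility, i.e. whether the stated preference survives the swap.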