Decoding AI Judgment: How LLMs Assess News Credibility and Bias
FOS: Computer and information sciences
Artificial Intelligence (cs.AI)
Computers and Society (cs.CY)
Computation and Language (cs.CL)
DOI:
10.48550/arxiv.2502.04426
Publication Date:
2025-02-06
AUTHORS (5)
ABSTRACT
Large Language Models (LLMs) are increasingly used to assess news credibility, yet little is known about how they make these judgments. While prior research has examined political bias in LLM outputs or their potential for automated fact-checking, their internal evaluation processes remain largely unexamined. Understanding how LLMs assess credibility provides insights into AI behavior and into how credibility judgments are structured and applied in large-scale language models. This study benchmarks the reliability classifications of state-of-the-art LLMs - Gemini 1.5 Flash (Google), GPT-4o mini (OpenAI), and LLaMA 3.1 (Meta) - against structured, expert-driven rating systems such as NewsGuard and Media Bias Fact Check. Beyond assessing classification performance, we analyze the linguistic markers that shape these decisions, identifying which words and concepts drive the models' evaluations. We uncover patterns in how credibility is associated with specific linguistic features by examining keyword frequency, contextual determinants, and rank distributions. Moving beyond static classification, we introduce a framework in which LLMs refine their assessments by retrieving external information, querying other models, and adapting their responses. This allows us to investigate whether their judgments reflect structured reasoning or rely primarily on learned associations.
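A minimal sketch of the kind of benchmarking the abstract describes: collecting reliability labels from an LLM for a set of news outlets and measuring agreement with an expert-driven rating system such as NewsGuard. This is not the authors' pipeline; the outlet names, the labels, and the binary reliable/unreliable scheme are illustrative assumptions, and the agreement metrics come from scikit-learn.

```python
# Hedged sketch (not the paper's code): compare LLM-assigned reliability
# labels with expert-driven ratings for the same outlets.
# All outlet names and labels below are hypothetical placeholders; in
# practice the LLM labels would come from prompting a model such as
# Gemini 1.5 Flash, GPT-4o mini, or LLaMA 3.1 with a fixed instruction
# like "Classify this outlet as reliable or unreliable."
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Expert ratings, binarized into reliable/unreliable (assumed scheme).
expert_labels = {
    "outlet_a": "reliable",
    "outlet_b": "unreliable",
    "outlet_c": "reliable",
}

# Labels an LLM might return for the same outlets (placeholder values).
llm_labels = {
    "outlet_a": "reliable",
    "outlet_b": "reliable",
    "outlet_c": "reliable",
}

outlets = sorted(expert_labels)
y_true = [expert_labels[o] for o in outlets]
y_pred = [llm_labels[o] for o in outlets]

print("accuracy:", accuracy_score(y_true, y_pred))
print("Cohen's kappa:", cohen_kappa_score(y_true, y_pred))
```

Cohen's kappa corrects for chance agreement, which matters when one class dominates the expert ratings; accuracy alone can look high even if the model labels nearly every outlet the same way.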