NFDI4DS | UHH-SEMS - Publication Details

Pitfalls of Conversational LLMs on News Debiasing

FOS: Computer and information sciences; Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

DOI: 10.48550/arxiv.2404.06488
Publication Date: 2024-01-01


AUTHORS (4)

Baris Schlicht, Ipek

Altiok, Defne

Taouk, Maryanne

Flek, Lucie

ABSTRACT

This paper addresses debiasing in news editing and evaluates the effectiveness of conversational Large Language Models in this task. We designed an evaluation checklist tailored to news editors' perspectives, obtained generated texts from three popular conversational models using a subset of a publicly available media bias dataset, and evaluated the texts according to the checklist. Furthermore, we examined the models as evaluators of the quality of debiased model outputs. Our findings indicate that none of the LLMs is perfect at debiasing. Notably, some models, including ChatGPT, introduced unnecessary changes that may impact the author's style and create misinformation. Lastly, we show that the models do not perform as proficiently as domain experts in evaluating the quality of debiased outputs.

The paper has been accepted at the DELITE workshop, which is co-located with COLING/LREC.


