Benchmarking LLMs for Political Science: A United Nations Perspective
DOI:
10.48550/arxiv.2502.14122
Publication Date:
2025-02-19
AUTHORS (9)
ABSTRACT
Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored. This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process, where the stakes are particularly high and decisions can have far-reaching consequences. We introduce a novel dataset comprising publicly available UN Security Council (UNSC) records from 1994 to 2024, including draft resolutions, voting records, and diplomatic speeches. Using this dataset, we propose the United Nations Benchmark (UNBench), the first comprehensive benchmark designed to evaluate LLMs across four interconnected political science tasks: co-penholder judgment, representative voting simulation, draft adoption prediction, and representative statement generation. These tasks span three stages of the UN decision-making process--drafting, voting, and discussing--and aim to assess LLMs' ability to understand and simulate political dynamics. Our experimental analysis demonstrates the challenges of applying LLMs in this domain, providing insights into their strengths and limitations in political science. This work contributes to the growing intersection of AI and political science, opening new avenues for research and practical applications in global governance. The UNBench repository can be accessed at: https://github.com/yueqingliang1/UNBench.
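
To make the task setup concrete, below is a minimal Python sketch of how the four UNBench tasks could be wired into a single evaluation loop. It is hypothetical: the Example fields, the predict() stub, the exact-match scoring, and the toy records are invented for illustration and do not reflect the repository's actual data format or API; see the UNBench repository linked above for the real evaluation code.

# Hypothetical sketch of an evaluation loop over the four UNBench tasks.
# The record fields, the predict() stub, and the toy examples are invented
# placeholders; they are not the paper's actual data schema or harness.
from dataclasses import dataclass

@dataclass
class Example:
    task: str       # one of the four UNBench tasks
    prompt: str     # text shown to the LLM (draft resolution, speech context, ...)
    reference: str  # gold label or reference text

TASKS = (
    "co-penholder judgment",                # drafting stage
    "representative voting simulation",     # voting stage
    "draft adoption prediction",            # voting stage
    "representative statement generation",  # discussing stage
)

def predict(example: Example) -> str:
    """Placeholder for an LLM call; a real harness would query a model here."""
    return "abstain"  # trivial stub so the loop runs end to end

def evaluate(examples: list[Example]) -> dict[str, float]:
    """Exact-match accuracy per task (the generation task would need text metrics)."""
    correct = {t: 0 for t in TASKS}
    total = {t: 0 for t in TASKS}
    for ex in examples:
        total[ex.task] += 1
        if predict(ex).strip().lower() == ex.reference.strip().lower():
            correct[ex.task] += 1
    return {t: correct[t] / total[t] for t in TASKS if total[t]}

if __name__ == "__main__":
    toy = [
        Example(TASKS[1], "How does France vote on this draft?", "in favour"),
        Example(TASKS[2], "Will this draft be adopted?", "adopted"),
    ]
    print(evaluate(toy))  # per-task accuracy on the toy records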