ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations
FOS: Computer and information sciences
Computer Science - Computation and Language
Computation and Language (cs.CL)
DOI:
10.48550/arxiv.2304.14827
Publication Date:
2023-01-01
AUTHORS (7)
ABSTRACT
This paper aims to quantitatively evaluate the performance of ChatGPT, an interactive large language model, on inter-sentential relations such as temporal relations, causal relations, and discourse relations. Given ChatGPT's promising performance across various tasks, we carry out thorough evaluations on the whole test sets of 11 datasets, including temporal and causal relations, PDTB2.0-based discourse relations, and dialogue-based discourse relations. To ensure the reliability of our findings, we employ three tailored prompt templates for each task: the zero-shot prompt template, the zero-shot prompt engineering (PE) template, and the in-context learning (ICL) prompt template. With these we establish initial baseline scores for all popular sentence-pair relation classification tasks for the first time. Through our study, we discover that ChatGPT exhibits exceptional proficiency in detecting and reasoning about causal relations, although it may not possess the same level of expertise in identifying the temporal order between two events. While it is capable of identifying the majority of discourse relations with existing explicit discourse connectives, the implicit discourse relation remains a formidable challenge. Concurrently, ChatGPT demonstrates subpar performance in the dialogue discourse parsing task, which requires structural understanding of a dialogue before being aware of the discourse relation.
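The three prompt styles named in the abstract (zero-shot, zero-shot PE, and ICL) can be illustrated with a minimal sketch. The wording of the templates, the label set, the function names, and the example sentences below are all assumptions made for illustration; they are not the templates used in the paper.

```python
from typing import List, Tuple

# Hypothetical label set for a temporal-relation task (illustration only).
LABELS = ["BEFORE", "AFTER", "SIMULTANEOUS"]


def zero_shot_prompt(sent_a: str, sent_b: str, labels: List[str]) -> str:
    """Zero-shot: state the sentence pair and the candidate labels, nothing else."""
    return (
        f"Sentence 1: {sent_a}\n"
        f"Sentence 2: {sent_b}\n"
        f"Which relation holds between the two sentences? "
        f"Options: {', '.join(labels)}\n"
        "Answer:"
    )


def pe_prompt(sent_a: str, sent_b: str, labels: List[str], instruction: str) -> str:
    """Zero-shot PE: the same query, preceded by a hand-written task instruction."""
    return f"{instruction}\n\n{zero_shot_prompt(sent_a, sent_b, labels)}"


def icl_prompt(
    sent_a: str,
    sent_b: str,
    labels: List[str],
    demos: List[Tuple[str, str, str]],
) -> str:
    """ICL: labeled demonstrations (sent_a, sent_b, gold_label) come before the query."""
    blocks = [f"{zero_shot_prompt(a, b, labels)} {gold}" for a, b, gold in demos]
    blocks.append(zero_shot_prompt(sent_a, sent_b, labels))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Invented demonstration pair, not drawn from any dataset.
    demos = [("She boiled the water.", "She poured the tea.", "BEFORE")]
    print(icl_prompt("He locked the door.", "He left the house.", LABELS, demos))
```

Each function returns a plain string, so the same sentence pair can be sent to a model under all three conditions and the predicted labels compared.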