NFDI4DS | UHH-SEMS - Publication Details

Boosting Deductive Reasoning with Step Signals In RLHF

DOI: 10.48550/arxiv.2410.09528
Publication Date: 2024-10-12

AUTHORS (6)

Jialian Li

Yipin Zhang

Wei Shen

Y.J. Yan

Jian Xie

Dong Yan

ABSTRACT

Logical reasoning is a crucial task for Large Language Models (LLMs), enabling them to tackle complex problems. Among reasoning tasks, multi-step reasoning poses a particular challenge. Grounded in the theory of formal logic, we have developed an automated method, Multi-step Deduction (MuseD), for generating deductive reasoning data. MuseD has allowed us to create training and testing datasets for multi-step reasoning. Our generation method enables control over the complexity of the generated instructions, facilitating the training and evaluation of models across different difficulty levels. Through RLHF training, our training data has demonstrated significant improvements in logical capabilities for both in-domain and out-of-domain reasoning tasks. Additionally, we have conducted tests to assess the multi-step reasoning abilities of various models.
