Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process
FOS: Computer and information sciences
Artificial Intelligence (cs.AI)
Computation and Language (cs.CL)
Machine Learning (cs.LG)
DOI: 10.48550/arxiv.2407.20311
Publication Date: 2024-07-29
AUTHORS (4): Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu
ABSTRACT
Recent advances in language models have demonstrated their capability to solve mathematical reasoning problems, achieving near-perfect accuracy on grade-school level math benchmarks like GSM8K. In this paper, we formally study how language models solve these problems. We design a series of controlled experiments to address several fundamental questions: (1) Can language models truly develop reasoning skills, or do they simply memorize templates? (2) What is the model's hidden (mental) reasoning process? (3) Do models solve math questions using skills similar to or different from humans? (4) Do models trained on GSM8K-like datasets develop reasoning skills beyond those necessary for solving GSM8K problems? (5) What mental process causes models to make reasoning mistakes? (6) How large or deep must a model be to effectively solve GSM8K-level math questions? Our study uncovers many hidden mechanisms by which language models solve mathematical questions, providing insights that extend beyond our current understanding of LLMs.
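To make the idea of controlled experiments on GSM8K-like data concrete, below is a minimal Python sketch of how one might generate synthetic grade-school math problems with a known dependency structure. The ENTITIES list, the make_problem function, and all generation rules here are illustrative assumptions, not the paper's actual data pipeline.

    import random

    # Illustrative generator (not the paper's pipeline): each quantity is
    # either assigned a random base value or defined as the sum of previously
    # defined quantities, so every problem has a known dependency graph and
    # a unique, mechanically computable answer.
    ENTITIES = ["apples", "pens", "books", "coins"]

    def make_problem(num_vars=4, seed=0):
        rng = random.Random(seed)
        names = rng.sample(ENTITIES, num_vars)
        values, lines = {}, []
        for i, name in enumerate(names):
            if i == 0 or rng.random() < 0.5:
                # Base quantity with an explicit numeric value.
                values[name] = rng.randint(2, 9)
                lines.append(f"Alice has {values[name]} {name}.")
            else:
                # Derived quantity: sum of up to two earlier quantities.
                deps = rng.sample(names[:i], k=min(2, i))
                values[name] = sum(values[d] for d in deps)
                lines.append("The number of " + name + " equals the number of "
                             + " plus the number of ".join(deps) + ".")
        question = f"How many {names[-1]} are there?"
        return " ".join(lines) + " " + question, values[names[-1]]

    problem, answer = make_problem(seed=42)
    print(problem)
    print("Answer:", answer)

Because each generated problem carries its full dependency graph and ground-truth answer, a setup of this kind allows one to test whether a model computes the intermediate quantities or merely pattern-matches templates, in the spirit of the questions the abstract raises.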