Adapting Large Language Models via Reading Comprehension

Subject: Computation and Language (cs.CL)
DOI: 10.48550/arxiv.2309.09530 Publication Date: 2023-01-01
ABSTRACT
We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows the model with domain knowledge, but drastically hurts its prompting ability for question answering. Taking inspiration from human learning via reading comprehension--practice after reading improves the ability to answer questions based on the learned knowledge--we propose a simple method for transforming raw corpora into reading comprehension texts. Each raw text is enriched with a series of tasks related to its content. Our method, highly scalable and applicable to any pre-training corpora, consistently enhances performance across various tasks in three different domains: biomedicine, finance, and law. Notably, our 7B language model achieves competitive performance with domain-specific models of much larger scales, such as BloombergGPT-50B. Furthermore, we demonstrate that domain-specific reading comprehension texts can improve the model's performance even on general benchmarks, showing the potential to develop a general model across even more domains. The model, code, and data are available at https://github.com/microsoft/LMOps.
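The core idea of the abstract--enriching each raw text with a series of comprehension tasks about its own content--can be sketched as a simple preprocessing step. The task templates below are hypothetical placeholders for illustration; the paper's actual task construction (mined from the corpora itself) is described in the full text, not here.

```python
# Minimal sketch: turn a raw domain text into a "reading comprehension"
# training example by appending tasks about its content.
# NOTE: the two task templates are illustrative assumptions, not the
# paper's actual task-mining rules.

def to_reading_comprehension(raw_text: str) -> str:
    tasks = [
        "Summarize the passage above in one sentence.",
        "Answer a question based on the passage above.",
    ]
    parts = [raw_text.strip()]
    for i, task in enumerate(tasks, start=1):
        parts.append(f"Task {i}: {task}")
    # The passage and its tasks become one continued-pre-training example.
    return "\n\n".join(parts)

example = to_reading_comprehension(
    "Aspirin irreversibly inhibits cyclooxygenase enzymes."
)
print(example)
```

Applied over a whole corpus, a mapping like this is what makes the approach "highly scalable and applicable to any pre-training corpora": it needs no labels, only the raw texts themselves.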