Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More
FOS: Computer and information sciences
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
DOI:
10.48550/arXiv.2502.07490
Publication Date:
2025-02-11
AUTHORS (7)
ABSTRACT
Large Language Models (LLMs) have been found to struggle with accurately retrieving key information from their context. To address this, we propose Mask-Enhanced Autoregressive Prediction (MEAP), a simple yet effective training paradigm that seamlessly integrates Masked Language Modeling (MLM) into Next-Token Prediction (NTP) to enhance the latter's in-context retrieval capabilities. Specifically, MEAP first randomly masks a small fraction of input tokens and then directly performs standard next-token prediction autoregressively with a decoder-only Transformer. This eliminates the need for the bidirectional attention or encoder-decoder architectures required by MLM, incurring no additional computational overhead during pre-training or inference. Extensive experiments demonstrate that MEAP substantially outperforms NTP on key-information retrieval and long-context reasoning tasks, while performing on par with or better than it on commonsense reasoning tasks. The benefits also extend to supervised fine-tuning, where MEAP shows remarkable advantages in lost-in-the-middle scenarios, outperforming NTP by 11.77 percentage points. Our analysis indicates that MEAP's effectiveness arises from its ability to promote more distinguishable attention scores by concentrating attention on a reduced set of non-masked tokens. This mechanism improves the model's focus on task-relevant signals while mitigating the influence of peripheral context. These findings position MEAP as a promising training paradigm for large language models.
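To make the described procedure concrete, below is a minimal sketch of a MEAP-style training step in PyTorch. It assumes a Hugging Face-style causal language model interface (`model(ids).logits`), a dedicated mask token id, and a 15% mask ratio; none of these specifics are given in the abstract, and targeting the original (uncorrupted) next tokens is one natural reading of "standard next-token prediction" on a masked input, not the paper's confirmed implementation.

import torch
import torch.nn.functional as F

def meap_step(model, input_ids, mask_token_id, mask_ratio=0.15):
    """One MEAP-style training step: corrupt a small fraction of the
    input with mask tokens, then run ordinary causal next-token
    prediction on the corrupted sequence.

    `mask_ratio=0.15` and `mask_token_id` are illustrative assumptions;
    the abstract does not specify the exact masking setup.
    """
    # Randomly choose a small fraction of positions to replace with [MASK].
    corrupted = input_ids.clone()
    mask = torch.rand(input_ids.shape, device=input_ids.device) < mask_ratio
    corrupted[mask] = mask_token_id

    # Standard decoder-only NTP: predict the original token at position
    # t+1 from the (partially masked) tokens up to position t.
    logits = model(corrupted).logits  # assumes HF-style causal LM output
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        input_ids[:, 1:].reshape(-1),
    )
    return loss

Note that the sketch changes only the input corruption, consistent with the abstract's claim: the model remains a standard decoder-only Transformer with causal attention, so no bidirectional attention, encoder-decoder architecture, or extra compute is introduced at pre-training or inference time.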