Simple and Effective Masked Diffusion Language Models
FOS: Computer and information sciences
Artificial Intelligence (cs.AI)
Computation and Language (cs.CL)
Machine Learning (cs.LG)
DOI:
10.48550/arXiv.2406.07524
Publication Date:
2024-06-11
AUTHORS (8)
ABSTRACT
NeurIPS 2024. We provide the code at https://github.com/kuleshov-group/mdlm

While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple masked discrete diffusion is more performant than previously thought. We apply an effective training recipe that improves the performance of masked diffusion models, and we derive a simplified, Rao-Blackwellized objective that yields additional improvements. Our objective has a simple form -- it is a mixture of classical masked language modeling losses -- and can be used to train encoder-only language models that admit efficient samplers, including ones that can generate text of arbitrary length semi-autoregressively, like a traditional language model. On language modeling benchmarks, a range of masked diffusion models trained with modern engineering practices achieves a new state of the art among diffusion models and approaches AR perplexity. We provide the code, along with a blog post and video tutorial, on the project page: https://s-sahoo.com/mdlm
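The objective described above -- a noise-schedule-weighted mixture of classical masked language modeling losses -- is concrete enough to sketch. Below is a minimal, hypothetical PyTorch rendering of that shape: a cross-entropy computed over masked positions only, scaled by a per-example schedule weight. The function and argument names are illustrative, and the weight (alpha_t' / (1 - alpha_t) in the paper's continuous-time formulation) is treated here as a given scalar rather than derived; consult the repository above for the authors' actual implementation.

import torch
import torch.nn.functional as F

def schedule_weighted_mlm_loss(logits, x, mask, weight):
    # logits: (B, L, V) predictions from an encoder-only model
    # x:      (B, L)    clean token ids
    # mask:   (B, L)    bool, True where the token was masked
    # weight: (B,)      per-example noise-schedule weight (assumed given)
    ce = F.cross_entropy(logits.transpose(1, 2), x, reduction="none")  # (B, L)
    masked_ce = (ce * mask).sum(dim=1)   # only masked positions contribute
    return (weight * masked_ce).mean()   # schedule-weighted mixture of MLM losses

# Toy usage with random tensors.
B, L, V = 2, 16, 100
logits = torch.randn(B, L, V)
x = torch.randint(V, (B, L))
mask = torch.rand(B, L) < 0.5
weight = torch.rand(B)  # stands in for the paper's alpha_t' / (1 - alpha_t) term
print(schedule_weighted_mlm_loss(logits, x, mask, weight))

Averaging a plain masked cross-entropy under schedule-dependent weights is what makes the loss a "mixture" of masked language modeling losses; any standard MLM training loop could adopt this shape by supplying the weight term.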