TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
DOI: 10.48550/arxiv.2504.02107
Publication Date: 2025-04-02
AUTHORS (11)
ABSTRACT
Large Language Models (LLMs) trained on historical web data inevitably become outdated. We investigate evaluation strategies and update methods for LLMs as new data becomes available. We introduce a web-scale dataset for time-continual pretraining of LLMs derived from 114 dumps of Common Crawl (CC) - orders of magnitude larger than previous continual language modeling benchmarks. We also design time-stratified evaluations across both general CC data and specific domains (Wikipedia, StackExchange, and code documentation) to assess how well various continual learning methods adapt to new data while retaining past knowledge. Our findings demonstrate that, on general CC data, autoregressive meta-schedules combined with fixed-ratio replay of older data can achieve held-out loss comparable to re-training from scratch, while requiring significantly less computation (2.6x). However, the optimal balance between incorporating new data and replaying old data differs across domains: replay is crucial to avoid forgetting on generic web data, but less so on specific domains.
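The fixed-ratio replay mentioned in the abstract can be illustrated with a minimal sketch: each training batch mixes examples from the newest CC dump with examples replayed from older dumps at a fixed proportion. The function name, the default `replay_ratio`, and the batch construction below are illustrative assumptions, not the paper's actual implementation.

```python
import random

def mixed_batch(new_examples, old_examples, replay_ratio=0.5,
                batch_size=8, seed=0):
    """Sketch of fixed-ratio replay: build one training batch that mixes
    data from the newest dump with replayed data from older dumps.

    replay_ratio is a hypothetical fraction of the batch drawn from older
    dumps; the ratios studied in the paper may differ.
    """
    rng = random.Random(seed)
    n_replay = int(round(batch_size * replay_ratio))  # slots for old data
    n_new = batch_size - n_replay                     # slots for new data
    batch = rng.sample(new_examples, n_new) + rng.sample(old_examples, n_replay)
    rng.shuffle(batch)  # interleave old and new examples
    return batch
```

With `replay_ratio=0.5` and `batch_size=8`, each batch contains four new and four replayed examples; sweeping this ratio per evaluation domain is one way to probe the adaptation-vs-forgetting trade-off the abstract describes.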