Enhancing Token Filtering Efficiency in Large Language Model Training with Collider
FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Science - Computation and Language
Computer Science - Distributed, Parallel, and Cluster Computing
Distributed, Parallel, and Cluster Computing (cs.DC)
Computation and Language (cs.CL)
Machine Learning (cs.LG)
DOI:
10.48550/arxiv.2502.00340
Publication Date:
2025-02-01
AUTHORS (7)
ABSTRACT
Token filtering has been proposed to enhance the utility of large language models (LLMs) by eliminating inconsequential tokens during training. While using fewer tokens should reduce computational workloads, existing studies have not succeeded in achieving higher efficiency. This is primarily due to the insufficient sparsity caused by filtering tokens only at the output layers, as well as to inefficient sparse GEMM (General Matrix Multiplication) even when sufficient sparsity is present. This paper presents Collider, a system that unleashes the full efficiency of token filtering in LLM training. At its core, Collider filters the activations of inconsequential tokens across all layers to maintain sparsity. Additionally, it features an automatic workflow that transforms sparse GEMM into dimension-reduced dense GEMM for optimized efficiency. Evaluations on three LLMs (TinyLlama-1.1B, Qwen2.5-1.5B, and Phi1.5-1.4B) demonstrate that Collider reduces backpropagation time by up to 35.1% and end-to-end training time by up to 22.0% when filtering 40% of tokens. Utility assessments of training TinyLlama on 15B tokens indicate that Collider sustains the utility advancements of token filtering, relatively improving model utility by 16.3% compared with regular training, while reducing training time from 4.7 days to 3.5 days on 8 GPUs. Collider is designed for easy integration into existing training frameworks, allowing systems that already deploy token filtering to accelerate training with just one line of code.
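The key optimization described in the abstract, replacing sparse GEMM over token-filtered activations with an ordinary dense GEMM on a dimension-reduced matrix, can be sketched as follows. This is a minimal illustrative sketch in PyTorch and not Collider's implementation: the function name dense_gemm_on_kept_tokens, the keep_mask argument, and the zero-filling of filtered positions are assumptions made for the example.

# Illustrative sketch (not Collider's code): gather the tokens that survive
# filtering into a smaller dense matrix, run a regular dense GEMM on it, and
# scatter the results back into the full-size output.
import torch

def dense_gemm_on_kept_tokens(activations: torch.Tensor,
                              weight: torch.Tensor,
                              keep_mask: torch.Tensor) -> torch.Tensor:
    # activations: [num_tokens, hidden], weight: [hidden, out],
    # keep_mask: [num_tokens] bool, True for tokens that survive filtering.
    kept_idx = keep_mask.nonzero(as_tuple=True)[0]       # indices of kept tokens
    kept_acts = activations.index_select(0, kept_idx)    # dimension-reduced dense matrix
    kept_out = kept_acts @ weight                         # ordinary dense GEMM
    # Scatter back; filtered token positions are left as zeros in this sketch.
    out = activations.new_zeros(activations.size(0), weight.size(1))
    out.index_copy_(0, kept_idx, kept_out)
    return out

if __name__ == "__main__":
    acts = torch.randn(8, 16)
    w = torch.randn(16, 32)
    mask = torch.rand(8) > 0.4      # keep ~60% of tokens, i.e. filter roughly 40%
    y = dense_gemm_on_kept_tokens(acts, w, mask)
    print(y.shape)                  # torch.Size([8, 32])

The point of the gather-then-dense-GEMM pattern is that the multiplication runs on a contiguous, smaller matrix, so it benefits directly from highly tuned dense kernels instead of relying on sparse GEMM routines whose overhead can cancel out the savings from filtering.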