LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression

DOI: 10.18653/v1/2020.coling-main.287
Publication Date: 2021-01-08
ABSTRACT
BERT is a cutting-edge language representation model pre-trained on a large corpus, which achieves superior performance on various natural language understanding tasks. However, a major blocking issue of applying BERT to online services is that it is memory-intensive and leads to unsatisfactory latency for user requests, raising the necessity of model compression. Existing solutions leverage the knowledge distillation framework to learn a smaller model that imitates the behaviors of BERT. However, the training procedure of knowledge distillation is expensive in itself, as it requires sufficient training data to imitate the teacher model. In this paper, we address this issue by proposing a tailored solution named LadaBERT (Lightweight adaptation of BERT through hybrid model compression), which combines the advantages of different model compression methods, including weight pruning, matrix factorization, and knowledge distillation. LadaBERT achieves state-of-the-art accuracy on various public datasets, while the training overheads can be reduced by an order of magnitude.
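To make the hybrid-compression idea concrete, below is a minimal NumPy sketch of two of the ingredients named in the abstract, SVD-based matrix factorization and magnitude-based weight pruning, applied to a single random matrix standing in for a BERT weight matrix. It is an illustration under our own assumptions, not the paper's actual pipeline: the function names and the chosen rank/sparsity values are hypothetical, and the knowledge distillation step is omitted because it requires training data and a teacher model.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) by the product of two smaller matrices
    via truncated SVD (low-rank matrix factorization)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # m x rank, singular values folded in
    B = Vt[:rank, :]             # rank x n
    return A, B

def magnitude_prune(W, sparsity):
    """Zero out roughly the given fraction of smallest-magnitude
    entries of W (unstructured weight pruning)."""
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy()
    threshold = np.partition(np.abs(W), k, axis=None)[k]
    return np.where(np.abs(W) < threshold, 0.0, W)

# Toy example: a 768x768 matrix mimicking one BERT layer's weights.
rng = np.random.default_rng(0)
W = rng.standard_normal((768, 768))

A, B = low_rank_factorize(W, rank=128)                        # factorization step
A_p, B_p = magnitude_prune(A, 0.5), magnitude_prune(B, 0.5)   # pruning step

kept = (np.count_nonzero(A_p) + np.count_nonzero(B_p)) / W.size
err = np.linalg.norm(W - A_p @ B_p) / np.linalg.norm(W)
print(f"parameters kept: {kept:.1%}, relative reconstruction error: {err:.3f}")
```

In a full hybrid scheme, the pruned low-rank student would then be fine-tuned with knowledge distillation against the original BERT teacher to recover the accuracy lost in the two compression steps.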