LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding

DOI: 10.1609/aaai.v35i14.17518 Publication Date: 2022-09-08T19:56:51Z
ABSTRACT
Pre-training models such as BERT have achieved great results on various natural language processing problems. However, their large number of parameters requires significant amounts of memory and inference time, which makes them difficult to deploy on edge devices. In this work, we propose a knowledge distillation method, LRC-BERT, based on contrastive learning to fit the output of the intermediate layer from the angular distance aspect, which is not considered by existing distillation methods. Furthermore, we introduce a gradient perturbation-based training architecture in the training phase to increase the robustness of LRC-BERT, which is the first such attempt in knowledge distillation. Additionally, in order to better capture the distribution characteristics of the intermediate layer, we design a two-stage training method for the total distillation loss. Finally, by evaluating on 8 datasets of the General Language Understanding Evaluation (GLUE) benchmark, we show that the performance of the proposed LRC-BERT exceeds existing state-of-the-art methods, which proves the effectiveness of our method.
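The abstract describes two technical ingredients: a contrastive objective that fits student intermediate-layer outputs to the teacher's from an angular (cosine) distance aspect, and gradient perturbation during training. As a minimal sketch of the first idea only, the snippet below implements a generic contrastive loss over student and teacher hidden states using cosine similarity with in-batch negatives; the function name, temperature value, and in-batch negative scheme are illustrative assumptions, not the paper's exact formulation, and the gradient-perturbation and two-stage training components are not shown.

```python
import torch
import torch.nn.functional as F

def angular_contrastive_loss(student_h: torch.Tensor,
                             teacher_h: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """Contrastive loss on intermediate-layer outputs using cosine (angular)
    similarity: for each sample, the teacher representation of the same sample
    is the positive, and teacher representations of the other samples in the
    batch serve as negatives. (Illustrative sketch, not the paper's exact loss.)"""
    s = F.normalize(student_h, dim=-1)   # unit vectors, so dot product = cosine similarity
    t = F.normalize(teacher_h, dim=-1)
    logits = s @ t.t() / temperature     # (batch, batch) similarity matrix
    targets = torch.arange(s.size(0), device=s.device)  # diagonal = matching pairs
    return F.cross_entropy(logits, targets)

# Toy usage with random hidden states (batch of 8, hidden size 768).
if __name__ == "__main__":
    student = torch.randn(8, 768)
    teacher = torch.randn(8, 768)
    print(angular_contrastive_loss(student, teacher).item())
```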