Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer

DOI: 10.48550/arXiv.2210.06707 Publication Date: 2022-10
ABSTRACT
Large pre-trained vision transformers (ViTs) have demonstrated remarkable performance on various visual tasks, but suffer from expensive computational and memory costs when deployed on resource-constrained devices. Among the powerful compression approaches, quantization drastically reduces computation and memory consumption through low-bit parameters and bit-wise operations. However, low-bit ViTs remain largely unexplored and usually suffer a significant performance drop compared with their real-valued counterparts. In this work, through extensive empirical analysis, we first identify that the bottleneck behind the severe drop comes from the information distortion of the quantized self-attention map. We then develop an information rectification module (IRM) and a distribution guided distillation (DGD) scheme for fully quantized vision transformers (Q-ViT) to effectively eliminate such distortion, leading to accurate, fully quantized ViTs. We evaluate our methods on the popular DeiT and Swin backbones. Extensive experimental results show that our method achieves much better performance than prior arts. For example, Q-ViT theoretically accelerates ViT-S by 6.14x while achieving about 80.9% Top-1 accuracy, even surpassing the full-precision counterpart by 1.0% on the ImageNet dataset. Our code and models are available at https://github.com/YanjingLi0202/Q-ViT
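As a rough sketch of what low-bit quantization of self-attention involves, the following PyTorch snippet applies a symmetric per-tensor uniform quantizer to the query and key activations before computing the attention map whose information distortion the abstract identifies as the accuracy bottleneck. This is a minimal illustration under assumed details (bit width, per-tensor scaling, straight-through estimator), not the paper's IRM or DGD method; the authors' actual implementation is in the linked repository.

import torch

def quantize(x, bits=4):
    # Symmetric uniform quantizer with a per-tensor scale; an assumed
    # baseline scheme, not the paper's exact quantizer.
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    # Straight-through estimator: quantized values in the forward pass,
    # identity gradient in the backward pass.
    return x + (q * scale - x).detach()

def quantized_attention_map(q, k, bits=4):
    # Quantize Q and K, then form the softmax attention map that the
    # paper analyzes for information distortion.
    qq, kq = quantize(q, bits), quantize(k, bits)
    scores = qq @ kq.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1)

# Toy usage: one head, 8 tokens, 64-dim embeddings.
q = torch.randn(1, 8, 64)
k = torch.randn(1, 8, 64)
attn = quantized_attention_map(q, k, bits=4)
print(attn.shape)  # torch.Size([1, 8, 8])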