Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
DOI: 10.48550/arXiv.2405.00314
Publication Date: 2024-05-01
AUTHORS (3)
ABSTRACT
Vision Transformers (ViTs) have recently garnered considerable attention, emerging as a promising alternative to convolutional neural networks (CNNs) in several vision-related applications. However, their large model sizes and high computational and memory demands hinder deployment, especially on resource-constrained devices. This underscores the necessity of algorithm-hardware co-design specific to ViTs, aiming to optimize their performance by tailoring both the algorithmic structure and the underlying hardware accelerator to each other's strengths. Model quantization, by converting high-precision numbers to lower-precision ones, reduces the computational demands and memory needs of ViTs, allowing the creation of hardware specifically optimized for these quantized algorithms, boosting efficiency. This article provides a comprehensive survey of ViTs quantization and its hardware acceleration. We first delve into the unique architectural attributes of ViTs and their runtime characteristics. Subsequently, we examine the fundamental principles of model quantization, followed by a comparative analysis of the state-of-the-art quantization techniques for ViTs. Additionally, we explore the hardware acceleration of quantized ViTs, highlighting the importance of hardware-friendly algorithm design. In conclusion, this article will discuss ongoing challenges and future research paths. We consistently maintain related open-source materials at https://github.com/DD-DuDa/awesome-vit-quantization-acceleration.
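To make the abstract's notion of model quantization concrete, the sketch below illustrates symmetric uniform quantization, the basic operation of mapping high-precision floating-point values to low-precision integers via a scale factor. This is a minimal NumPy illustration of the general technique, not code from the surveyed paper; the function names and the per-tensor scaling choice are assumptions made for this example.

import numpy as np

def quantize_uniform(x, num_bits=8):
    """Symmetric uniform quantization: map float values to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 127 for 8-bit
    scale = max(np.abs(x).max(), 1e-8) / qmax   # per-tensor scale (guard against zero)
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float values."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure the rounding error.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_uniform(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())

Storing q (int8) instead of w (float32) cuts memory by roughly 4x, and integer arithmetic on q is the property that dedicated hardware accelerators exploit; the survey's taxonomy of quantization methods refines this basic scheme (e.g. granularity, calibration, and non-uniform variants).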