Adaptive Offloading of Transformer Inference for Weak Edge Devices with Masked Autoencoders
Keywords: Edge device, Autoencoder
DOI: 10.1145/3639824
Publication Date: 2024-01-13
AUTHORS (5)
ABSTRACT
The transformer is a popular machine learning model used by many intelligent applications in smart cities. However, its high computational complexity makes it hard to deploy on weak edge devices. This paper presents a novel two-round offloading scheme, called A-MOT, for efficient transformer inference. A-MOT samples only a small part of the image data and sends it to edge servers, incurring negligible overhead on the edge device. The server recovers the full image with a masked autoencoder (MAE) before inference. In addition, an SLO-adaptive module is designed to achieve personalized transmission and effective bandwidth utilization. To avoid the large overhead of repeated inference in the second round, A-MOT further contains a lightweight mechanism that saves time in the second round. Extensive experiments have been conducted to verify the effectiveness of A-MOT.
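The edge-side step the abstract describes (transmitting only a small random subset of image patches, leaving MAE reconstruction to the server) can be illustrated with a minimal sketch. This is not the authors' implementation; the patch size, keep ratio, and function name `sample_patches` are illustrative assumptions.

```python
import numpy as np

def sample_patches(image, patch=16, keep_ratio=0.25, seed=0):
    """Illustrative edge-side step: split an image into non-overlapping
    patches and keep a small random subset for transmission.
    The server would reconstruct the masked patches with an MAE."""
    H, W, C = image.shape
    gh, gw = H // patch, W // patch
    # (gh*gw, patch*patch*C): one flattened row per patch
    patches = (image.reshape(gh, patch, gw, patch, C)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(gh * gw, -1))
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(keep_ratio * len(patches)))
    idx = rng.choice(len(patches), size=n_keep, replace=False)
    # Only the kept patches and their indices are sent to the server.
    return patches[idx], idx

img = np.random.rand(224, 224, 3).astype(np.float32)
kept, idx = sample_patches(img)
print(kept.shape)  # 49 of 196 patches (25%), each 16*16*3 = 768 values
```

With a 25% keep ratio, the device uploads roughly a quarter of the raw pixel data; the MAE on the server fills in the masked patches before transformer inference.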
REFERENCES (52)
CITATIONS (2)