Token Adaptation via Side Graph Convolution for Temporally and Spatially Efficient Fine-tuning of 3D Point Cloud Transformers
FOS: Computer and information sciences
Computer Vision and Pattern Recognition (cs.CV)
Computer Science - Computer Vision and Pattern Recognition
DOI: 10.48550/arxiv.2502.14142
Publication Date: 2025-02-19
AUTHORS (1)
ABSTRACT
Parameter-efficient fine-tuning (PEFT) of pre-trained 3D point cloud Transformers has emerged as a promising technique for 3D point cloud analysis. While existing PEFT methods attempt to minimize the number of tunable parameters, they still suffer from high temporal and spatial computational costs during fine-tuning. This paper proposes a novel PEFT algorithm for 3D point cloud Transformers, called Side Token Adaptation on a neighborhood Graph (STAG), to achieve superior temporal and spatial efficiency. STAG employs a graph convolutional side network that operates in parallel with a frozen backbone Transformer to adapt tokens to downstream tasks. STAG realizes its efficiency through three key components: a connection with the backbone that enables reduced gradient computation, a parameter sharing framework, and efficient graph convolution. Furthermore, we present Point Cloud Classification 13 (PCC13), a new benchmark comprising diverse publicly available 3D point cloud datasets, enabling comprehensive evaluation of PEFT methods. Extensive experiments using multiple pre-trained models and PCC13 demonstrate the effectiveness of STAG. Specifically, STAG maintains classification accuracy comparable to existing methods while reducing tunable parameters to only 0.43M and achieving significant reductions in both computational time and memory consumption for fine-tuning. Code will be available at: https://github.com/takahikof/STAG
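The side-network idea described in the abstract can be illustrated with a minimal, forward-only sketch. It is not the authors' implementation: the function names (`knn_graph`, `graph_conv`, `frozen_block`, `side_adapt`), the mean-aggregation graph convolution, the tanh stand-in for a Transformer block, and the way side outputs are added to the final tokens are all assumptions chosen for illustration. The sketch only shows the data flow: the side path reads tokens from the frozen backbone, the backbone never reads from the side path, and a single weight matrix is reused across blocks.

```python
import math

def knn_graph(points, k):
    """For each point, return the indices of its k nearest neighbors (self excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def graph_conv(tokens, neighbors, W):
    """Assumed graph convolution: average each token with its neighbors, then apply W."""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [tokens[i]] + [tokens[j] for j in nbrs]
        mean = [sum(col) / len(group) for col in zip(*group)]
        out.append(matvec(W, mean))
    return out

def frozen_block(tokens):
    """Hypothetical stand-in for one frozen backbone block (no trainable parameters)."""
    return [[math.tanh(v) for v in t] for t in tokens]

def side_adapt(tokens, centers, W_shared, num_blocks=3, k=2):
    """Run the frozen backbone and accumulate side-network updates in parallel."""
    neighbors = knn_graph(centers, k)  # neighborhood graph over token positions
    side = [[0.0] * len(tokens[0]) for _ in tokens]
    h = tokens
    for _ in range(num_blocks):
        h = frozen_block(h)  # backbone path: never receives side-path values
        update = graph_conv(h, neighbors, W_shared)  # same W_shared in every block
        side = [[s + u for s, u in zip(sr, ur)] for sr, ur in zip(side, update)]
    # Adapted tokens: final backbone tokens plus the accumulated side output.
    return [[a + b for a, b in zip(hr, sr)] for hr, sr in zip(h, side)]
```

Because the backbone path never depends on `W_shared`, training such a side path would not require gradients through the backbone blocks. Because one matrix is shared across blocks, the tunable parameter count in this sketch does not grow with `num_blocks`.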