Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Speedup Softmax function Operator (biology) Normalization Performance Improvement Generative model
DOI: 10.48550/arxiv.2401.06197 Publication Date: 2024-01-01
ABSTRACT
We introduce Deformable Convolution v4 (DCNv4), a highly efficient and effective operator designed for a broad spectrum of vision applications. DCNv4 addresses the limitations of its predecessor, DCNv3, with two key enhancements: (1) removing softmax normalization in spatial aggregation to enhance its dynamic property and expressive power, and (2) optimizing memory access to minimize redundant operations for speedup. These improvements result in significantly faster convergence compared to DCNv3 and a substantial increase in processing speed, with DCNv4 achieving more than three times the forward speed. DCNv4 demonstrates exceptional performance across various tasks, including image classification, instance and semantic segmentation, and, notably, image generation. When integrated into generative models like the U-Net in the latent diffusion model, DCNv4 outperforms its baseline, underscoring its possibility to enhance generative models. In practical applications, replacing DCNv3 with DCNv4 in the InternImage model to create FlashInternImage results in up to an 80% speed increase and further performance improvement without further modifications. The advancements in speed and efficiency of DCNv4, combined with its robust performance across diverse vision tasks, show its potential as a foundational building block for future vision models.
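The first enhancement can be illustrated with a small sketch. It is not the authors' implementation: it assumes the feature values at the K sampling points have already been gathered (learned offsets and bilinear interpolation are omitted), uses scalar features, and all function names are hypothetical. It only contrasts aggregation weights that pass through softmax, and are therefore confined to [0, 1] and sum to 1, with weights that are used as predicted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(values, weights):
    """Weighted sum over the K sampled values: y = sum_k w_k * x_k."""
    return sum(w * v for w, v in zip(weights, values))


def aggregate_softmax(values, logits):
    """Softmax-normalized aggregation: weights lie in [0, 1] and sum to 1."""
    return aggregate(values, softmax(logits))


def aggregate_unbounded(values, logits):
    """Aggregation without softmax: the predicted weights are used directly,
    so they may be negative or larger than 1."""
    return aggregate(values, logits)


# Hypothetical sampled values and predicted per-point weights (assumed data).
values = [1.0, 2.0, 4.0]
logits = [0.5, -1.0, 2.0]

bounded = aggregate_softmax(values, logits)
unbounded = aggregate_unbounded(values, logits)

print(f"softmax-normalized output: {bounded:.4f}")
print(f"unbounded output:          {unbounded:.4f}")
```

With softmax, the output is a convex combination of the sampled values, so it can never leave the range between the smallest and largest of them. Without it, the same predicted weights can subtract a point's contribution or amplify it, which is one way to read the abstract's claim about increased expressive power.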