SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
DOI:
10.48550/arxiv.2105.15203
Publication Date:
2021-01-01
AUTHORS (6)
ABSTRACT
We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders. SegFormer has two appealing features: 1) SegFormer comprises a novel hierarchically structured Transformer encoder which outputs multiscale features. It does not need positional encoding, thereby avoiding the interpolation of positional codes which leads to decreased performance when the testing resolution differs from training. 2) SegFormer avoids complex decoders. The proposed MLP decoder aggregates information from different layers, and thus combines both local attention and global attention to render powerful representations. We show that this simple and lightweight design is the key to efficient segmentation on Transformers. We scale our approach up to obtain a series of models from SegFormer-B0 to SegFormer-B5, reaching significantly better performance and efficiency than previous counterparts. For example, SegFormer-B4 achieves 50.3% mIoU on ADE20K with 64M parameters, being 5x smaller and 2.2% better than the previous best method. Our best model, SegFormer-B5, achieves 84.0% mIoU on the Cityscapes validation set and shows excellent zero-shot robustness on Cityscapes-C. Code will be released at: github.com/NVlabs/SegFormer.
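The abstract's second point describes an all-MLP decoder that projects each encoder stage to a common width, upsamples to a shared resolution, concatenates, and fuses with a linear layer. A minimal numpy sketch of that aggregation pattern, with hypothetical channel widths and strides chosen for illustration (not the paper's exact configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multiscale feature maps from a hierarchical encoder,
# at strides 4/8/16/32 of a 64x64 input, shaped (channels, H, W).
features = [
    rng.standard_normal((32, 16, 16)),   # stage 1
    rng.standard_normal((64, 8, 8)),     # stage 2
    rng.standard_normal((160, 4, 4)),    # stage 3
    rng.standard_normal((256, 2, 2)),    # stage 4
]

def linear_proj(x, out_ch, rng):
    """Per-pixel linear layer (a 1x1 conv): map channels to out_ch."""
    c, h, w = x.shape
    w_mat = rng.standard_normal((out_ch, c)) / np.sqrt(c)
    return np.einsum('oc,chw->ohw', w_mat, x)

def upsample(x, factor):
    """Nearest-neighbor upsampling along both spatial axes."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

embed_dim, num_classes = 128, 19
target_hw = features[0].shape[1]  # unify at the stride-4 resolution

# 1) Project every stage to a common width; 2) upsample to 1/4 scale.
unified = [
    upsample(linear_proj(f, embed_dim, rng), target_hw // f.shape[1])
    for f in features
]

# 3) Concatenate along channels and fuse with another linear layer;
# 4) predict per-pixel class logits.
fused = linear_proj(np.concatenate(unified, axis=0), embed_dim, rng)
logits = linear_proj(fused, num_classes, rng)
print(logits.shape)  # (19, 16, 16)
```

Because every decoder operation here is a per-pixel linear map plus upsampling, the decoder adds very few parameters; the multiscale mixing (local and global context) is inherited from the Transformer encoder's features rather than built into the decoder.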