CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation
FOS: Computer and information sciences
Computer Vision and Pattern Recognition (cs.CV)
DOI:
10.48550/arxiv.2405.10530
Publication Date:
2024-05-17
AUTHORS (6)
ABSTRACT
Due to the large-scale image size and object variations, current CNN-based and Transformer-based approaches for remote sensing image semantic segmentation are either suboptimal at capturing long-range dependencies or limited by their computational complexity. In this paper, we propose CM-UNet, comprising a CNN-based encoder for extracting local image features and a Mamba-based decoder for aggregating and integrating global information, facilitating efficient semantic segmentation of remote sensing images. Specifically, a CSMamba block is introduced to build the core segmentation decoder, which employs channel and spatial attention as the gate activation condition of the vanilla Mamba to enhance feature interaction and global-local information fusion. Moreover, to further refine the output features from the CNN encoder, a Multi-Scale Attention Aggregation (MSAA) module is employed to merge the features of different scales. By integrating the CSMamba block and the MSAA module, CM-UNet effectively captures the long-range dependencies and multi-scale global contextual information of large-scale remote-sensing images. Experimental results obtained on three benchmarks indicate that the proposed CM-UNet outperforms existing methods in various performance metrics. The code is available at https://github.com/XiaoBuL/CM-UNet.
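The idea of using channel and spatial attention as a gate on a sequence-mixing branch can be illustrated with a minimal sketch. This is not the authors' CSMamba implementation (see the linked repository for that). It assumes a feature map stored as nested lists of shape C x H x W, omits all learned layers, and replaces the Mamba branch with a hypothetical stand-in, `running_mean_scan`. All function names here are illustrative assumptions.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x):
    """One weight per channel from global average pooling.

    Assumption: the learned projection a real block would apply is omitted.
    """
    weights = []
    for channel in x:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        weights.append(sigmoid(total / count))
    return weights


def spatial_attention(x):
    """One weight per pixel from the mean over channels.

    Assumption: the learned convolution a real block would apply is omitted.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [sigmoid(sum(x[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def running_mean_scan(x):
    """Hypothetical stand-in for the Mamba branch (NOT a state space model).

    Each channel is flattened row by row into a sequence, and every position
    is replaced by the causal running mean of the values seen so far.
    """
    out = []
    for channel in x:
        h, w = len(channel), len(channel[0])
        flat = [v for row in channel for v in row]
        acc, mixed = 0.0, []
        for t, v in enumerate(flat, start=1):
            acc += v
            mixed.append(acc / t)
        out.append([mixed[i * w:(i + 1) * w] for i in range(h)])
    return out


def gated_block(x):
    """Multiply the sequence branch by a channel-and-spatial attention gate."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    ca = channel_attention(x)
    sa = spatial_attention(x)
    y = running_mean_scan(x)
    return [
        [[ca[k] * sa[i][j] * y[k][i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


if __name__ == "__main__":
    # Two channels, 2 x 2 pixels: an all-zero channel and an all-one channel.
    features = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
    print(gated_block(features))
```

The gate is the product of a per-channel weight and a per-pixel weight, so a position is passed through strongly only when both its channel and its location are judged informative. In a trained network both weights would come from learned layers rather than plain averages.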