MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining

DOI: 10.48550/arxiv.2403.13430 Publication Date: 2024-03-20
ABSTRACT
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks. Pretraining is an active research topic, encompassing supervised and self-supervised learning methods to initialize model weights effectively. However, transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as classification or object discrimination tasks. In this study, we explore the Multi-Task Pretraining (MTP) paradigm for RS foundation models to address this issue. Using a shared encoder and task-specific decoder architecture, we conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection. MTP supports both convolutional neural networks and vision transformer foundation models with over 300 million parameters. The pretrained models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection. Extensive experiments across 14 datasets demonstrate the superiority of our models over existing ones of similar size and their competitive performance compared to larger state-of-the-art models, thus validating the effectiveness of MTP.
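The shared-encoder, task-specific-decoder layout named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear encoder and heads, the mean-squared-error stand-in loss, the equal task weighting, and every name below (`encode`, `decode`, `multitask_loss`, `TASKS`) are assumptions made purely for illustration. The only point it shows is that features are computed once and reused by one head per pretraining task.

```python
import random

# Hypothetical task names mirroring the three pretraining tasks in the abstract.
TASKS = ("semantic_segmentation", "instance_segmentation", "rotated_detection")


def encode(image, w_enc):
    """Shared encoder (assumed linear here): one set of weights for all tasks."""
    return [sum(w * x for w, x in zip(row, image)) for row in w_enc]


def decode(feature, w_dec):
    """Task-specific decoder (assumed linear here): separate weights per task."""
    return [sum(w * f for w, f in zip(row, feature)) for row in w_dec]


def mse(pred, target):
    """Stand-in loss; real segmentation and detection losses differ."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(image, targets, w_enc, decoders):
    """Sum per-task losses over one shared feature (equal weights assumed)."""
    feature = encode(image, w_enc)  # computed once, reused by every head
    per_task = {
        name: mse(decode(feature, w_dec), targets[name])
        for name, w_dec in decoders.items()
    }
    return sum(per_task.values()), per_task


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
w_enc = rand_matrix(4, 8)                         # shared encoder weights
decoders = {t: rand_matrix(2, 4) for t in TASKS}  # one head per task
image = [random.random() for _ in range(8)]       # toy flattened input
targets = {t: [0.0, 1.0] for t in TASKS}          # toy per-task targets

total, per_task = multitask_loss(image, targets, w_enc, decoders)
print(sorted(per_task), total)
```

In a gradient-based setting, the summed loss would update the shared encoder from all three tasks at once while each head is updated only by its own task.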