Encoder–Decoder Structure Fusing Depth Information for Outdoor Semantic Segmentation

Depth map; Upsampling; RGB color model; Ground truth; Fusion mechanism; Pyramid (geometry)
DOI: 10.3390/app13179924 Publication Date: 2023-09-04T06:59:55Z
ABSTRACT
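The semantic segmentation of outdoor images is the cornerstone of scene understanding and plays a crucial role in the autonomous navigation of robots. Although RGB-D images can provide additional depth information for improving the performance of semantic segmentation tasks, current state-of-the-art methods directly use ground truth depth maps for depth information fusion, which relies on highly developed and expensive depth sensors. To solve this problem, we propose a self-calibrated RGB-D image semantic segmentation neural network model based on an improved residual network that does not rely on depth sensors. It utilizes multi-modal information from depth maps predicted with depth estimation models, fused with RGB images, to enhance the understanding of a scene. First, we designed a novel convolutional neural network (CNN) with an encoding and decoding structure as our semantic segmentation model. The encoder was constructed using IResNet to extract the semantic features of the RGB image and the predicted depth map and then effectively fuse them with the self-calibration fusion structure. The decoder restored the resolution of the output features with a series of successive upsampling structures. Second, we presented a feature pyramid attention mechanism to extract the fused information at multiple scales and obtain features with rich semantic information. The experimental results on the publicly available Cityscapes dataset and on collected forest scene images show that our model trained with the estimated depth information can achieve performance comparable to that of the ground truth depth map in improving the accuracy of the semantic segmentation task, even outperforming some competitive methods.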
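The data flow described in the abstract (features from the RGB image and from the predicted depth map are fused, then the decoder restores resolution by successive upsampling) can be illustrated with a minimal pure-Python sketch. Everything named here is hypothetical: `gated_fuse` uses a simple sigmoid gate as a stand-in and is not the paper's self-calibration fusion structure, `upsample_nearest_2x` assumes nearest-neighbour interpolation, and the tiny 2x2 feature maps are made-up inputs rather than IResNet outputs.

```python
import math


def sigmoid(x):
    """Logistic function used as a soft gate in the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(rgb_feat, depth_feat):
    """Fuse two same-sized 2D feature maps element-wise.

    Assumption: each depth value is scaled by a gate computed from itself
    and then added to the RGB feature. This is an illustrative stand-in,
    not the self-calibration structure from the paper.
    """
    fused = []
    for rgb_row, depth_row in zip(rgb_feat, depth_feat):
        fused.append([r + sigmoid(d) * d for r, d in zip(rgb_row, depth_row)])
    return fused


def upsample_nearest_2x(feat):
    """Double height and width by repeating every value (nearest neighbour)."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def decode(feat, steps):
    """Apply `steps` successive 2x upsampling stages, as a decoder would."""
    for _ in range(steps):
        feat = upsample_nearest_2x(feat)
    return feat


# Hypothetical low-resolution encoder outputs (2x2 feature maps).
rgb_feat = [[1.0, 2.0], [3.0, 4.0]]
depth_feat = [[0.0, 0.0], [0.0, 0.0]]

fused = gated_fuse(rgb_feat, depth_feat)
restored = decode(fused, steps=2)
print(len(restored), len(restored[0]))
```

In a real model each of these steps would be a learned layer operating on multi-channel tensors; the sketch only shows the order of operations (fuse in the encoder, upsample in the decoder).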