SwinCross: Cross‐modal Swin transformer for head‐and‐neck tumor segmentation in PET/CT images
FOS: Computer and information sciences
Computer Vision and Pattern Recognition (cs.CV)
Image and Video Processing (eess.IV)
Head and Neck Neoplasms
Positron Emission Tomography Computed Tomography
Positron-Emission Tomography
Image Processing, Computer-Assisted
Humans
Learning
Neural Networks, Computer
DOI: 10.1002/mp.16703
Publication Date: 2023-09-30
AUTHORS (5)
ABSTRACT
Background: Radiotherapy (RT) combined with cetuximab is the standard treatment for patients with inoperable head and neck cancers. Segmentation of head and neck (H&N) tumors is a prerequisite for radiotherapy planning but a time-consuming process. In recent years, deep convolutional neural networks (DCNN) have become the de facto standard for automated image segmentation. However, due to the expensive computational cost associated with enlarging the field of view in DCNNs, their ability to model long-range dependency is still limited, and this can result in sub-optimal segmentation performance for objects whose background context spans long distances. On the other hand, Transformer models have demonstrated excellent capabilities in capturing such long-range information in several semantic segmentation tasks performed on medical images.

Purpose: Despite the impressive representation capacity of vision transformer models, current transformer-based segmentation models still suffer from inconsistent and incorrect dense predictions when fed multi-modal input data. We suspect that the power of their self-attention mechanism may be limited in extracting the complementary information that exists in multi-modal data. To this end, we propose a novel segmentation model, dubbed Cross-modal Swin Transformer (SwinCross), with a cross-modal attention (CMA) module to incorporate cross-modal feature extraction at multiple resolutions.

Methods: The proposed architecture is a 3D segmentation model with two main components: (1) a cross-modal attention module integrating the two modalities (PET and CT), and (2) a shifted-window transformer block for learning long-range dependencies within each modality. To evaluate the efficacy of our approach, we conducted experiments and ablation studies on the HECKTOR 2021 challenge dataset. We compared our method against nnU-Net (the backbone of the top-5 methods in HECKTOR 2021) and state-of-the-art transformer-based models, including UNETR and Swin UNETR. The experiments employed a five-fold cross-validation setup using PET and CT images.

Results: Empirical evidence demonstrates that the proposed method consistently outperforms the comparative techniques. This success is attributed to the CMA module's ability to enhance inter-modality feature representations between PET and CT during head-and-neck tumor segmentation. Notably, SwinCross surpasses Swin UNETR across all five folds, showcasing its proficiency in learning multi-modal feature representations at varying resolutions through the cross-modal attention modules.

Conclusions: We introduced a cross-modal Swin transformer for automating the delineation of head and neck tumors in PET and CT images. Our model incorporates a cross-modality attention module, enabling the exchange of features between modalities at multiple resolutions. The experimental results establish the superiority of our model in capturing improved inter-modality correlations between PET and CT for head-and-neck tumor segmentation. Furthermore, the proposed methodology holds applicability to other multi-modal imaging tasks involving different modalities, such as SPECT/CT or PET/MRI.

Code: https://github.com/yli192/SwinCross_CrossModalSwinTransformer_for_Medical_Image_Segmentation
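The core idea of the CMA module — letting one modality attend to the other so complementary features are exchanged — can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation (see the linked repository for that); the function name, random untrained weights, and token shapes below are all illustrative assumptions: queries come from PET tokens while keys and values come from CT tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(pet_tokens, ct_tokens, d_k=32, seed=0):
    """Illustrative single-head cross-attention (hypothetical helper):
    queries from one modality, keys/values from the other.
    Weights are random, standing in for learned projections."""
    rng = np.random.default_rng(seed)
    d = pet_tokens.shape[-1]
    Wq = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Q = pet_tokens @ Wq                       # queries from PET
    K = ct_tokens @ Wk                        # keys from CT
    V = ct_tokens @ Wv                        # values from CT
    attn = softmax(Q @ K.T / np.sqrt(d_k))    # (N_pet, N_ct) attention map
    return attn @ V, attn                     # CT-informed PET features

# Toy example: 8 PET tokens and 8 CT tokens with 16-dim embeddings.
pet = np.random.default_rng(1).standard_normal((8, 16))
ct = np.random.default_rng(2).standard_normal((8, 16))
fused, attn = cross_modal_attention(pet, ct)
print(fused.shape, attn.shape)  # (8, 32) (8, 8)
```

In SwinCross this kind of exchange happens inside shifted-window blocks at multiple encoder resolutions, so the attention operates on local 3D windows of tokens rather than on a flat sequence as in this toy.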