Deep Learning Enables Prostate MRI Segmentation: A Large Cohort Evaluation With Inter-Rater Variability Analysis
Pulmonary and Respiratory Medicine
Artificial intelligence
Urology
volume measurement
prostate segmentation
610
deep attentive neural network
Pelvic Floor Disorders
03 medical and health sciences
Segmentation
Magnetic resonance imaging
0302 clinical medicine
Rheumatology
616
Health Sciences
qualitative evaluation
1112 Oncology and Carcinogenesis
Standardisation of Lower Urinary Tract Function
Internal medicine
RC254-282
Cancer
Prostate cancer
Prostate
Cohort
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Prostate Cancer Research and Treatment
quantitative evaluation
Computer science
MRI Imaging
Oncology
Nuclear medicine
Medicine
Medical physics
Radiology
large cohort evaluation
DOI: 10.3389/fonc.2021.801876
Publication Date: 2021-12-21T05:02:02Z
AUTHORS (8)
ABSTRACT
Whole-prostate gland (WPG) segmentation plays a significant role in prostate volume measurement, treatment, and biopsy planning. This study evaluated a previously developed automatic WPG segmentation method, the deep attentive neural network (DANN), on a large, continuous patient cohort to test its feasibility in a clinical setting. With IRB approval and HIPAA compliance, the study cohort included 3,698 3T MRI scans acquired between 2016 and 2020. In total, 335 MRI scans were used to train the model, and 3,210 and 100 scans were used for the qualitative and quantitative evaluation of the model, respectively. In addition, DANN-enabled prostate volume estimation was compared with manual prostate volume estimation on 50 MRI scans. For the qualitative evaluation, two abdominal radiologists visually graded the WPG segmentation, and DANN demonstrated either acceptable or excellent performance in over 96% of the testing cohort for the WPG and for each prostate sub-portion (apex, midgland, and base). The two radiologists reached substantial agreement on WPG and midgland segmentation (κ = 0.75 and 0.63) and moderate agreement on apex and base segmentation (κ = 0.56 and 0.60). For the quantitative evaluation, DANN achieved a Dice similarity coefficient of 0.93 ± 0.02, significantly higher than the baseline methods DeepLab v3+ and UNet (both p values < 0.05). For volume measurement, the differences between DANN-enabled and manual volume measurements fell within the 95% limits of agreement in 96% of the evaluation cohort. In conclusion, the study showed that DANN achieved sufficient and consistent WPG segmentation on a large, continuous study cohort, indicating its potential to serve as a tool for measuring prostate volume.
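The three agreement measures named in the abstract (Dice similarity coefficient, Cohen's κ, and 95% limits of agreement) can be illustrated with a minimal Python sketch. This is not the study's evaluation code: the function names, the flat 0/1 mask representation, the use of unweighted κ, and the 1.96 × SD definition of the limits are all assumptions made for illustration, and the abstract does not state which variants the authors used.

```python
from statistics import mean, stdev


def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (an assumed convention).
    return 1.0 if total == 0 else 2.0 * intersection / total


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical grades (assumed variant)."""
    rater_a, rater_b = list(rater_a), list(rater_b)
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must grade the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Chance agreement from each rater's marginal grade frequencies.
    expected = sum((rater_a.count(g) / n) * (rater_b.count(g) / n) for g in labels)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


def limits_of_agreement(auto_volumes, manual_volumes):
    """Bland-Altman style 95% limits: mean difference +/- 1.96 * sample SD."""
    diffs = [a - m for a, m in zip(auto_volumes, manual_volumes)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias + spread


def fraction_within_limits(auto_volumes, manual_volumes):
    """Share of cases whose volume difference lies inside the 95% limits."""
    low, high = limits_of_agreement(auto_volumes, manual_volumes)
    diffs = [a - m for a, m in zip(auto_volumes, manual_volumes)]
    return sum(low <= d <= high for d in diffs) / len(diffs)
```

With these helpers, a per-scan Dice value would be computed from the flattened automatic and manual masks, κ from the two radiologists' visual grades, and the limits of agreement from paired automatic and manual volume estimates.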
REFERENCES (25)
CITATIONS (6)