Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring
FOS: Computer and information sciences
Computer Vision and Pattern Recognition (cs.CV)
DOI:
10.48550/arxiv.2405.08786
Publication Date:
2024-05-14
AUTHORS (7)
ABSTRACT
The Prostate Imaging Reporting and Data System (PI-RADS) is pivotal in the diagnosis of clinically significant prostate cancer through MRI. Current deep learning-based PI-RADS scoring methods often lack the incorporation of the essential PI-RADS clinical guidelines (PICG) utilized by radiologists, potentially compromising scoring accuracy. This paper introduces a novel approach that adapts a multi-modal large language model (MLLM) to incorporate PICG into PI-RADS scoring without additional annotations or network parameters. We present a two-stage fine-tuning process aimed at adapting MLLMs originally trained on natural images to the MRI data domain while effectively integrating PICG. In the first stage, we develop an adapter layer specifically tailored for processing 3D MRI image inputs and design MLLM instructions to differentiate MRI modalities effectively. In the second stage, we translate PICG into guiding instructions for the model to generate PICG-guided features. Through feature distillation, we align the scoring network's features with the PICG-guided feature, enabling the scoring network to incorporate PICG information. We develop our model on a public dataset and evaluate it on a real-world, challenging in-house dataset. Experimental results demonstrate that our approach improves the performance of current scoring networks.
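The feature distillation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance is used, so the normalized mean-squared-error term, the function names, and the `weight` parameter below are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def feature_distillation_loss(student_feat, teacher_feat):
    """Mean squared error between L2-normalized feature vectors.

    student_feat: features from the scoring network (hypothetical name).
    teacher_feat: PICG-guided features from the MLLM (hypothetical name).
    The choice of normalized MSE is an assumption; the abstract only says
    that the two sets of features are aligned through distillation.
    """
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    s = l2_normalize(student_feat)
    t = l2_normalize(teacher_feat)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)


def total_loss(scoring_loss, student_feat, teacher_feat, weight=1.0):
    """Combine the scoring loss with the weighted distillation term (assumed form)."""
    return scoring_loss + weight * feature_distillation_loss(student_feat, teacher_feat)
```

In this sketch only the scoring network would receive gradients from the distillation term, which is consistent with the abstract's claim that no extra network parameters are needed at inference time, though the actual training setup may differ.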