RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision
FOS: Computer and information sciences
Computer Vision and Pattern Recognition (cs.CV)
DOI:
10.48550/arXiv.2401.10815
Publication Date:
2024-01-01
AUTHORS (15)
ABSTRACT
Language-supervised pre-training has proven to be a valuable method for extracting semantically meaningful features from images, serving as a foundational element in multimodal systems within the computer vision and medical imaging domains. However, the resulting models are limited by the information contained within the text. This is particularly problematic in medical imaging, where radiologists' written findings focus on specific observations; the challenge is compounded by the scarcity of paired imaging-text data due to concerns over the leakage of personal health information. In this work, we fundamentally challenge the prevailing reliance on language supervision for learning general-purpose biomedical image encoders. We introduce RAD-DINO, a biomedical image encoder pre-trained solely on unimodal imaging data that obtains similar or greater performance than state-of-the-art language-supervised models on a diverse range of benchmarks. Specifically, the quality of the learned representations is evaluated on standard imaging tasks (classification and semantic segmentation) and a vision-language alignment task (text report generation from images). To further demonstrate the drawback of language supervision, we show that features of RAD-DINO correlate with other medical records (e.g., sex or age) better than those of language-supervised models, as such attributes are generally not mentioned in radiology reports. Finally, we conduct a series of ablations determining the factors behind RAD-DINO's performance; notably, we observe that its downstream performance scales well with the quantity and diversity of training data, demonstrating that image-only supervision is a scalable approach for training a foundational biomedical image encoder.
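The abstract positions RAD-DINO as a standalone image encoder whose global and patch-level features feed downstream classification, segmentation, and report-generation systems. As a minimal sketch of how such an encoder is typically consumed, the snippet below extracts both feature types via the Hugging Face transformers API; the checkpoint identifier microsoft/rad-dino, the input file name, and the commented output shapes are assumptions for illustration, not details stated on this page.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

repo = "microsoft/rad-dino"  # assumed Hub identifier for the released checkpoint
processor = AutoImageProcessor.from_pretrained(repo)
model = AutoModel.from_pretrained(repo).eval()

# Placeholder input: any chest X-ray image file on disk.
image = Image.open("chest_xray.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.inference_mode():
    outputs = model(**inputs)

# Global image embedding, suitable for linear-probe classification.
cls_embedding = outputs.pooler_output               # (1, hidden_dim)
# Patch-token features, suitable for dense tasks such as segmentation.
patch_tokens = outputs.last_hidden_state[:, 1:, :]  # (1, n_patches, hidden_dim)
```

A frozen encoder used this way matches the paper's evaluation setting: the CLS embedding feeds lightweight classifiers, while patch tokens support dense prediction heads and report-generation decoders.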