Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation

Keywords: Generative model; Modalities; Representation; Feature Learning
DOI: 10.48550/arxiv.2306.04811 Publication Date: 2023-01-01
ABSTRACT
Vision-Language Pretraining (VLP) has demonstrated remarkable capabilities in learning visual representations from textual descriptions of images without annotations. Yet, effective VLP demands large-scale image-text pairs, a resource that is scarce in the medical domain. Moreover, conventional VLP is limited to 2D images, while medical images span diverse modalities and are often 3D, making the learning process more challenging. To address these challenges, we present Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation (GTGM), a framework that extends VLP to 3D medical images without relying on paired textual descriptions. Specifically, GTGM utilizes large language models (LLM) to generate medical-style text from 3D medical images. This synthetic text is then used to supervise 3D visual representation learning. Furthermore, a negative-free contrastive learning objective is introduced to cultivate consistent visual representations between augmented 3D medical image patches, which effectively mitigates the biases associated with strict positive-negative sample pairings. We evaluate GTGM on three imaging modalities - Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and electron microscopy (EM) - over 13 datasets. GTGM's superior performance across various medical image segmentation tasks underscores its effectiveness and versatility, enabling VLP to extend into 3D medical imagery while bypassing the need for paired text.
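The abstract names two training signals: supervision from LLM-generated medical-style text and a negative-free contrastive objective between augmented 3D patches. The exact loss formulations are not given here, so the following is a minimal sketch under common assumptions: a CLIP-style InfoNCE alignment for the text branch and a BYOL/SimSiam-style cosine consistency loss (stop-gradient on the target branch) for the negative-free objective. Function names and signatures are illustrative, not the authors' implementation.

```python
# Hypothetical sketch of GTGM-style objectives (assumptions, not the paper's code).
import torch
import torch.nn.functional as F


def text_alignment_loss(volume_emb, text_emb, temperature=0.07):
    """Assumed CLIP-style alignment between 3D volume embeddings and
    embeddings of LLM-generated medical-style captions (symmetric
    cross-entropy over cosine-similarity logits)."""
    volume_emb = F.normalize(volume_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = volume_emb @ text_emb.t() / temperature
    targets = torch.arange(volume_emb.size(0), device=volume_emb.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


def negative_free_loss(online_pred, target_proj):
    """Assumed negative-free consistency between two augmented views of
    the same 3D patch: maximize cosine similarity to a detached target
    branch, so no negative pairs are required."""
    online_pred = F.normalize(online_pred, dim=-1)
    target_proj = F.normalize(target_proj.detach(), dim=-1)
    return (2.0 - 2.0 * (online_pred * target_proj).sum(dim=-1)).mean()


if __name__ == "__main__":
    # Toy usage with random embeddings (batch of 8, embedding dim 128).
    v, t = torch.randn(8, 128), torch.randn(8, 128)
    p, z = torch.randn(8, 128), torch.randn(8, 128)
    total = text_alignment_loss(v, t) + negative_free_loss(p, z)
    print(float(total))
```

A negative-free objective of this kind avoids the bias the abstract mentions: with strict positive-negative pairings, semantically similar patches from the same anatomy can be wrongly pushed apart, whereas a consistency-only loss does not require declaring any patch a negative.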