NFDI4DS | UHH-SEMS - Publication Details

AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV); 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology

DOI: 10.1609/aaai.v38i3.27963
Publication Date: 2024-03-25T09:19:00Z

AUTHORS (6)

Zhaopeng Gu

Bingke Zhu

Guibo Zhu

Yingying Chen

Ming Tang

Jinqiao Wang

ABSTRACT

Large Vision-Language Models (LVLMs) such as MiniGPT-4 and LLaVA have demonstrated the capability of understanding images and achieved remarkable performance in various visual tasks. Despite their strong abilities in recognizing common objects due to extensive training datasets, they lack specific domain knowledge and have a weaker understanding of localized details within objects, which hinders their effectiveness in the Industrial Anomaly Detection (IAD) task. On the other hand, most existing IAD methods only provide anomaly scores and necessitate the manual setting of thresholds to distinguish between normal and abnormal samples, which restricts their practical implementation. In this paper, we explore the utilization of LVLMs to address the IAD problem and propose AnomalyGPT, a novel IAD approach based on LVLMs. We generate training data by simulating anomalous images and producing corresponding textual descriptions for each image. We also employ an image decoder to provide fine-grained semantics and design a prompt learner to fine-tune the LVLM using prompt embeddings. Our AnomalyGPT eliminates the need for manual threshold adjustments and thus directly assesses the presence and locations of anomalies. Additionally, AnomalyGPT supports multi-turn dialogues and exhibits impressive few-shot in-context learning capabilities. With only one normal shot, AnomalyGPT achieves state-of-the-art performance with an accuracy of 86.1%, an image-level AUC of 94.1%, and a pixel-level AUC of 95.3% on the MVTec-AD dataset.
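
The abstract contrasts score-based IAD methods, which need a hand-picked threshold, with threshold-free metrics such as image-level AUC. The following minimal sketch illustrates that distinction and is not the authors' code: the scores and labels are made-up values, and `auc` and `accuracy_at` are hypothetical helper names introduced only for this example.

```python
def auc(scores, labels):
    """Threshold-free AUROC: the probability that a randomly chosen anomalous
    sample scores higher than a randomly chosen normal one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at(scores, labels, threshold):
    """Accuracy when a sample is called anomalous if its score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical anomaly scores (higher = more anomalous); 1 marks an anomaly.
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
labels = [0, 0, 1, 0, 1, 1]

print(f"AUC = {auc(scores, labels):.3f}")
for t in (0.05, 0.30, 0.70):
    print(f"threshold {t:.2f}: accuracy = {accuracy_at(scores, labels, t):.3f}")
```

On this toy data the AUC is a single number that does not depend on any cut-off, while the accuracy changes with each threshold tried. That dependence is the manual tuning step the abstract says AnomalyGPT avoids by answering directly whether an anomaly is present.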

CITATIONS (47)
