The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)

FOS: Computer and information sciences; Computer Science - Computation and Language; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; Computation and Language (cs.CL)

DOI: 10.48550/arxiv.2309.17421

Publication Date: 2023-01-01

AUTHORS (7)

Zhengyuan Yang

Linjie Li

Kevin Lin

Jianfeng Wang

Chung-Ching Lin

Zicheng Liu

Lijuan Wang

ABSTRACT

Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory skills, such as visual understanding, to achieve stronger generic intelligence. In this paper, we analyze the latest model, GPT-4V(ision), to deepen the understanding of LMMs. The analysis focuses on the intriguing tasks that GPT-4V can perform, containing test samples to probe the quality and genericity of GPT-4V's capabilities, its supported inputs and working modes, and the effective ways to prompt the model. In our approach to exploring GPT-4V, we curate and organize a collection of carefully designed qualitative samples spanning a variety of domains and tasks. Observations from these samples demonstrate that GPT-4V's unprecedented ability in processing arbitrarily interleaved multimodal inputs and the genericity of its capabilities together make GPT-4V a powerful multimodal generalist system. Furthermore, GPT-4V's unique capability of understanding visual markers drawn on input images can give rise to new human-computer interaction methods such as visual referring prompting. We conclude the report with in-depth discussions on the emerging application scenarios and the future research directions for GPT-4V-based systems. We hope that this preliminary exploration will inspire future research on the next-generation multimodal task formulation, new ways to exploit and enhance LMMs to solve real-world problems, and gaining better understanding of multimodal foundation models. Finally, we acknowledge that the model under study is solely the product of OpenAI's innovative work, and they should be fully credited for its development. Please see the GPT-4V contributions paper for the authorship and credit attribution: https://cdn.openai.com/contributions/gpt-4v.pdf
