NFDI4DS | UHH-SEMS - Publication Details

SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance

Adapter (computing) Scene graph

DOI: 10.48550/arxiv.2405.15321

Publication Date: 2024-05-24

AUTHORS (12)

Guibao Shen

Luozhou Wang

Jiantao Lin

Wenhang Ge

Chaozhe Zhang

Xin Tao

Yuan Zhang

Pengfei Wan

Zhongyuan Wang

Guangyong Chen

Yijun Li

Yingcong Chen

ABSTRACT

Recent advancements in text-to-image generation have been propelled by the development of diffusion models and multi-modality learning. However, since text is typically represented sequentially in these models, it often falls short in providing accurate contextualization and structural control. As a result, the generated images do not consistently align with human expectations, especially in complex scenarios involving multiple objects and relationships. In this paper, we introduce the Scene Graph Adapter (SG-Adapter), leveraging the structured representation of scene graphs to rectify inaccuracies in the original text embeddings. The SG-Adapter's explicit and non-fully connected graph representation greatly improves the fully connected, transformer-based text representations. This enhancement is particularly notable in maintaining precise correspondence in scenarios involving multiple relationships. To address the challenges posed by low-quality annotated datasets like Visual Genome, we have manually curated a highly clean, multi-relational scene graph-image paired dataset, MultiRels. Furthermore, we design three metrics derived from GPT-4V to effectively and thoroughly measure the correspondence between images and scene graphs. Both qualitative and quantitative results validate the efficacy of our approach in controlling the correspondence in multiple relationships.
