Capability-aware Prompt Reformulation Learning for Text-to-Image Generation

FOS: Computer and information sciences; Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

DOI: 10.48550/arxiv.2403.19716
Publication Date: 2024-03-27

AUTHORS (5)

Jingtao Zhan

Qingyao Ai

Yiqun Liu

Jia Chen

Shaoping Ma

ABSTRACT

Text-to-image generation systems have emerged as revolutionary tools in the realm of artistic creation, offering unprecedented ease in transforming textual prompts into visual art. However, the efficacy of these systems is intricately linked to the quality of user-provided prompts, which often poses a challenge to users unfamiliar with prompt crafting. This paper addresses this challenge by leveraging user reformulation data from interaction logs to develop an automatic prompt reformulation model. Our in-depth analysis of these logs reveals that user prompt reformulation is heavily dependent on the individual user's capability, resulting in significant variance in the quality of reformulation pairs. To effectively use this data for training, we introduce the Capability-aware Prompt Reformulation (CAPR) framework. CAPR innovatively integrates user capability into the reformulation process through two key components: the Conditional Reformulation Model (CRM) and Configurable Capability Features (CCF). CRM reformulates prompts according to a specified user capability, as represented by CCF. The CCF, in turn, offers the flexibility to tune and guide the CRM's behavior. This enables CAPR to effectively learn diverse reformulation strategies across various user capacities and to simulate high-capability user reformulation during inference. Extensive experiments on standard text-to-image generation benchmarks showcase CAPR's superior performance over existing baselines and its remarkable robustness on unseen systems. Furthermore, comprehensive analyses validate the effectiveness of different components. CAPR can facilitate user-friendly interaction with text-to-image systems and make advanced artistic creation more achievable for a broader range of users.
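
The conditioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how CCF is computed or how CRM consumes it, so the sketch assumes a single capability score in [0, 1] and uses a common stand-in technique, a control token prefixed to the input of a sequence-to-sequence model. All names (`ReformulationPair`, `capability_token`, `build_training_example`, `build_inference_input`) and the bucket count are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReformulationPair:
    """One logged reformulation (hypothetical schema, not from the paper)."""
    original: str
    reformulated: str
    capability: float  # assumed scalar capability score in [0, 1]


def capability_token(score: float, num_buckets: int = 5) -> str:
    """Map a capability score to a discrete control token such as '<cap_3>'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("capability score must lie in [0, 1]")
    # Clamp so that a score of exactly 1.0 falls into the top bucket.
    bucket = min(int(score * num_buckets), num_buckets - 1)
    return f"<cap_{bucket}>"


def build_training_example(pair: ReformulationPair, num_buckets: int = 5) -> tuple[str, str]:
    """Training: condition the source prompt on the capability observed in the log."""
    source = f"{capability_token(pair.capability, num_buckets)} {pair.original}"
    return source, pair.reformulated


def build_inference_input(prompt: str, num_buckets: int = 5) -> str:
    """Inference: request the highest capability level regardless of the actual user."""
    return f"{capability_token(1.0, num_buckets)} {prompt}"
```

Under these assumptions, every logged pair stays usable for training because its quality level is made explicit in the input, and at inference the model is simply asked for the top bucket.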
