Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models

Image editing
DOI: 10.48550/arXiv.2305.04441 Publication Date: 2023-05-08
ABSTRACT
Recently, large-scale language-image models (e.g., text-guided diffusion models) have considerably improved image generation capabilities, producing photorealistic images in various domains. Building on this success, current image editing methods use text to achieve intuitive and versatile modification of images. To edit a real image using diffusion models, one must first invert the image to a noisy latent from which an edited image is sampled with a target text prompt. However, most methods lack one of the following: user-friendliness (e.g., additional masks or precise descriptions of the input image are required), generalization to larger domains, or high fidelity to the input image. In this paper, we design an accurate and quick inversion technique, Prompt Tuning Inversion, for text-driven image editing. Specifically, our proposed editing method consists of a reconstruction stage and an editing stage. In the first stage, we encode the information of the input image into a learnable conditional embedding via Prompt Tuning Inversion. In the second stage, we apply classifier-free guidance to sample the edited image, where the conditional embedding is calculated by linearly interpolating between the target embedding and the optimized one obtained in the first stage. This technique ensures a superior trade-off between editability and high fidelity to the input image. For example, our method can change the color of a specific object while preserving its original shape and background under the guidance of only a target text prompt. Extensive experiments on ImageNet demonstrate the superior editing performance of our approach compared to state-of-the-art baselines.
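Below is a minimal PyTorch-style sketch of the two-stage procedure the abstract describes, not the authors' implementation. It assumes a generic noise-prediction network eps_model(z, t, cond), a deterministic DDIM update ddim_step(z, eps, t), and a precomputed DDIM-inversion trajectory of the input image; these names, the single shared embedding (per-timestep embeddings are also plausible), the initialization from the target embedding, and all hyperparameter defaults are illustrative assumptions.

import torch

def prompt_tuning_inversion(eps_model, ddim_step, trajectory, c_target,
                            n_iters=10, lr=1e-2):
    # Stage 1 (reconstruction): optimize a conditional embedding so that one
    # denoising step from each inverted latent z_t lands on the stored z_{t-1}.
    c_opt = c_target.detach().clone().requires_grad_(True)  # assumption: init from target
    optimizer = torch.optim.Adam([c_opt], lr=lr)
    for z_t, z_prev, t in trajectory:  # (z_t, z_{t-1}, t) pairs along the inversion path
        for _ in range(n_iters):
            eps = eps_model(z_t, t, c_opt)
            z_pred = ddim_step(z_t, eps, t)  # deterministic DDIM step toward t-1
            loss = torch.nn.functional.mse_loss(z_pred, z_prev)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return c_opt.detach()

def edit_sample(eps_model, ddim_step, z_T, c_opt, c_target, c_null,
                timesteps, eta=0.5, guidance_scale=7.5):
    # Stage 2 (editing): classifier-free-guided sampling with the interpolated
    # embedding c = eta * c_target + (1 - eta) * c_opt.
    c = eta * c_target + (1.0 - eta) * c_opt
    z = z_T
    for t in timesteps:  # from the noisiest step down to t = 0
        eps_uncond = eps_model(z, t, c_null)  # unconditional branch
        eps_cond = eps_model(z, t, c)         # conditional branch
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
        z = ddim_step(z, eps, t)
    return z

Intuitively, eta controls the trade-off the abstract mentions: values near 0 reproduce the input image (high fidelity), while values near 1 follow the target prompt (high editability).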