MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition

DOI: 10.48550/arxiv.2307.01097

Publication Date: 2023-01-01

AUTHORS (5)

Tang, Shitao

Zhang, Fuyang

Chen, Jiacheng

Wang, Peng

Furukawa, Yasutaka

ABSTRACT

This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama, or multi-view images given depth maps and poses). Unlike prior methods that rely on iterative image warping and inpainting, MVDiffusion generates all images simultaneously with global awareness, effectively addressing the prevalent error accumulation issue. At its core, MVDiffusion processes perspective images in parallel with a pre-trained text-to-image diffusion model, while integrating novel correspondence-aware attention layers to facilitate cross-view interactions. For panorama generation, although trained on only 10k panoramas, MVDiffusion can generate high-resolution photorealistic images from arbitrary text prompts or extrapolate one perspective image to a 360-degree view. For multi-view depth-to-image generation, MVDiffusion demonstrates state-of-the-art performance for texturing a scene mesh.

Project page: https://mvdiffusion.github.io; NeurIPS 2023 (spotlight); compressed camera-ready version
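The pixel-to-pixel correspondences that the abstract mentions for the panorama case can be illustrated with standard pinhole geometry. Two perspective crops of one panorama share a camera centre, so a pixel in one crop maps to the other by a pure rotation. The sketch below is not the authors' code; the function names, the 256-pixel image size, the 90-degree field of view and the yaw angle are all illustrative assumptions.

```python
import math


def focal_from_fov(width, fov_deg):
    """Focal length in pixels for a pinhole camera with the given horizontal FOV."""
    return (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def correspond(u, v, yaw_deg, width=256, height=256, fov_deg=90.0):
    """Map pixel (u, v) in view 1 to its location in view 2.

    Assumptions (hypothetical setup, not from the paper): both views share
    one camera centre and the same intrinsics, and view 2 is view 1 turned
    by yaw_deg about the vertical axis (x right, y down, z forward).
    Returns None when the ray points behind view 2.
    """
    f = focal_from_fov(width, fov_deg)
    cx, cy = width / 2.0, height / 2.0

    # Back-project the pixel to a viewing ray in view 1's camera frame.
    x, y, z = (u - cx) / f, (v - cy) / f, 1.0

    # Express the same ray in view 2's frame (inverse of the camera yaw).
    t = math.radians(yaw_deg)
    x2 = math.cos(t) * x - math.sin(t) * z
    z2 = math.sin(t) * x + math.cos(t) * z
    if z2 <= 0:
        return None

    # Re-project onto view 2's image plane; the result may fall outside
    # the image bounds, in which case the pixel has no match in view 2.
    return (f * x2 / z2 + cx, f * y / z2 + cy)


if __name__ == "__main__":
    # Centre pixel of view 1, with view 2 turned 45 degrees to the right.
    print(correspond(128, 128, yaw_deg=45.0))
```

With these illustrative values, the centre pixel of view 1 lands approximately on the left edge of view 2, which is the kind of overlap a cross-view attention layer could exploit. The depth-and-pose case in the abstract would additionally need per-pixel depth and a translation, which this sketch omits.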

