VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV)

DOI: 10.48550/arxiv.2310.19512

Publication Date: 2023-01-01

AUTHORS (12)

Haoxin Chen

Menghan Xia

Yingqing He

Yong Zhang

Xiaodong Cun

Shaoshu Yang

Jinbo Xing

Yaofang Liu

Qifeng Chen

Xintao Wang

Chao Weng

Ying Shan

ABSTRACT

Video generation has increasingly gained interest in both academia and industry. Although commercial tools can generate plausible videos, there is a limited number of open-source models available for researchers and engineers. In this work, we introduce two diffusion models for high-quality video generation, namely text-to-video (T2V) and image-to-video (I2V) models. T2V models synthesize a video based on a given text input, while I2V models incorporate an additional image input. Our proposed T2V model can generate realistic and cinematic-quality videos with a resolution of $1024 \times 576$, outperforming other open-source T2V models in terms of quality. The I2V model is designed to produce videos that strictly adhere to the content of the provided reference image, preserving its content, structure, and style. This model is the first open-source I2V foundation model capable of transforming a given image into a video clip while maintaining content preservation constraints. We believe that these open-source video generation models will contribute significantly to the technological advancements within the community.
