VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV)
DOI: 10.48550/arxiv.2310.19512
Publication Date: 2023-01-01
ABSTRACT
Video generation has increasingly gained interest in both academia and industry. Although commercial tools can generate plausible videos, there is a limited number of open-source models available for researchers and engineers. In this work, we introduce two diffusion models for high-quality video generation, namely text-to-video (T2V) and image-to-video (I2V) models. T2V models synthesize a video based on a given text input, while I2V models incorporate an additional image input. Our proposed T2V model can generate realistic and cinematic-quality videos with a resolution of $1024 \times 576$, outperforming other open-source T2V models in terms of quality. The I2V model is designed to produce videos that strictly adhere to the content of the provided reference image, preserving its content, structure, and style. This model is the first open-source I2V foundation model capable of transforming a given image into a video clip while maintaining content preservation constraints. We believe that these open-source video generation models will contribute significantly to the technological advancements within the community.