StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Video editing  Image editing
DOI: 10.48550/arxiv.2308.09592 Publication Date: 2023-01-01
ABSTRACT
Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing in practical scenarios. In this paper, we tackle this problem by introducing temporal dependency to existing text-driven diffusion models, which allows them to generate a consistent appearance for the edited objects. Specifically, we develop a novel inter-frame propagation mechanism for diffusion video editing, which leverages the concept of layered representations to propagate appearance information from one frame to the next. We then build up a text-driven video editing framework based on this mechanism, namely StableVideo, which achieves consistency-aware video editing. Extensive experiments demonstrate the strong editing capability of our approach. Compared with state-of-the-art video editing methods, our approach shows superior qualitative and quantitative results. Our code is available at \href{https://github.com/rese1f/StableVideo}{this https URL}.
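The sketch below is a minimal, hypothetical illustration of the inter-frame propagation idea described in the abstract, not the authors' implementation: keyframes are edited one at a time, and the edited appearance is carried forward through a shared layered (atlas-like) representation so that later edits refine a propagated appearance rather than inventing a new one. The helpers `edit_with_diffusion`, `frame_to_atlas`, and `atlas_to_frame` are assumed placeholders for a text-driven image editor and the per-frame mappings of a layered representation.

```python
# Minimal sketch (assumed names, not the StableVideo codebase) of
# consistency-aware keyframe editing via inter-frame propagation.

from typing import Callable, List, Optional
import numpy as np


def propagate_edits(
    keyframes: List[np.ndarray],                       # H x W x 3 keyframes
    frame_to_atlas: Callable[[np.ndarray, int], np.ndarray],   # hypothetical mapping
    atlas_to_frame: Callable[[np.ndarray, int], np.ndarray],   # hypothetical mapping
    edit_with_diffusion: Callable[
        [np.ndarray, str, Optional[np.ndarray]], np.ndarray
    ],                                                 # hypothetical text-driven editor
    prompt: str,
) -> List[np.ndarray]:
    """Edit keyframes sequentially, carrying appearance forward via the atlas."""
    edited_frames: List[np.ndarray] = []
    atlas_appearance: Optional[np.ndarray] = None      # shared edited appearance

    for i, frame in enumerate(keyframes):
        if atlas_appearance is None:
            # First keyframe: unconstrained text-driven edit.
            edited = edit_with_diffusion(frame, prompt, None)
        else:
            # Later keyframes: initialize the edit with the appearance
            # propagated from previously edited keyframes, so the diffusion
            # edit refines the existing appearance instead of re-creating it.
            propagated = atlas_to_frame(atlas_appearance, i)
            edited = edit_with_diffusion(frame, prompt, propagated)

        edited_frames.append(edited)
        # Aggregate the new edit back into the shared atlas appearance.
        atlas_appearance = frame_to_atlas(edited, i)

    return edited_frames
```

The key design point this sketch tries to capture is that temporal consistency comes from conditioning each keyframe edit on appearance propagated through the shared layered representation, rather than from editing every frame independently.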