FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models

FOS: Computer and information sciences — Computer Vision and Pattern Recognition (cs.CV)
DOI: 10.48550/arxiv.2406.16863 Publication Date: 2024-06-24
ABSTRACT
Diffusion models have demonstrated remarkable capability in video generation, which further sparks interest in introducing trajectory control into the generation process. While existing works mainly focus on training-based methods (e.g., conditional adapters), we argue that the diffusion model itself allows decent control over the generated content without requiring any training. In this study, we introduce a tuning-free framework to achieve trajectory-controllable video generation by imposing guidance on both noise construction and attention computation. Specifically, 1) we first show several instructive phenomena and analyze how initial noises influence the motion of generated content. 2) Subsequently, we propose FreeTraj, a tuning-free approach that enables trajectory control by modifying noise sampling and attention mechanisms. 3) Furthermore, we extend FreeTraj to facilitate longer and larger video generation with controllable trajectories. Equipped with these designs, users have the flexibility to provide trajectories manually or opt for trajectories automatically generated by an LLM trajectory planner. Extensive experiments validate the efficacy of our framework in enhancing the trajectory controllability of video diffusion models.
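The abstract's first idea, guiding generation through noise construction, can be illustrated with a minimal sketch: if the initial noise inside a moving bounding box is shared across frames, the denoiser is nudged to place consistent content along that trajectory. The function below is a hypothetical illustration of this principle, not the paper's actual implementation; all names, the patch size, and the box format are assumptions.

```python
import numpy as np

def trajectory_noise(frames, channels, height, width, boxes, seed=0):
    """Illustrative sketch (assumption, not the paper's API):
    build per-frame initial noise where each frame's bounding box
    contains the same shared noise patch, encouraging the diffusion
    sampler to generate consistent content moving along `boxes`.
    `boxes` is a list of (top, left) offsets, one per frame."""
    rng = np.random.default_rng(seed)
    # independent Gaussian noise for the whole video latent
    noise = rng.standard_normal((frames, channels, height, width))
    # fixed patch size covering a quarter of each spatial dimension
    ph, pw = height // 4, width // 4
    # one shared patch reused in every frame's box
    shared = rng.standard_normal((channels, ph, pw))
    for f, (top, left) in enumerate(boxes):
        noise[f, :, top:top + ph, left:left + pw] = shared
    return noise

# usage: a target region drifting diagonally over 4 frames
boxes = [(0, 0), (2, 2), (4, 4), (6, 6)]
noise = trajectory_noise(4, 4, 16, 16, boxes)
```

In this sketch the noise outside the boxes stays independent per frame, so only the trajectory region is correlated across time.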