NFDI4DS | UHH-SEMS - Publication Details

Multi-sentence Video Grounding for Long Video Generation

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV)

DOI: 10.48550/arxiv.2407.13219

Publication Date: 2024-07-18


AUTHORS (5)

Wei Feng

Xin Wang

Hong Chen

Ze-Yang Zhang

Wenwu Zhu

ABSTRACT

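Video generation has witnessed great success recently, but its application to long videos remains challenging because of the difficulty of maintaining the temporal consistency of generated videos and the high memory cost during generation. To tackle these problems, in this paper we propose a brave new idea, Multi-sentence Video Grounding for Long Video Generation, connecting the massive video moment retrieval task to the video generation task for the first time and providing a new paradigm for long video generation. The method of our work can be summarized in three steps: (i) We design sequential scene text prompts as the queries for video grounding, utilizing massive video moment retrieval to search for video moment segments that meet the text requirements in the video database. (ii) Based on the source frames of the retrieved video moment segments, we adopt video editing methods to create new video content while preserving the temporal consistency of the retrieved video. Since the editing can be conducted segment by segment, and even frame by frame, it largely reduces the memory cost. (iii) We also attempt video morphing and personalized generation methods to improve the subject consistency of long video generation, providing ablation experimental results for the subtasks of long video generation. Our approach seamlessly extends the development of image/video editing, video morphing and personalized generation, and video grounding to long video generation, offering effective solutions for generating long videos at low memory cost.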
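Steps (i) and (ii) of the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the `Segment` class, the word-overlap score, and the `edit_frame` callback are all hypothetical stand-ins, used here in place of a learned video grounding model and a real video editing method. Frames are represented as plain strings so the example runs without any video library.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """A hypothetical database entry: one video moment with a text caption."""
    video_id: str
    start: float          # start time in seconds
    end: float            # end time in seconds
    caption: str          # toy stand-in for the segment's visual content
    frames: List[str]     # toy stand-in for decoded frames


def overlap_score(query: str, caption: str) -> float:
    """Jaccard word overlap; assumed here instead of a learned grounding model."""
    q, c = set(query.lower().split()), set(caption.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def ground(query: str, database: List[Segment]) -> Segment:
    """Step (i): return the segment that best matches one scene prompt."""
    return max(database, key=lambda seg: overlap_score(query, seg.caption))


def edit_segment(segment: Segment, edit_frame: Callable[[str], str]) -> List[str]:
    """Step (ii): edit one frame at a time, so only one frame is held per call."""
    return [edit_frame(frame) for frame in segment.frames]


def generate_long_video(prompts: List[str], database: List[Segment],
                        edit_frame: Callable[[str], str]) -> List[str]:
    """Ground each sequential prompt, edit its segment, and concatenate."""
    video: List[str] = []
    for prompt in prompts:
        video.extend(edit_segment(ground(prompt, database), edit_frame))
    return video


database = [
    Segment("v1", 0.0, 4.0, "a person chops vegetables", ["v1_f0", "v1_f1"]),
    Segment("v1", 4.0, 9.0, "a person fries vegetables in a pan",
            ["v1_f2", "v1_f3", "v1_f4"]),
    Segment("v2", 0.0, 3.0, "a dog runs on a beach", ["v2_f0"]),
]
prompts = ["person chops vegetables", "person fries vegetables in a pan"]
video = generate_long_video(prompts, database, lambda f: f + "_edited")
print(video)
```

Because the edit is applied per segment and per frame, peak memory in this sketch depends on a single frame rather than on the length of the whole output, which mirrors the memory argument made in step (ii). Step (iii), video morphing and personalized generation, is not modeled here.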
