NFDI4DS | UHH-SEMS - Publication Details

FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

DOI: 10.48550/arxiv.2403.12962
Publication Date: 2024-03-19

AUTHORS (4)

Shuai Yang

Yifan Zhou

Ziwei Liu

Chen Change Loy

ABSTRACT

The remarkable efficacy of text-to-image diffusion models has motivated extensive exploration of their potential application in video domains. Zero-shot methods seek to extend image diffusion models to videos without necessitating model training. Recent methods mainly focus on incorporating inter-frame correspondence into attention mechanisms. However, the soft constraint imposed on determining where to attend to valid features can sometimes be insufficient, resulting in temporal inconsistency. In this paper, we introduce FRESCO, which combines intra-frame correspondence with inter-frame correspondence to establish a more robust spatial-temporal constraint. This enhancement ensures a more consistent transformation of semantically similar content across frames. Beyond mere attention guidance, our approach involves an explicit update of features to achieve high spatial-temporal consistency with the input video, significantly improving the visual coherence of the translated videos. Extensive experiments demonstrate the effectiveness of the proposed framework in producing high-quality, coherent videos, marking a notable improvement over existing zero-shot methods.
