FlowVid: Taming Imperfect Optical Flows: Generation: Edit the First Frame Then Propagate

by Kinetograph: The Video Editing Technology Publication

October 9th, 2024

Too Long; Didn't Read

This paper proposes a consistent V2V synthesis framework by jointly leveraging spatial conditions and temporal optical flow clues within the source video.

Academic Research Paper

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

(1) Feng Liang, The University of Texas at Austin; work partially done during an internship at Meta GenAI (Email: jeffliang@utexas.edu);

(2) Bichen Wu, Meta GenAI (corresponding author);

(3) Jialiang Wang, Meta GenAI;

(4) Licheng Yu, Meta GenAI;

(5) Kunpeng Li, Meta GenAI;

(6) Yinan Zhao, Meta GenAI;

(7) Ishan Misra, Meta GenAI;

(8) Jia-Bin Huang, Meta GenAI;

(9) Peizhao Zhang, Meta GenAI (Email: stzpz@meta.com);

(10) Peter Vajda, Meta GenAI (Email: vajdap@meta.com);

(11) Diana Marculescu, The University of Texas at Austin (Email: dianam@utexas.edu).

4.3. Generation: edit the first frame then propagate

Another strategy we found beneficial is integrating self-attention features from DDIM inversion, a technique also used in FateZero [35] and TokenFlow [13]. This integration helps preserve the original structure and motion of the input video. Concretely, we use DDIM inversion to invert the input video with the original prompt and save the intermediate self-attention maps at multiple timesteps, usually 20. Then, during generation guided by the target prompt, we replace the keys and values within the self-attention modules with the corresponding saved maps.
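The save-then-replace pattern described above can be illustrated with a toy sketch. This is not FlowVid's implementation: the names `KVStore` and `self_attention`, the layer label `"down.0"`, and the `"invert"`/`"generate"` modes are hypothetical, the learned query/key/value projections are replaced by identity maps, and the cache stores keys and values directly, which is one possible reading of "saving the self-attention maps". It only shows the mechanism: cache keys and values per (timestep, layer) during inversion, then reuse them during generation so the edited queries attend to source-video features.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention over lists of token vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_tok in q:
        scores = [scale * sum(a * b for a, b in zip(q_tok, k_tok)) for k_tok in k]
        weights = softmax(scores)
        out.append([sum(w * v_tok[j] for w, v_tok in zip(weights, v))
                    for j in range(len(v[0]))])
    return out


class KVStore:
    """Hypothetical cache of keys/values, indexed by (timestep, layer name)."""

    def __init__(self):
        self._cache = {}

    def save(self, t, layer, k, v):
        # Copy the rows so later mutation of the inputs cannot alter the cache.
        self._cache[(t, layer)] = ([r[:] for r in k], [r[:] for r in v])

    def load(self, t, layer):
        # Raises KeyError if nothing was saved for this timestep and layer.
        return self._cache[(t, layer)]


def self_attention(tokens, store, t, layer, mode):
    """Toy self-attention with identity projections (an assumption).

    mode="invert":   inversion pass on the source video; cache K and V.
    mode="generate": generation pass; swap in the cached K and V.
    """
    q = k = v = tokens
    if mode == "invert":
        store.save(t, layer, k, v)
    elif mode == "generate":
        k, v = store.load(t, layer)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return attention(q, k, v)


# Usage: one layer, one timestep, two tokens of dimension two.
store = KVStore()
source_tokens = [[1.0, 0.0], [0.0, 1.0]]    # features from the inversion pass
edited_tokens = [[0.5, 0.5], [2.0, -1.0]]   # features from the generation pass

self_attention(source_tokens, store, t=10, layer="down.0", mode="invert")
out = self_attention(edited_tokens, store, t=10, layer="down.0", mode="generate")
```

In this toy setup the generation-pass output is a weighted mix of the cached source values only, which is the sense in which the substitution carries the source structure into the edited result. A real pipeline would hook the cache into each self-attention module of the denoising network and key it by the actual sampler timesteps.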


This paper is available on arXiv under CC 4.0 license.

