
FlowVid: Taming Imperfect Optical Flows: Training With Joint Spatial-temporal Conditions

by EScholar: Electronic Academic Papers for Scholars

October 9th, 2024

Too Long; Didn't Read

This paper proposes a consistent video-to-video (V2V) synthesis framework that jointly leverages spatial conditions and temporal optical flow clues within the source video.
Academic Research Paper: part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Authors:

(1) Feng Liang, The University of Texas at Austin; work partially done during an internship at Meta GenAI (Email: jeffliang@utexas.edu);

(2) Bichen Wu, Meta GenAI, corresponding author;

(3) Jialiang Wang, Meta GenAI;

(4) Licheng Yu, Meta GenAI;

(5) Kunpeng Li, Meta GenAI;

(6) Yinan Zhao, Meta GenAI;

(7) Ishan Misra, Meta GenAI;

(8) Jia-Bin Huang, Meta GenAI;

(9) Peizhao Zhang, Meta GenAI (Email: stzpz@meta.com);

(10) Peter Vajda, Meta GenAI (Email: vajdap@meta.com);

(11) Diana Marculescu, The University of Texas at Austin (Email: dianam@utexas.edu).

4.2. Training with joint spatial-temporal conditions

Upon expanding the image model, a straightforward method might be to train the video model using paired depth-video data. Yet our empirical analysis indicates that this leads to sub-optimal results, as detailed in the ablation study in Section 5.4. We hypothesize that this method neglects the temporal clues within the video, making frame consistency hard to maintain. While some studies, such as Rerender [49] and CoDeF [32], incorporate optical flow in video synthesis, they typically apply it as a rigid constraint. In contrast, our approach uses flow as a soft condition, allowing us to manage the imperfections commonly found in flow estimation.
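
To make the soft-condition idea concrete, below is a minimal PyTorch sketch (not the paper's actual implementation): it warps the previously edited frame toward the current frame using the estimated optical flow and concatenates it with a spatial depth condition, while an occlusion mask zeroes out pixels where the flow is unreliable so the warped frame acts as a weak hint rather than a rigid constraint. Function names such as warp_with_flow and build_condition, as well as the channel layout, are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def warp_with_flow(prev_frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp prev_frame (B, C, H, W) toward the current frame using optical flow (B, 2, H, W)."""
    _, _, h, w = prev_frame.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=flow.device, dtype=flow.dtype),
        torch.arange(w, device=flow.device, dtype=flow.dtype),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # displaced x coordinates
    grid_y = ys.unsqueeze(0) + flow[:, 1]  # displaced y coordinates
    # Normalize to [-1, 1], as grid_sample expects.
    grid = torch.stack(
        (2.0 * grid_x / (w - 1) - 1.0, 2.0 * grid_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(prev_frame, grid, align_corners=True)


def build_condition(prev_edited, flow, depth, occlusion_mask):
    """Joint condition: flow-warped previous frame (temporal clue) + depth map (spatial clue).

    occlusion_mask (B, 1, H, W) is 0 where the estimated flow is unreliable, so the
    warped frame serves as a soft hint rather than a hard per-pixel constraint.
    """
    warped = warp_with_flow(prev_edited, flow)
    return torch.cat([warped * occlusion_mask, occlusion_mask, depth], dim=1)


# Example shapes; during training such a tensor would be fed to the video model
# (e.g. through a ControlNet-style branch) alongside the noisy latents.
cond = build_condition(
    prev_edited=torch.randn(1, 3, 64, 64),
    flow=torch.randn(1, 2, 64, 64),
    depth=torch.randn(1, 1, 64, 64),
    occlusion_mask=torch.ones(1, 1, 64, 64),
)
print(cond.shape)  # torch.Size([1, 5, 64, 64])
```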




This paper is available on arXiv under a CC 4.0 license.

