Date: 2022-11-19

Too Long; Didn't Read

Have you ever imagined being able to take a picture and just magically dive into it as if it were a door to another world? Well, whether you thought about this or not, some people did, and thanks to them, it is now possible with AI! This is just one step away from teleportation and being able to be there physically. Maybe one day AI will help with that and fix an actual problem too! I’m just kidding, this is really cool, and I’m glad some people are working on it. This is InfiniteNature… Zero! It is called this way because it is a follow-up to a paper I previously covered called InfiniteNature. What’s the difference? Quality!

Video Transcript

have you ever imagined being able to

take a picture and just magically dive

into it as if it were a door to

another world well whether you thought

about this or not some people did and

thanks to them it's now possible with AI

this is just one step away from

teleportation and being able to be there

physically maybe one day AI will help

with that and fix an actual problem too

I'm just kidding this is really cool and

I'm glad some people are working on it

this is InfiniteNature-Zero it's called

this way because it's a follow-up on a

paper I previously covered called

InfiniteNature what's the difference

quality just look at that it's so much

better in only one paper it's incredible

you can actually feel like you are

diving into the picture and it only

requires one input picture how cool is

that the only thing even cooler is how

it works let's dive into it but first

allow me 10 seconds of your time for a

sponsor of this video myself yes only 10

seconds I don't think I deserve more

compared to the amazing companies that

usually sponsor my work if you like the

videos first I think you should

subscribe to the channel but I also

think you will love my two newsletters

where I share daily research papers and

news and the weekly one where I share

these videos and very interesting

discussions related to these papers and

AI ethics you should probably follow me

on Twitter as well at what's AI if you'd

like to stay up to date with the news

and papers in the field tons are coming

out with the CVPR deadlines that just

passed and you don't want to miss out on

those so how does InfiniteNature-Zero

work it all starts with a single image

you send as input yes a single image it

doesn't require a video or multiple

views or anything else this is different

from their previous paper that I also

covered where they needed videos to help

the model understand natural scenes

during training which is also why they

call this model InfiniteNature-Zero

because it requires zero videos here

their work is divided into three methods

used during training in order to get

those results to start the model

randomly samples two virtual camera

trajectories which will tell you where

you are going in the image why two

because the first is necessary to

generate a new view telling you where to

fly into the image to generate a second

image this is the actual trajectory you

will be taking the second virtual

trajectory is used during training to

dive and return to the original image to

teach the model to learn geometry-aware

view refinement during view generation

in a self-supervised way as we teach it

to get back to an image we already have

in our training data set they refer to

this approach as a cyclic virtual camera

trajectory as the starting and ending

views are the same our input image they

do that by going to a virtual or fake

sampled viewpoint and returning to the

original view afterward just to teach

the reconstruction part to the model the

viewpoints are sampled using an

algorithm called the autopilot algorithm

to find the sky and not skydive into

rocks or the ground as nobody would like

to do that then during training we use a

GAN-like approach using a discriminator

to measure how much the new view

generated looks like a real image

represented with L adversarial or L_adv
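
To make the adversarial term just mentioned more concrete, here is a minimal sketch in plain Python. It assumes the standard non-saturating GAN formulation on raw discriminator scores (logits); the paper's exact loss may differ, and the function names and the example scores below are hypothetical, chosen only for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(real_logits, fake_logits):
    """Low when real images get high scores and generated views get low scores."""
    real_term = sum(softplus(-r) for r in real_logits) / len(real_logits)
    fake_term = sum(softplus(f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def generator_adv_loss(fake_logits):
    """Adversarial term for the generator (assumed non-saturating form):
    low when the discriminator scores the generated views as real."""
    return sum(softplus(-f) for f in fake_logits) / len(fake_logits)


# Hypothetical discriminator scores; a positive logit means "looks real".
real_logits = [2.0, 1.5]
fooled = [1.8, 2.2]    # generated views the discriminator believes are real
caught = [-2.0, -1.5]  # generated views the discriminator flags as fake

# The generator is rewarded (lower loss) when its views pass as real.
print(generator_adv_loss(fooled) < generator_adv_loss(caught))
```

In this setup the generator never sees a ground-truth image for the new viewpoint; the only training signal for that view is whether the discriminator finds it realistic.
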

so yes GANs aren't dead yet this is a

very cool application of them for

guiding the training when you don't have

any ground truth for example when you

don't have infinite images in this case

basically they use another model a

discriminator trained on our training

data set that can see if an image seems

to be part of it or not so based on its

answer you can improve the generation to

make it look like an image from our data

set which supposedly looks realistic we

also measure the difference between our

regenerated initial image and the

original one to help the model

iteratively get better at reconstructing

it represented by L_rec here and

we simply repeat this process multiple

times to generate our novel frames and

create these kinds of videos there's one

last thing to tweak before getting those

amazing results they saw that with their

approach the sky due to its infinite

nature compared to the ground changes

way too quickly to fix that they use

another segmentation model to find the

sky automatically in the generated

images and fix it using an intelligent

blending system between the generated

sky and the sky from our initial image

so that it doesn't change too quickly

and unrealistically after training with

this two-step process and sky

refinement InfiniteNature-Zero allows you

to have stable long-range trajectories

for natural scenes as well as accurately

generate novel views that are

geometrically coherent and voila this is

how you can take a picture and dive into

it as if you were a bird I invite you to

read their paper for more details on

their method and implementation

especially regarding how they managed to

train their model in such a clever way

as I omitted some technical details

making this possible for simplicity by

the way the code is available and linked

below if you'd like to try it let me

know if you do and send me the results

I'd love to see them thank you for

watching and I hope you've enjoyed this

video I will see you next week with

another amazing paper





