Introducing NVIDIA's EditGAN: Alter Images Instantly via Quick Sketches

Written by whatsai | Published 2021/12/05
Tech Story Tags: ai | nvidia | artificial-intelligence | gans | computer-vision | machine-learning | innovation | ml | web-monetization

TLDR: EditGAN allows you to control any feature from quick drafts, and it will only edit what you want, keeping the rest of the image the same. It is a state-of-the-art sketch-based image editing model built on GANs by NVIDIA, MIT, and the University of Toronto. Watch more results and learn how it works in the video: https://www.youtube.com/watch?time_continue=11&v=bus4OGyMQec&feature=emb_logo

Have you ever dreamed of being able to edit any part of a picture with quick sketches or suggestions? Or maybe you wanted to change specific features, like someone's eyes or eyebrows in an image, or even the wheels of your car? Well, it is not only possible, but it is now easier than ever with this new model called EditGAN, and the results are really impressive!
Control any feature from quick drafts, and it will only edit what you want, keeping the rest of the image the same! It is a state-of-the-art sketch-based image editing model built on GANs by NVIDIA, MIT, and UofT. Watch more results and learn how it works in the video!

Watch the video

References

►Read the full article: https://www.louisbouchard.ai/editgan/
► Paper: Ling, H., Kreis, K., Li, D., Kim, S.W., Torralba, A. and Fidler, S., 2021. EditGAN: High-Precision Semantic Image Editing. In Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS 2021).
► Code and interactive tool (arriving soon): https://nv-tlabs.github.io/editGAN/
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/

Video Transcript

Have you ever dreamed of being able to edit any part of a picture with quick sketches or suggestions? Well, it's not only possible, but it has never been easier than now with this new model called EditGAN, and the results are really impressive. You can basically improve or modify any picture super quickly. Indeed, you can control whatever feature you want from quick drafts, and it will only apply your modifications, keeping the rest of the image the same.

This kind of control has long been sought after and remains extremely challenging to obtain with image synthesis and image editing AI models like GANs. You can see how having extra control is useful for image editing and how it improves the quality of the work you create.

When running machine learning projects, the quality of the work you produce is directly correlated with the quality of your tools and the level of control they give. Thankfully for us, it's easier to take control of your machine learning projects using this episode's sponsor, Weights & Biases. By tracking all of the input hyperparameters, output metrics, and any insights that you or your team have, you know that your work is saved and under control. One aspect that I love for teams is Weights & Biases Reports. I love how I can easily capture all of my project's charts and findings in reports that I can share with my team to get feedback. The charts are interactive and tracked with Weights & Biases, so I know my work is reproducible. I feel lucky that I get to spend time trying to make research look simple and clear for you all, and Weights & Biases is trying to do the same with their platform. I'd love for you to check them out through the first link below, because they are helping me continue making these videos and grow this channel.
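If you have never used it, a minimal sketch of that kind of experiment tracking with the wandb Python library looks like this; the project name, hyperparameters, and "training step" below are made up for illustration.

```python
# A minimal, illustrative experiment-tracking sketch with the wandb library.
# The project name, hyperparameters, and training step are placeholders, not from the paper.
import random

import wandb

def train_one_step() -> float:
    """Placeholder for a real training step; returns a fake loss value."""
    return random.random()

run = wandb.init(
    project="editgan-experiments",                    # hypothetical project name
    config={"learning_rate": 1e-4, "batch_size": 8},  # input hyperparameters, saved with the run
)

for step in range(100):
    loss = train_one_step()
    wandb.log({"loss": loss})                         # output metrics, tracked per step

run.finish()                                          # mark the run as complete
```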
As we said, this fantastic new paper from NVIDIA, the University of Toronto, and MIT allows you to edit any picture with superb control over specific features from sketch inputs. Typically, controlling specific features requires huge datasets and experts who know which features to change within the model to obtain the desired output image with only the wanted features altered. Instead, EditGAN learns from only a handful of labeled example images to match segmentations to images, allowing you to edit images with segmentations, or in other words, with quick sketches. It preserves the full image quality while allowing a level of detail and freedom never achieved before. This is such a great jump forward, but what's even cooler is how they achieve it, so let's dive a bit deeper into their model.
02:29
dive a bit deeper into their model first
02:31
the model uses talgen 2 to generate
02:34
images which is the best image
02:36
generation model available at the time
02:38
of the publication and is widely used in
02:40
research i won't dive into the details
02:42
of this model since i already covered it
02:44
in numerous videos with different
02:46
applications if you'd like to learn more
02:48
about it instead i will assume you have
02:50
a basic knowledge of what style gun 2
02:52
does take an image encode it into a
02:55
condensed subspace and use a type of
02:58
model called a generator to transform
03:00
this encoded subspace into another image
03:03
this also works using directly encoded
03:05
information instead of encoding an image
03:08
to obtain this information what's
03:10
important here is the generator as i
03:12
said it will take information from a
03:14
subspace often referred to as latent
03:17
space where we have a lot of information
03:19
about our image and its features but the
03:22
space is multi-dimensional and we can
03:24
hardly visualize it the challenge is to
03:26
identify which part of the subspace is
03:28
responsible for reconstructing which
03:30
feature in the image this is where
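To make that latent-space picture concrete, here is a minimal sketch, not the authors' code, of how a StyleGAN2-style generator is typically used: decode a latent code into an image, and optionally "encode" an existing photo by optimizing a latent code until the generator reproduces it (GAN inversion). The toy Generator below is a hypothetical stand-in for the real network.

```python
import torch

# Toy stand-in for a pretrained StyleGAN2-style generator: it maps a latent code to an image.
# In the real model this is a deep convolutional network; a linear layer is enough to show the idea.
class Generator(torch.nn.Module):
    def __init__(self, latent_dim: int = 512, size: int = 64):
        super().__init__()
        self.size = size
        self.net = torch.nn.Linear(latent_dim, 3 * size * size)

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        return self.net(w).view(-1, 3, self.size, self.size)

generator = Generator()

# 1) Generate an image directly from a latent code (no input image needed).
w = torch.randn(1, 512)          # a point in the latent space
image = generator(w)             # the generator decodes it into an image

# 2) "Encode" an existing image by optimization (GAN inversion): find the latent code
#    whose generated image matches the target. Every editable feature lives in that code.
target = torch.rand(1, 3, 64, 64)                # pretend this is the photo we want to edit
w_opt = torch.randn(1, 512, requires_grad=True)
optimizer = torch.optim.Adam([w_opt], lr=0.01)
for _ in range(200):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(generator(w_opt), target)
    loss.backward()
    optimizer.step()
```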
This is where EditGAN comes into play, not only telling you which part of the subspace does what, but also allowing you to edit it automatically using another input: a sketch that you can easily draw. Indeed, it will encode your image, or simply take a specific latent code, and generate both the segmentation map of the picture and the picture itself. By training a model to do that, both the segmentations and the images live in the same subspace, and this allows for control of only the desired features without you having to do anything else: you simply need to change the segmentation image, and the other will follow. The training involves only this new segmentation generation; the StyleGAN generator stays fixed for the original image. This allows the model to understand and link the segmentations to the same subspace the generator needs to reconstruct the image. Then, if trained correctly, you can simply edit the segmentation and it will change the image accordingly.
04:28
image accordingly edit can will
04:30
basically assign each pixel of your
04:32
image to a specific class such as head
04:34
ear eye etc and control these classes
04:38
independently using masks covering the
04:40
pixels of other classes within the
04:42
latent space so each pixel will have its
04:45
label and edit gun will decide which
04:48
label to edit instead of which pixel
04:50
directly in the latent space and we
04:52
construct the image modifying only the
04:55
editing region and voila by connecting a
04:58
generated image with a segmentation map
05:00
edit gun allows you to edit this map as
05:03
you wish and apply these modifications
05:05
to the image creating a new version of
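And here is a hedged sketch of what the editing step itself could look like under the same toy assumptions: starting from the image's latent code, we optimize it so that the generated segmentation matches the user's sketch inside the edited region, while the generated pixels outside that region are pushed to stay identical to the original image.

```python
import torch

LATENT_DIM, NUM_CLASSES, SIZE = 512, 6, 64
# Toy stand-ins for the frozen image generator and the trained segmentation branch.
image_generator = torch.nn.Linear(LATENT_DIM, 3 * SIZE * SIZE)
seg_branch = torch.nn.Linear(LATENT_DIM, NUM_CLASSES * SIZE * SIZE)
for p in list(image_generator.parameters()) + list(seg_branch.parameters()):
    p.requires_grad = False

w0 = torch.randn(1, LATENT_DIM)                              # latent code of the original image
original = image_generator(w0).view(1, 3, SIZE, SIZE).detach()

edited_seg = torch.randint(0, NUM_CLASSES, (1, SIZE, SIZE))  # the user's sketched segmentation
region = torch.zeros(1, 1, SIZE, SIZE)                       # mask of the pixels being edited
region[..., 20:40, 20:40] = 1.0

w = w0.clone().requires_grad_(True)
optimizer = torch.optim.Adam([w], lr=0.01)
for _ in range(100):
    image = image_generator(w).view(1, 3, SIZE, SIZE)
    logits = seg_branch(w).view(1, NUM_CLASSES, SIZE, SIZE)
    # Inside the edited region: make the generated segmentation match the sketch.
    per_pixel = torch.nn.functional.cross_entropy(logits, edited_seg, reduction="none")
    seg_loss = (per_pixel * region.squeeze(1)).mean()
    # Outside the edited region: keep the generated image identical to the original.
    keep_loss = (((image - original) ** 2) * (1.0 - region)).mean()
    loss = seg_loss + 10.0 * keep_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

edited_image = image_generator(w).view(1, 3, SIZE, SIZE)     # only the sketched region should change
```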
05:08
course after training with these
05:09
examples it works with unseen images and
05:12
like all guns the results are limited to
05:14
the kind of images it was trained with
05:16
so you cannot use this model on images
05:18
of cats if you trained it with images of
05:21
cars still it's quite impressive and i
05:23
love how researchers try to provide ways
05:26
to play with gans intuitively like using
05:28
sketches instead of parameters the code
05:31
isn't available for the moment but it
05:33
will be available soon and i'm excited
05:34
to try it out this was just an overview
05:37
of this amazing new paper and i will
05:39
strongly invite you to read their paper
05:41
for a deeper technical understanding let
05:43
me know what you think and i hope you've
05:44
enjoyed this video as much as i enjoyed
05:47
learning about this new model thank you
05:49
once again to weights and biases for
05:50
sponsoring the video and to you that is
05:52
still watching see you next week with a
05:55
very special and exciting video about
05:57
the subject i love
05:58
[Music]
06:11
you



Written by whatsai | I explain Artificial Intelligence terms and news to non-experts.
Published by HackerNoon on 2021/12/05