Introducing NVIDIA's EditGAN: Alter Images Instantly via Quick Sketches

Written by whatsai | Published 2021/12/05
Tech Story Tags: ai | nvidia | artificial-intelligence | gans | computer-vision | machine-learning | innovation | ml | web-monetization

TLDR: EditGAN allows you to control any feature from quick drafts, and it will only edit what you want, keeping the rest of the image the same. It is a state-of-the-art sketch-based image editing model built on GANs by NVIDIA, MIT, and the University of Toronto. Watch more results and learn how it works in the video: https://www.youtube.com/watch?time_continue=11&v=bus4OGyMQec&feature=emb_logo

Have you ever dreamed of being able to edit any part of a picture with quick sketches or suggestions? Or maybe you wanted to change specific features, like someone's eyes or eyebrows in an image, or even the wheels of your car? Well, it is not only possible, but it is now easier than ever with this new model called EditGAN, and the results are really impressive!
Control any feature from quick drafts, and it will only edit what you want, keeping the rest of the image the same! It is a state-of-the-art sketch-based image editing model built on GANs by NVIDIA, MIT, and UofT. Watch more results and learn how it works in the video!

Watch the video

References

►Read the full article: https://www.louisbouchard.ai/editgan/
► Paper: Ling, H., Kreis, K., Li, D., Kim, S.W., Torralba, A. and Fidler, S., 2021. EditGAN: High-Precision Semantic Image Editing. In Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS 2021).
► Code and interactive tool (arriving soon): https://nv-tlabs.github.io/editGAN/
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/

Video Transcript

Have you ever dreamed of being able to edit any part of a picture with quick sketches or suggestions? Well, it's not only possible, but it has never been easier than now with this new model called EditGAN, and the results are really impressive. You can basically improve or modify any picture super quickly. Indeed, you can control whatever feature you want from quick drafts, and it will only apply your modifications, keeping the rest of the image the same.

This kind of control has long been sought after and remains extremely challenging to obtain with image synthesis and image editing AI models like GANs. You can see how having extra control is useful for image editing and how it improves the quality of the work you create.

When running machine learning projects, the quality of the work you produce is directly correlated with the quality of your tools and the level of control they give. Thankfully for us, it's easier to take control of your machine learning projects using this episode's sponsor, Weights & Biases. By tracking all of the input hyperparameters, output metrics, and any insights that you or your team have, you know that your work is saved and under control. One aspect that I love for teams is Weights & Biases Reports. I love how I can easily capture all of my project's charts and findings in reports that I can share with my team to get feedback. The charts are interactive and tracked with Weights & Biases, so I know my work is reproducible. I feel lucky that I get to spend time trying to make research look simple and clear for you all, and Weights & Biases is trying to do the same with their platform. I'd love for you to check them out through the first link below, because they are helping me continue making these videos and grow this channel.
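If you have never used it, a minimal sketch of that kind of experiment tracking with the wandb Python library looks like this; the project name, hyperparameters, and "training step" below are made up for illustration.

```python
# A minimal, illustrative experiment-tracking sketch with the wandb library.
# The project name, hyperparameters, and training step are placeholders, not from the paper.
import random

import wandb

def train_one_step() -> float:
    """Placeholder for a real training step; returns a fake loss value."""
    return random.random()

run = wandb.init(
    project="editgan-experiments",                    # hypothetical project name
    config={"learning_rate": 1e-4, "batch_size": 8},  # input hyperparameters, saved with the run
)

for step in range(100):
    loss = train_one_step()
    wandb.log({"loss": loss})                         # output metrics, tracked per step

run.finish()                                          # mark the run as complete
```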
As we said, this fantastic new paper from NVIDIA, the University of Toronto, and MIT allows you to edit any picture with superb control over specific features from sketch inputs. Typically, controlling specific features requires huge datasets and experts who know which features to change within the model to obtain the desired output image with only the wanted features altered. Instead, EditGAN learns from only a handful of labeled example images to match segmentations to images, allowing you to edit images with segmentations, or in other words, with quick sketches. It preserves the full image quality while allowing a level of detail and freedom never achieved before. This is such a great jump forward, but what's even cooler is how they achieve it, so let's dive a bit deeper into their model.
02:29
dive a bit deeper into their model first
02:31
the model uses talgen 2 to generate
02:34
images which is the best image
02:36
generation model available at the time
02:38
of the publication and is widely used in
02:40
research i won't dive into the details
02:42
of this model since i already covered it
02:44
in numerous videos with different
02:46
applications if you'd like to learn more
02:48
about it instead i will assume you have
02:50
a basic knowledge of what style gun 2
02:52
does take an image encode it into a
02:55
condensed subspace and use a type of
02:58
model called a generator to transform
03:00
this encoded subspace into another image
03:03
this also works using directly encoded
03:05
information instead of encoding an image
03:08
to obtain this information what's
03:10
important here is the generator as i
03:12
said it will take information from a
03:14
subspace often referred to as latent
03:17
space where we have a lot of information
03:19
about our image and its features but the
03:22
space is multi-dimensional and we can
03:24
hardly visualize it the challenge is to
03:26
identify which part of the subspace is
03:28
responsible for reconstructing which
03:30
feature in the image this is where
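To make that latent-space picture concrete, here is a minimal sketch, not the authors' code, of how a StyleGAN2-style generator is typically used: decode a latent code into an image, and optionally "encode" an existing photo by optimizing a latent code until the generator reproduces it (GAN inversion). The toy Generator below is a hypothetical stand-in for the real network.

```python
import torch

# Toy stand-in for a pretrained StyleGAN2-style generator: it maps a latent code to an image.
# In the real model this is a deep convolutional network; a linear layer is enough to show the idea.
class Generator(torch.nn.Module):
    def __init__(self, latent_dim: int = 512, size: int = 64):
        super().__init__()
        self.size = size
        self.net = torch.nn.Linear(latent_dim, 3 * size * size)

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        return self.net(w).view(-1, 3, self.size, self.size)

generator = Generator()

# 1) Generate an image directly from a latent code (no input image needed).
w = torch.randn(1, 512)          # a point in the latent space
image = generator(w)             # the generator decodes it into an image

# 2) "Encode" an existing image by optimization (GAN inversion): find the latent code
#    whose generated image matches the target. Every editable feature lives in that code.
target = torch.rand(1, 3, 64, 64)                # pretend this is the photo we want to edit
w_opt = torch.randn(1, 512, requires_grad=True)
optimizer = torch.optim.Adam([w_opt], lr=0.01)
for _ in range(200):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(generator(w_opt), target)
    loss.backward()
    optimizer.step()
```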
This is where EditGAN comes into play, not only telling you which part of the subspace does what, but also allowing you to edit it automatically using another input: a sketch that you can easily draw. Indeed, it will encode your image, or simply take a specific latent code, and generate both the segmentation map of the picture and the picture itself. By training a model to do that, both the segmentations and the images live in the same subspace, and this allows for control of only the desired features without you having to do anything else: you simply need to change the segmentation image, and the other will follow. The training involves only this new segmentation generation; the StyleGAN generator stays fixed for the original image. This allows the model to understand and link the segmentations to the same subspace the generator needs to reconstruct the image. Then, if trained correctly, you can simply edit the segmentation and it will change the image accordingly.
04:28
image accordingly edit can will
04:30
basically assign each pixel of your
04:32
image to a specific class such as head
04:34
ear eye etc and control these classes
04:38
independently using masks covering the
04:40
pixels of other classes within the
04:42
latent space so each pixel will have its
04:45
label and edit gun will decide which
04:48
label to edit instead of which pixel
04:50
directly in the latent space and we
04:52
construct the image modifying only the
04:55
editing region and voila by connecting a
04:58
generated image with a segmentation map
05:00
edit gun allows you to edit this map as
05:03
you wish and apply these modifications
05:05
to the image creating a new version of
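And here is a hedged sketch of what the editing step itself could look like under the same toy assumptions: starting from the image's latent code, we optimize it so that the generated segmentation matches the user's sketch inside the edited region, while the generated pixels outside that region are pushed to stay identical to the original image.

```python
import torch

LATENT_DIM, NUM_CLASSES, SIZE = 512, 6, 64
# Toy stand-ins for the frozen image generator and the trained segmentation branch.
image_generator = torch.nn.Linear(LATENT_DIM, 3 * SIZE * SIZE)
seg_branch = torch.nn.Linear(LATENT_DIM, NUM_CLASSES * SIZE * SIZE)
for p in list(image_generator.parameters()) + list(seg_branch.parameters()):
    p.requires_grad = False

w0 = torch.randn(1, LATENT_DIM)                              # latent code of the original image
original = image_generator(w0).view(1, 3, SIZE, SIZE).detach()

edited_seg = torch.randint(0, NUM_CLASSES, (1, SIZE, SIZE))  # the user's sketched segmentation
region = torch.zeros(1, 1, SIZE, SIZE)                       # mask of the pixels being edited
region[..., 20:40, 20:40] = 1.0

w = w0.clone().requires_grad_(True)
optimizer = torch.optim.Adam([w], lr=0.01)
for _ in range(100):
    image = image_generator(w).view(1, 3, SIZE, SIZE)
    logits = seg_branch(w).view(1, NUM_CLASSES, SIZE, SIZE)
    # Inside the edited region: make the generated segmentation match the sketch.
    per_pixel = torch.nn.functional.cross_entropy(logits, edited_seg, reduction="none")
    seg_loss = (per_pixel * region.squeeze(1)).mean()
    # Outside the edited region: keep the generated image identical to the original.
    keep_loss = (((image - original) ** 2) * (1.0 - region)).mean()
    loss = seg_loss + 10.0 * keep_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

edited_image = image_generator(w).view(1, 3, SIZE, SIZE)     # only the sketched region should change
```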
05:08
course after training with these
05:09
examples it works with unseen images and
05:12
like all guns the results are limited to
05:14
the kind of images it was trained with
05:16
so you cannot use this model on images
05:18
of cats if you trained it with images of
05:21
cars still it's quite impressive and i
05:23
love how researchers try to provide ways
05:26
to play with gans intuitively like using
05:28
sketches instead of parameters the code
05:31
isn't available for the moment but it
05:33
will be available soon and i'm excited
05:34
to try it out this was just an overview
05:37
of this amazing new paper and i will
05:39
strongly invite you to read their paper
05:41
for a deeper technical understanding let
05:43
me know what you think and i hope you've
05:44
enjoyed this video as much as i enjoyed
05:47
learning about this new model thank you
05:49
once again to weights and biases for
05:50
sponsoring the video and to you that is
05:52
still watching see you next week with a
05:55
very special and exciting video about
05:57
the subject i love
05:58
[Music]
06:11
you



Written by whatsai | I explain Artificial Intelligence terms and news to non-experts.
Published by HackerNoon on 2021/12/05