With this new training method developed by NVIDIA, you can train a powerful generative model with one-tenth of the images! This makes many computer vision applications possible even when we do not have access to large amounts of data.
The complete article: https://www.louisbouchard.ai/nvidia-ada/
The paper covered:► https://arxiv.org/abs/2006.06676
GitHub with code:► https://github.com/NVlabs/stylegan2-ada
What are GANs? | Introduction to Generative Adversarial Networks | Face Generation & Editing:►
[00:00] These types of models that generate such realistic images in different applications are called GANs, and they typically need thousands and thousands of training images. Now, with this new training method developed by NVIDIA, you can get the same results with one-tenth of the images, making possible many applications that do not have access to that much data. Let's see how they achieve that.
[00:30] This is What's AI, and I share artificial intelligence news every week. If you are new to the channel and want to stay up to date, please consider subscribing to not miss any further news.
[00:40] This new paper covers a technique for training a GAN architecture. GANs are used in many applications related to computer vision where we want to generate a realistic transformation of an image following a specific style. If you are not familiar with how GANs work, I definitely recommend watching the video I made explaining them before continuing with this one.
[01:01] As you know, GAN architectures are trained in an adversarial way, meaning that there are two networks training at the same time: one trained to generate a transformed image from the input, the generator, and the other trained to differentiate the generated images from the ground-truth training images. These ground truths are just the transformation results we would like to achieve for each input image. We then optimize both networks at the same time, making the generator better and better at generating images that look real. But in order to produce these great, realistic results, we need two things: a training dataset composed of thousands and thousands of images, and stopping the training before overfitting.
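To make this adversarial setup concrete, here is a minimal, illustrative sketch of one GAN training step in PyTorch. The official code linked above uses TensorFlow, and these tiny fully connected networks and the unconditional (noise-to-image) setup are my own simplifications, not the paper's models:

```python
import torch
import torch.nn as nn

# Toy networks; real GANs use much deeper convolutional architectures.
latent_dim = 64
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):
    """One adversarial step. `real`: (batch, 784) tensor of training images."""
    batch = real.size(0)
    z = torch.randn(batch, latent_dim)
    fake = G(z)

    # 1) Discriminator: push real images toward label 1, generated toward 0.
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    d_loss.backward()
    opt_d.step()

    # 2) Generator: try to fool the discriminator into predicting "real".
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# Usage with random stand-in data:
print(train_step(torch.randn(32, 784)))
```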
[01:47] Overfitting during GAN training means that our discriminator's feedback becomes meaningless and the generated images only get worse. It happens past a certain point, when you train your network for too long given your amount of data, and from there the quality only degrades, as you can see happening here after the black dots. These are the problems NVIDIA tackled with this paper. They realized that these are basically the same problem, and that one solution could fix both: they proposed a method they call adaptive discriminator augmentation.
[02:18] Their approach is quite simple in theory, and you can apply it to any GAN architecture you already have without changing anything. As you may know, in most areas of deep learning we perform what we call data augmentation to fight against overfitting. In computer vision, it often takes the form of applying transformations to the images during the training phase to multiply our quantity of training data. These transformations can be anything, from applying a rotation, adding noise, changing the colors, etc., modifying the input image to create a unique version of it and letting the network train on a much more diverse dataset without having to create or find more images.
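For instance, a typical augmentation pipeline in PyTorch's torchvision might look like the sketch below; the specific transforms and parameter values are illustrative choices of mine, not the ones used in the paper:

```python
import torch
from torchvision import transforms

# Each transform fires randomly, so the network rarely sees the
# exact same image twice across epochs.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    # Additive Gaussian noise as a simple lambda transform.
    transforms.Lambda(lambda x: x + 0.05 * torch.randn_like(x)),
])

# augmented = augment(pil_image)  # a fresh random variant on every call
```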
[02:59] Unfortunately, this cannot be applied so easily to a GAN architecture, since the generator will learn to generate images that follow these same augmentations.
[03:08] This is what NVIDIA's team has done: they found a way to use these augmentations to prevent the model from overfitting while ensuring that none of the transformations leak onto the generated images. They basically apply a set of image augmentations to every image shown to the discriminator, with a chosen probability p for each transformation to randomly occur, and evaluate the discriminator's performance using these modified images. This high number of transformations, all applied randomly, makes it very unlikely that the discriminator ever sees even one unchanged image. Of course, the generator is trained and guided to generate only clean images, without any transformations. They concluded that this way of training a GAN architecture, with augmented data shown to the discriminator, only works if each transformation's occurrence probability stays below 80%: the higher it is, the more augmentations are applied, and thus the more diverse a training dataset you will have. A sketch of this augmented discriminator pass follows below.
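Here is a minimal sketch of that idea, again in PyTorch rather than the paper's TensorFlow code. The two toy augmentations below are simplified stand-ins for the paper's pipeline of geometric and color transformations; the key point is that the discriminator only ever sees augmented images, real or generated alike:

```python
import torch

def augment_pipeline(images, p):
    """Toy stand-in for the paper's augmentation pipeline.

    Each augmentation fires independently with probability p, per image,
    so for high p the discriminator almost never sees a clean image.
    `images`: (batch, 784) tensors, matching the toy GAN sketch above.
    """
    batch = images.size(0)
    # Pixel-value sign flip, applied with probability p.
    flip = (torch.rand(batch, 1) < p).float()
    images = images * (1.0 - 2.0 * flip)
    # Additive Gaussian noise, also applied with probability p.
    noisy = (torch.rand(batch, 1) < p).float()
    return images + noisy * 0.1 * torch.randn_like(images)

def d_logits(D, images, p):
    # Real and generated images go through the same random augmentation,
    # so the discriminator cannot tell whether an artifact comes from the
    # generator or from the augmentation, and nothing leaks into G.
    return D(augment_pipeline(images, p))
```

In the earlier training step, each `D(real)`, `D(fake.detach())`, and `D(fake)` call would become the corresponding `d_logits(D, ..., p)` call. Because the augmentation is differentiable, the generator's gradients still flow through it, and the generator keeps producing clean images.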
[04:08] They found that while this was solving the problem of a limited amount of training images, there was still the overfitting issue, which appeared at different times depending on your initial dataset size.
[04:19] This is why they came up with an adaptive way of applying the augmentations. Instead of having yet another hyperparameter to decide the ideal probability for the augmentations to appear, they instead control the augmentation strength during training, starting at 0 and then adjusting its value iteratively based on the difference between the training and validation sets, which indicates whether overfitting is happening or not. This validation set is just a different set of the same type of images, one that the network is not trained on; it only needs to be made of images that the discriminator hasn't seen before. It is used to measure the quality of our results and to quantify the degree of divergence of our network, quantifying overfitting at the same time. The sketch below shows this feedback loop.
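Conceptually, the control loop looks like this. The overfitting signal, the `target` threshold, and the `step` size below are simplified stand-ins of mine, not the exact heuristics and constants from the paper:

```python
def update_augment_probability(p, d_train_logits, d_valid_logits,
                               target=0.6, step=0.005):
    """Nudge the augmentation probability p based on an overfitting signal.

    Simplified heuristic: when the discriminator scores training images
    much higher than held-out validation images, it is memorizing the
    training set, so we raise p; otherwise we lower it.
    """
    overfit_signal = (d_train_logits.mean() - d_valid_logits.mean()).item()
    if overfit_signal > target:
        p = min(p + step, 0.8)  # stay below the leak-free limit of 0.8
    else:
        p = max(p - step, 0.0)  # never below zero
    return p

# During training: p starts at 0 and is updated every few minibatches.
# p = update_augment_probability(p, d_logits_on_train, d_logits_on_valid)
```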
[05:06] Here you can see the results of this adaptive discriminator augmentation for multiple training-set sizes on the FFHQ dataset. Here we use the FID measure, which you can see getting better and better over time, never reaching the overfitting regime where it starts to only get worse.
[05:24] The FID, or Fréchet Inception Distance, is basically a measure of the distance between the feature distributions of generated and real images. It measures the quality of generated image samples: the lower it is, the better our results.
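For reference, the FID fits a Gaussian to the Inception features of each image set and computes the Fréchet distance between the two Gaussians. A small sketch of that final computation, assuming the feature means and covariances have already been extracted with an Inception network:

```python
import numpy as np
from scipy import linalg

def fid(mu_real, sigma_real, mu_fake, sigma_fake):
    """Fréchet distance between two Gaussians fitted to Inception features."""
    diff = mu_real - mu_fake
    # Matrix square root of the covariance product; numerical error can
    # introduce a tiny imaginary component, which we discard.
    covmean = linalg.sqrtm(sigma_real @ sigma_fake)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff @ diff + np.trace(sigma_real + sigma_fake - 2 * covmean)
```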
[05:40] The FFHQ (Flickr-Faces-HQ) dataset contains 70,000 high-quality faces taken from Flickr. It was created as a benchmark for generative adversarial networks.
[05:50] And indeed, they successfully matched StyleGAN2's results with an order of magnitude fewer images used, as you can see here, where the results are plotted for 1,000 to 140,000 training examples, using again this same FID measure on the FFHQ dataset.
[06:08] Of course, the code is also completely available and easy to add to your own GAN architecture; the official implementation uses TensorFlow. Both the code and the paper are linked in the description if you would like to implement this in your own code or get a deeper understanding of the technique by reading the paper.
[06:23] This paper was just published at NeurIPS 2020, alongside another announcement by NVIDIA: a new program called the Applied Research Accelerator Program. Their goal here is to support research projects that make a real-world impact through deployment into GPU-accelerated applications adopted by commercial and government organizations, granting hardware, funding, technical guidance, support, and more to researchers. You should definitely give it a look if that fits your current needs; I linked it in the description of the video as well.
[06:59] Please leave a like if you made it this far into the video, and since over 90 percent of you watching are not subscribed yet, consider subscribing to the channel to not miss any further news, clearly explained. Thank you for watching!