With this new training method developed by NVIDIA, you can train a powerful generative model with one-tenth of the images! This makes many computer vision applications possible even when we do not have access to that many images.

Watch the video

References

The complete article: https://www.louisbouchard.ai/nvidia-ada/
The paper covered: https://arxiv.org/abs/2006.06676
GitHub with code: https://github.com/NVlabs/stylegan2-ada
What are GANs? | Introduction to Generative Adversarial Networks | Face Generation & Editing: https://youtu.be/ZnpZsiy_p2M

Video Transcript

00:00 These types of models that generate such realistic images in different applications are called GANs, and they typically need thousands and thousands of training images. Now, with this new training method developed by NVIDIA, you can have the same results with one-tenth of the images, making possible many applications that do not have access to that many images. Let's see how they achieve that.

00:30 This is What's AI, and I share artificial intelligence news every week. If you are new to the channel and want to stay up to date, please consider subscribing to not miss any further news.

00:40 This new paper covers a technique for training a GAN architecture. GANs are used in many applications related to computer vision where we want to generate a realistic transformation of an image following a specific style. If you are not familiar with how GANs work, I definitely recommend you watch the video I made explaining them before continuing this one.

01:01 As you know, GAN architectures are trained in an adversarial way, meaning that two networks are training at the same time: one training to generate a transformed image from the input, the generator, and the other one training to differentiate the generated images from the training images' ground truths. These ground truths are just the transformation results we would like to achieve for each input image. Then, we optimize both networks at the same time, making the generator better and better at generating images that look real.

01:32 But in order to produce these great and realistic results, we need two things: a training dataset composed of thousands and thousands of images, and stopping the training before overfitting. Overfitting during GAN training means that our discriminator's feedback becomes meaningless and the generated images only get worse. It happens past a certain point when you train your network too long for your amount of data, and the quality only degrades, as you can see happening here after the black dots. These are the problems NVIDIA tackled with this paper. They realized that both are basically the same problem, solvable with one solution, and proposed a method they call adaptive discriminator augmentation.

02:18 Their approach is quite simple in theory, and you can apply it to any GAN architecture you already have without changing anything. As you may know, in most areas of deep learning, we perform what we call data augmentation to fight overfitting. In computer vision, it often takes the form of applying transformations to the images during the training phase to multiply our quantity of training data. These transformations can be anything from applying a rotation, adding noise, or changing the colors, modifying our input image to create a unique version of it and making our network train on a far more diverse dataset without having to create or find more images.
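To make this concrete, here is a minimal sketch of such an augmentation pipeline in TensorFlow (the framework the linked code uses). The specific transformations and parameter values are illustrative choices of mine, not the paper's:

```python
import tensorflow as tf

def augment(image):
    """Apply a few random transformations to one training image.

    `image` is a float32 tensor in [0, 1] with shape (H, W, 3).
    """
    # Geometric: random horizontal flip and random 90-degree rotation.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.rot90(image, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    # Color: random brightness and saturation jitter.
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
    # Noise: small additive Gaussian noise.
    image = image + tf.random.normal(tf.shape(image), stddev=0.02)
    return tf.clip_by_value(image, 0.0, 1.0)

# Every epoch, the network then sees a fresh random variant of each image,
# assuming `dataset` is a tf.data.Dataset of float32 images:
# dataset = dataset.map(augment)
```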
02:59 Unfortunately, this cannot easily be applied to a GAN architecture, since the generator would simply learn to generate images that follow these same augmentations. This is what NVIDIA's team achieved: they found a way to use these augmentations to prevent the model from overfitting while ensuring that none of the transformations leak into the generated images.

03:20 They basically apply this set of image augmentations to all images shown to the discriminator, with a chosen probability for each transformation to randomly occur, and evaluate the discriminator's performance on these modified images. This high number of transformations, all applied randomly, makes it very unlikely that the discriminator ever sees an unchanged image. The generator, of course, is trained and guided to generate only clean images, without any transformations. They concluded that this method of training a GAN architecture with augmented data shown to the discriminator works only if each transformation's occurrence probability stays below 80%. The higher this probability, the more augmentations are applied, and the more diverse a training dataset you get.

04:08 They found that while this solved the problem of the limited number of training images, the overfitting issue remained, appearing at different times depending on your initial dataset size. This is why they designed an adaptive way of applying the augmentation. Instead of introducing another hyperparameter to set the ideal augmentation probability, they control the augmentation strength during training: it starts at 0, and its value is adjusted iteratively based on the difference between the training and validation sets, which indicates whether overfitting is happening.

04:44 This validation set is just a separate set of the same type of images that the network is not trained on; it only needs to be made of images that the discriminator hasn't seen before. It is used to measure the quality of our results and to quantify how far our network diverges, quantifying overfitting at the same time.
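In code, the core mechanism looks something like this minimal sketch (my own simplified illustration, not the official NVlabs implementation; `augment_batch` stands in for the paper's full pipeline of geometric, color, and noise transformations, with a single flip as a placeholder):

```python
import tensorflow as tf

def augment_batch(images, p):
    """Randomly augment each image in the batch with probability p.

    Placeholder for the paper's full augmentation pipeline: here we
    only apply a horizontal flip when an image is selected.
    """
    mask = tf.random.uniform([tf.shape(images)[0]]) < p
    flipped = tf.image.flip_left_right(images)
    return tf.where(mask[:, None, None, None], flipped, images)

def discriminator_step(discriminator, real_images, fake_images, p):
    # Both the real and the generated images are augmented before the
    # discriminator sees them, so it can no longer memorize the exact
    # training images, while the generator is still trained to produce
    # clean, un-augmented images.
    real_scores = discriminator(augment_batch(real_images, p))
    fake_scores = discriminator(augment_batch(fake_images, p))
    return real_scores, fake_scores
```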
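The adaptive part can then be sketched as a simple feedback loop on the augmentation probability `p`. The paper derives its overfitting signal from discriminator outputs; the particular signal, target, and step size below are simplified placeholders of mine, not the paper's tuned values:

```python
import tensorflow as tf

p = tf.Variable(0.0)  # augmentation probability, shared by all transformations

def update_p(p, real_scores, step=0.01, target=0.6):
    """Nudge the augmentation probability based on an overfitting signal.

    The signal here is the fraction of real images the discriminator
    rates as real: if it grows too confident on the training data (a
    symptom of overfitting), p is increased; otherwise it is decreased.
    """
    signal = tf.reduce_mean(tf.cast(real_scores > 0.0, tf.float32))
    # Keep p in [0, 0.8] so the augmentations never start leaking
    # into the generated images.
    p.assign(tf.clip_by_value(p + step * tf.sign(signal - target), 0.0, 0.8))
    return p
```

In the paper, this kind of adjustment is performed periodically throughout training rather than being set once as a fixed hyperparameter, which is what lets the same setup work across very different dataset sizes.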
05:06 Here you can see the results of this adaptive discriminator augmentation for multiple training set sizes on the FFHQ dataset. Here we use the FID measure, which you can see getting better and better over time, never reaching the overfitting regime where it starts to only get worse. The FID, or Fréchet Inception Distance, is basically a measure of the distance between the distributions of generated and real images: it compares feature statistics of the two sets of images and thus measures the quality of the generated samples. The lower it is, the better our results. The FFHQ dataset contains 70,000 high-quality faces taken from Flickr and was created as a benchmark for generative adversarial networks.

05:50 And indeed, they successfully matched StyleGAN2's results with an order of magnitude fewer training images, as you can see here, where the results are plotted for 1,000 to 140,000 training examples, again using this same FID measure on the FFHQ dataset.

06:08 Of course, the code is also completely available and easy to plug into your own GAN architecture using TensorFlow. Both the code and the paper are linked in the description if you would like to implement this in your own code or get a deeper understanding of the technique by reading the paper.

06:23 This paper was just published at NeurIPS 2020, alongside another announcement by NVIDIA: a new program called the Applied Research Accelerator Program. Their goal is to support research projects that make a real-world impact through deployment into GPU-accelerated applications adopted by commercial and government organizations, granting hardware, funding, technical guidance, support, and more to researchers. You should definitely give it a look if it fits your current needs. I linked it in the description of the video as well.

06:59 Please leave a like if you made it this far in the video, and since over 90 percent of you watching are not yet subscribed, consider subscribing to the channel to not miss any further news, clearly explained.

07:11 Thank you for watching!