Have you ever dreamed of being able to edit any part of a picture with quick sketches or suggestions? Or maybe you wanted to change specific features like the eyes or eyebrows of someone in an image, or even the wheels of your car? Well, it is not only possible, but it is now easier than ever with this new model called EditGAN, and the results are really impressive! Control any feature from quick drafts, and it will only edit what you want, keeping the rest of the image the same. It is a state-of-the-art sketch-based image editing model built on GANs, by NVIDIA, MIT, and the University of Toronto. Watch more results and learn how it works in the video!

Watch the video

References
► Read the full article: https://www.louisbouchard.ai/editgan/
► Paper: Ling, H., Kreis, K., Li, D., Kim, S.W., Torralba, A. and Fidler, S., 2021. EditGAN: High-Precision Semantic Image Editing. In Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS).
► Code and interactive tool (arriving soon): https://nv-tlabs.github.io/editGAN/
► My Newsletter (a new AI application explained weekly to your inbox!): https://www.louisbouchard.ai/newsletter/

Video Transcript

Have you ever dreamed of being able to edit any part of a picture with quick sketches or suggestions? Well, it's not only possible, but it has never been easier than now with this new model called EditGAN, and the results are really impressive. You can basically improve or modify any picture super quickly. Indeed, you can control whatever feature you want from quick drafts, and it will only apply your modifications, keeping the rest of the image the same. Such control has long been sought after and is extremely challenging to obtain with image synthesis and image editing AI models like GANs. You can see how having extra control is useful for image editing and how it improves the quality of the work you create.

When running machine learning projects, the quality of the work you produce is directly correlated with the quality of your tools and the level of control they give you. Thankfully for us, it's easier to take control of your machine learning projects using this episode's sponsor, Weights & Biases. By tracking all of the input hyperparameters, output metrics, and any insights that you or your team have, you know that your work is saved and under control. One aspect that I love for teams is Weights & Biases Reports: I can easily capture all of my project's charts and findings in reports that I can share with my team to get feedback. The charts are interactive and tracked with Weights & Biases, so I know my work is reproducible. I feel lucky that I get to spend time trying to make research look simple and clear for you all, and that Weights & Biases is trying to do the same with their platform. I'd love for you to check them out with the first link below, because they are helping me continue making these videos and grow this channel.

As we said, this fantastic new paper from NVIDIA, the University of Toronto, and MIT allows you to edit any picture with superb control over specific features from sketch inputs. Typically, controlling specific features requires huge datasets and experts who know which features to change within the model to get the desired output image with only the wanted features changed.
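To give a concrete picture of what changing a feature "within the model" usually looks like, here is a minimal, hypothetical sketch of the classic approach: nudging a latent code along a learned direction and re-generating the image. The toy generator and the smile_direction vector are illustrative assumptions, not EditGAN's method; discovering directions that change exactly one attribute is precisely what normally demands those large datasets and expert knowledge.

```python
import torch

# Toy stand-in for a pretrained GAN generator: latent code w -> RGB image.
# A real StyleGAN2 generator is a large convolutional network; this only
# illustrates the interface and produces no meaningful picture.
torch.manual_seed(0)
projection = torch.randn(512, 3 * 64 * 64)

def generator(w: torch.Tensor) -> torch.Tensor:
    return (w @ projection).tanh().view(-1, 3, 64, 64)

w = torch.randn(1, 512)                   # latent code of some image
smile_direction = torch.randn(1, 512)     # hypothetical "add a smile" direction
smile_direction /= smile_direction.norm()

# Classic latent editing: move the code along the direction and re-generate.
# The hard part is finding a direction that edits one feature and nothing else.
for strength in (0.0, 1.0, 2.0):
    edited = generator(w + strength * smile_direction)
    print(f"strength={strength}: image tensor of shape {tuple(edited.shape)}")
```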
Instead, EditGAN learns through only a handful of examples of labeled images to match segmentations to images, allowing you to edit the images with segmentations, or in other words, with quick sketches. It preserves the full image quality while allowing a level of detail and freedom never achieved before. This is such a great leap forward, but what's even cooler is how they achieve it, so let's dive a bit deeper into their model.

First, the model uses StyleGAN2 to generate images. It was the best image generation model available at the time of publication and is widely used in research, so I won't dive into its details here; I have already covered it in numerous videos with different applications if you'd like to learn more about it. Instead, I will assume you have a basic knowledge of what StyleGAN2 does: take an image, encode it into a condensed subspace, and use a type of model called a generator to transform this encoded subspace into another image. This also works using directly encoded information instead of encoding an image to obtain it. What's important here is the generator. As I said, it takes information from a subspace, often referred to as the latent space, where we have a lot of information about our image and its features, but this space is multi-dimensional and we can hardly visualize it. The challenge is to identify which part of the subspace is responsible for reconstructing which feature in the image.

This is where EditGAN comes into play, not only telling you which part of the subspace does what, but also allowing you to edit it automatically using another input: a sketch that you can easily draw. Indeed, it will encode your image, or simply take a specific latent code, and generate both the segmentation map of the picture and the picture itself. By training a model to do that, both the segmentations and the images live in the same subspace, and this allows for control of only the desired features without you having to do anything else: you simply change the segmentation image and the other will follow. The training happens only on this new segmentation generation, and the StyleGAN generator stays fixed for the original image. This lets the model understand and link the segmentations to the same subspace the generator needs to reconstruct the image. Then, if trained correctly, you can simply edit the segmentation and it will change the image accordingly.

EditGAN basically assigns each pixel of your image to a specific class, such as head, ear, eye, and so on, and controls these classes independently, using masks covering the pixels of the other classes within the latent space. So each pixel has its label, and EditGAN decides which label to edit, instead of which pixel, directly in the latent space, and it reconstructs the image while modifying only the editing region. And voilà! By connecting a generated image with a segmentation map, EditGAN allows you to edit this map as you wish and apply these modifications to the image, creating a new version.
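To pull these pieces together, here is a minimal sketch of the joint modeling idea, assuming a frozen, pretrained generator and a small trainable segmentation branch that share the same latent code. All names, sizes, and architectures here are toy assumptions for illustration; the real model builds on StyleGAN2 and predicts labels from the generator's internal features, which this sketch simplifies away.

```python
import torch
import torch.nn as nn

IMG, CLASSES, W_DIM = 64, 8, 512   # toy sizes, not the paper's

class ToyGenerator(nn.Module):
    """Stand-in for the pretrained image generator: latent code w -> RGB image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(W_DIM, 3 * IMG * IMG), nn.Tanh())

    def forward(self, w):
        return self.net(w).view(-1, 3, IMG, IMG)

class ToySegmentationBranch(nn.Module):
    """Trainable head mapping the same latent code to per-pixel class logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(W_DIM, CLASSES * IMG * IMG)

    def forward(self, w):
        return self.net(w).view(-1, CLASSES, IMG, IMG)

G = ToyGenerator()
for p in G.parameters():          # the image generator stays fixed
    p.requires_grad_(False)
S = ToySegmentationBranch()        # only the segmentation branch is trained

# A "handful of labeled examples": latent codes of a few real images
# (obtained by encoding them) paired with hand-annotated label maps.
w_examples = torch.randn(4, W_DIM)
seg_labels = torch.randint(0, CLASSES, (4, IMG, IMG))

opt = torch.optim.Adam(S.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for step in range(200):
    loss = loss_fn(S(w_examples), seg_labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the generator never changes, whatever the segmentation branch learns is expressed in the generator's own latent space, which is what makes the editing step possible.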
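Continuing the same toy setup, here is a sketch of how an edit could then be applied: the user modifies the label map, and the latent code is optimized so that the predicted segmentation matches the edit inside the edited region while the image stays unchanged everywhere else. The losses, weights, and hand-picked mask below are illustrative assumptions rather than the paper's exact formulation.

```python
import torch.nn.functional as F

# Latent code of the image to edit (in practice obtained by encoding the photo).
w = torch.randn(1, W_DIM, requires_grad=True)
with torch.no_grad():
    img_original = G(w)                                 # image before the edit

seg_edited = torch.randint(0, CLASSES, (1, IMG, IMG))   # user-edited label map (toy)
region = torch.zeros(1, 1, IMG, IMG)                    # pixels the user touched
region[:, :, 20:40, 20:40] = 1.0

opt = torch.optim.Adam([w], lr=5e-2)
for step in range(200):
    img = G(w)
    seg_logits = S(w)
    # The predicted segmentation should match the edit inside the edited region...
    seg_loss = (F.cross_entropy(seg_logits, seg_edited, reduction="none")
                * region.squeeze(1)).mean()
    # ...while the image stays close to the original outside of it.
    keep_loss = (F.mse_loss(img, img_original, reduction="none") * (1 - region)).mean()
    loss = seg_loss + 10.0 * keep_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

edited_image = G(w).detach()   # changed only where the segmentation was edited
```

In the paper, the latent change found for an edit can also be saved as an editing vector and reused on other images, which helps make interactive editing fast.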
Of course, after training with these examples, it works with unseen images, and like all GANs, the results are limited to the kind of images it was trained with, so you cannot use this model on images of cats if you trained it with images of cars. Still, it's quite impressive, and I love how researchers try to provide intuitive ways to play with GANs, like using sketches instead of parameters. The code isn't available for the moment, but it will be soon, and I'm excited to try it out. This was just an overview of this amazing new paper, and I strongly invite you to read it for a deeper technical understanding. Let me know what you think, and I hope you've enjoyed this video as much as I enjoyed learning about this new model. Thank you once again to Weights & Biases for sponsoring the video, and to you who are still watching. See you next week with a very special and exciting video about a subject I love!