Magic mirror

Ever wonder what you would look like if you were a girl?

Imagine this. I jump out of bed and look in a mirror. I am a blond!

You ask: “That is what you would look like as a girl?”

I say: “YES OMG YES YES YES! This is what I’ve always wanted!”

The magic mirror is powered by StarGAN, a unified generative adversarial network for multi-domain image-to-image translation. This post will show you how the model works and how you can build the magic mirror.

Enjoy the YouTube demo here. Complete source code is available on my GitHub page.

StarGAN intro

Image-to-image translation changes a particular aspect of a given image to another, e.g., changing the gender of a person from male to female. This task has seen significant improvements following the introduction of generative adversarial networks (GANs), with results ranging from generating photos from edge maps and changing the seasons of scenery images to reconstructing photos from Monet’s paintings.

[Figure: Images generated with GANs]

Given training data from two different domains, these models learn to translate images from one domain to the other in a unidirectional way. For example, one generative model is trained to translate a person with black hair to blond hair. Such a single model cannot translate “backward”, from blond to black hair in this example. Nor can a single model handle flexible multi-domain image translation tasks, such as a configurable translation of both gender and hair color.

That is where StarGAN stands out: it is a novel generative adversarial network that learns the mappings among multiple domains using only a single generator and a discriminator, training effectively from images of all domains. Instead of learning a fixed translation (e.g., black-to-blond hair), StarGAN’s model takes both the image and the domain information as inputs and learns to flexibly translate the input image into the corresponding domain.
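To make "takes both image and domain information as inputs" a bit more concrete, here is a minimal NumPy sketch of one common way to feed a label vector to a convolutional generator: replicate each label value over the image grid and stack the result onto the image as extra channels. To my understanding this mirrors what the official StarGAN code does, but treat it as an illustration under that assumption; the function name concat_image_and_labels is made up for this post.

```python
import numpy as np


def concat_image_and_labels(image, labels):
    """Stack a label vector onto an image as extra constant channels.

    image:  float array of shape (3, H, W)
    labels: sequence of n values (0 or 1), one per target attribute
    returns an array of shape (3 + n, H, W)
    """
    image = np.asarray(image, dtype=np.float32)
    labels = np.asarray(labels, dtype=np.float32)
    _, height, width = image.shape
    # Replicate each label value over the whole spatial grid: (n,) -> (n, H, W).
    label_maps = np.broadcast_to(
        labels[:, None, None], (labels.size, height, width)
    )
    # The generator then sees image channels and label channels side by side.
    return np.concatenate([image, label_maps], axis=0)


# A dummy 256 x 256 RGB image plus five hypothetical target labels.
image = np.zeros((3, 256, 256), dtype=np.float32)
conditioned = concat_image_and_labels(image, [0, 1, 0, 0, 1])
print(conditioned.shape)  # (8, 256, 256)
```

Because the labels are just extra input channels, changing the target domain only means changing a few numbers, not swapping in a different model.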
The pre-trained StarGAN model consists of two networks, like other GAN models: a generative network and a discriminative network. While only the generative network is needed to build the magic mirror, it is still useful to understand where the complete model comes from.

The generative network takes two pieces of information as input, the original RGB image at 256 x 256 resolution and the target labels, and generates a fake image at the same resolution. The discriminative network learns to distinguish between real and fake images and to classify real images into their corresponding domains.

The pre-trained model we are going to use was trained on the CelebA dataset, which contains 202,599 face images of celebrities, each annotated with 40 binary attributes. The researchers selected seven domains using the following attributes: hair color (black, blond, brown), gender (male/female), and age (young/old).

[Figure: StarGAN]

Building the magic mirror

The researchers of StarGAN have published their code on GitHub, on which our magic mirror project is based. It was also my first time dealing with the PyTorch framework, and so far it’s going well. If you are new to PyTorch like me, you will find it quite easy to get started with, especially if you have experience with another deep learning framework like Keras or TensorFlow.

Only the most basic PyTorch knowledge is required to accomplish the project, such as PyTorch tensors and loading predefined model weights.

Let’s start by installing the framework. In my case, that is on Windows 10, which is officially supported by the latest PyTorch.

To let the magic mirror run in real time with minimal perceivable lag, accelerate the model execution with your gaming PC’s Nvidia graphics card if you have one.

Install CUDA 9 from this link on the Nvidia Developer website.

[Figure: Install CUDA 9]

After that, install PyTorch with CUDA 9.0 support by following the instructions on its official website.
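Once the installation is done, a quick check like the one below can confirm whether PyTorch actually sees the graphics card. This is only a small sketch: pick_device is a helper name I made up, and it falls back to the CPU when PyTorch is missing or no CUDA device is found.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed yet, so there is nothing to accelerate.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Running the generator on:", device)
```

If this prints "cpu" on a machine with an Nvidia card, the CUDA toolkit or the CUDA-enabled PyTorch build is probably not installed correctly.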
[Figure: Install PyTorch]

When PyTorch and the other Python dependencies are installed, we are ready for the code.

To implement a simple real-time face tracking and cropping effect, we are going to use the lightweight CascadeClassifier module from Python’s OpenCV library. This module takes a grayscale image converted from a webcam frame and returns the bounding boxes of the detected faces. If multiple faces are detected in a given frame, we take the “main” face, the one with the largest bounding box area.

Since the StarGAN generative network expects images whose pixel values range from -1 to 1 instead of 0 to 255, we are going to use PyTorch’s built-in image transform utility to handle the image preprocessing.

The generative network subclasses PyTorch’s nn.Module, which means you can call it directly by passing in the input tensors as arguments.

The labels variable is a PyTorch tensor with 5 values, each set to either 0 or 1, to indicate the 5 target labels:

['Black_Hair', 'Blond_Hair', 'Brown_Hair', 'Male', 'Young']

For example, say we want to transform a portrait into a blond-haired young female. The value of labels will be set to [0, 1, 0, 0, 1].

Showing the generated image tensor with cv2’s imshow() function takes a single line of code. Here is the breakdown:

- First, move the image data from GPU to CPU by calling cpu().
- Use the detach() call to detach it from the graph.
- The numpy() call returns the tensor value as a NumPy array.
- The first [0] takes the first image out of the generated batch (even though the batch size is one).
- Swap the axes to turn a (3, 256, 256) shaped array into (256, 256, 3).
- Recover the pixel values from the range -1~1 to 0~1.
- Flip the generated image horizontally with the ::-1 operation.
- Reverse the image channel order from RGB to BGR with the last ::-1 operation, as the cv2.imshow() function expects an image in BGR channel order.
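The steps in this section can be sketched with plain NumPy so they are easy to try without a webcam or a GPU. Note the assumptions: the helper names (largest_face, make_labels, normalize, to_bgr_frame) are mine, the normalization is written by hand in NumPy rather than with PyTorch's transform utility, and the "generated" batch is a dummy array standing in for the output of the generator after cpu(), detach() and numpy().

```python
import numpy as np

ATTRIBUTES = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Male", "Young"]


def largest_face(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    if len(boxes) == 0:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])


def make_labels(hair_color="blond", male=False, young=True):
    """Build the 5-value 0/1 label list in ATTRIBUTES order."""
    hair_index = {"black": 0, "blond": 1, "brown": 2}[hair_color]
    labels = [0] * len(ATTRIBUTES)
    labels[hair_index] = 1
    labels[3] = int(male)
    labels[4] = int(young)
    return labels  # wrap in a PyTorch tensor before calling the generator


def normalize(rgb_uint8):
    """Scale an (H, W, 3) uint8 image from 0..255 to -1..1, channels first."""
    scaled = rgb_uint8.astype(np.float32) / 127.5 - 1.0
    return scaled.transpose(2, 0, 1)


def to_bgr_frame(batch):
    """Turn a (1, 3, H, W) batch in [-1, 1] into a mirrored (H, W, 3) BGR image in [0, 1]."""
    image = batch[0]                   # first image of the batch
    image = image.transpose(1, 2, 0)   # (3, H, W) -> (H, W, 3)
    image = (image + 1) / 2            # [-1, 1] -> [0, 1]
    image = image[:, ::-1, ::-1]       # flip horizontally, then RGB -> BGR
    return np.ascontiguousarray(image)


boxes = [(10, 10, 50, 50), (100, 40, 120, 130)]
print(largest_face(boxes))                 # (100, 40, 120, 130)
print(make_labels("blond", False, True))   # [0, 1, 0, 0, 1]

# Dummy "generated" batch: a pure red image expressed in the -1..1 range.
fake = np.full((1, 3, 256, 256), -1.0, dtype=np.float32)
fake[0, 0] = 1.0
frame = to_bgr_frame(fake)
print(frame.shape)                         # (256, 256, 3)
```

After the channel reversal the red values end up in the last channel, which is where cv2.imshow() expects them.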
All the code is wrapped in a single function, MagicMirror(), which takes several optional arguments.

- videoFile: leave the default value 0 to use the first web camera, or pass in a video file path.
- setHairColor: one of three values, “black”, “blond”, or “brown”.
- setMale: transform into a male? Set to True or False.
- setYoung: transform into a young person? Set to True or False.
- showZoom: defaults to 4; the factor by which to scale up the generated image before showing it on the screen.

Conclusion and Further thought

This tutorial shows you how easy and fun it can be to pick up a new framework like PyTorch and build something interesting with a pre-trained StarGAN network. The generated images might not look super realistic yet, but the StarGAN paper shows that a model jointly trained on both the CelebA and RaFD datasets can generate images with fewer artifacts by leveraging both datasets to improve shared low-level tasks such as facial keypoint detection and segmentation. You can follow along with their official GitHub to download both datasets and train such a model, as long as you have a beefy machine and an extra week to run the training.

Originally published at www.dlology.com.

“If I were a girl” — Magic Mirror by StarGAN
