How it all began / The Landscape
Think of typical, well-studied neural networks (such as image classifiers) as the left hemisphere of neural network technology. With this in mind, it is easy to understand what a Generative Adversarial Network is: a kind of right hemisphere, the one that is claimed to be responsible for creativity.
Generative Adversarial Networks (GANs) are the first step toward neural networks learning creativity. A typical GAN is a neural network trained to generate images on a certain topic, using an image dataset for training and some random noise as a seed. Until recently, images created by GANs were of low quality and limited in resolution. Recent advances by NVIDIA showed that generating photorealistic, high-resolution images is within reach, and NVIDIA published the technology itself in open access.
Examples of GAN images. Some are good, some are bad.
There is a plethora of GAN types, with varying complexity, architectures, and strange acronyms. We are mostly interested here in conditional GANs and in variational autoencoders, a related family of generative models. Conditional GANs are capable not just of mimicking a broad type of image such as “bedroom”, “face”, or “dog”, but also of diving into more specific categories. For example, the Text2Image network is capable of translating a textual image description into the image itself.
By varying the random seed that is concatenated to the “meanings” vector, we are able to produce an infinite number of bird images matching the description.
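To make the seed-plus-meaning idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the three-number `meaning` vector stands in for a real text embedding, and `toy_generator` is just a fixed random linear layer, not a trained GAN or the Text2Image network. It only shows that the description part of the input stays the same while the noise part changes with the seed.

```python
import math
import random


def make_generator_input(meaning, noise_dim, seed):
    """Concatenate a fixed "meaning" vector with seed-dependent Gaussian noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return list(meaning) + noise


def toy_generator(z, out_dim=4, weight_seed=0):
    """Hypothetical stand-in for a trained generator: one fixed linear layer plus tanh."""
    rng = random.Random(weight_seed)  # same weights on every call
    weights = [[rng.uniform(-1.0, 1.0) for _ in z] for _ in range(out_dim)]
    return [math.tanh(sum(w * x for w, x in zip(row, z))) for row in weights]


# Hypothetical embedding of one bird description (made-up numbers).
meaning = [0.2, -0.7, 0.5]

# Same description, different seeds: each seed gives a different output.
for seed in (1, 2, 3):
    z = make_generator_input(meaning, noise_dim=5, seed=seed)
    print(seed, [round(v, 3) for v in toy_generator(z)])
```

In a real conditional GAN the output would be an image tensor and the weights would come from training, but the input is built the same way: description vector first, random noise appended.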
Let’s close our eyes and picture the world in two years. Companies like NVIDIA will push GAN technology to an industry-ready level, just as they did with celebrity face generation. This means that a GAN will be able to generate any image, on demand and on the fly, based on (for example) a textual description. This will render a number of photography- and design-related industries obsolete. Here’s how this will work.
Again, the network is able to generate an infinite number of images by varying the random seed.
And here’s the scary part. Such a network can receive not only a description of the target object it needs to generate, but also a vector describing you, the ad consumer. This vector can hold a very deep description of your personality, web browsing history, recent transactions, and geolocation, so the GAN will generate a one-time, unique ad that fits you perfectly. CTR is going sky high.
By measuring your reactions, the network will adapt and target its ads at you more and more precisely, hitting your soft spots.
So, at the end of the day, we are going to see fully personalized content everywhere on the Internet.
Everyone will see fully custom versions of all content, adapted to the consumer based on their lifestyle, opinions, and history. We all witnessed the rise of this bubble pattern after the latest US elections, and it is going to get worse. GANs will be able to target content precisely to you with no limitations of medium, from image ads up to complex opinions, threads, and publications generated by machines. This will create a constant feedback loop that improves based on your interactions. And there is going to be competition between different GANs: a kind of fully automated war of psychological manipulation, with humanity as the battlefield. The driving force behind this trend is extremely simple: profits.
And this is not a scary doomsday scenario; it is actually happening today.
I have no idea. But surely we need a few things: a broad public discussion about this technology’s inevitable arrival and a backup plan to stop it. So it’s better to start thinking now about how we can fight this process and benefit from it at the same time.
We are not there yet due to some technical limitations. Until recently, images generated by GANs were simply of bad quality and easily spotted as fake. NVIDIA showed that it is actually doable to generate extremely realistic 1024x1024 faces. To move things forward we would need faster and bigger GPUs, more theoretical studies of GANs, more smart hacks around GAN training, more labeled datasets, etc.
Please note: we don’t need new power sources, quantum processors (though they could help), general AI, or some other purely theoretical cool new thing to reach this point. Everything we need is within reach in a few years, and big corporations likely already have these kinds of resources available.
Also, we will need smarter neural networks. I am definitely looking forward to progress in the capsule approach by Hinton et al. And of course, we will be the first to implement this in super-resolution technology, which should benefit heavily from GAN progress.
Let me know what you think.