Zack Thoutt

@zthoutt

What are Creative Adversarial Networks (CANs)?

September 23rd 2017
Artwork generated by CAN

One of the hardest things to teach a computer is to think creatively. Computers are extremely good at doing exactly what we tell them to do, and doing it quickly, but creativity is an abstract concept, and teaching it to machines has proven to be a difficult machine learning challenge.

In June 2017, a research paper out of Rutgers introduced the idea of Creative Adversarial Networks (CANs). As you can see from the images above, the CAN they trained did an exceptional job of creating artwork that both looks like it was made by a real artist and is unique rather than a close copy of artwork that already exists.

These are probably the most convincing results so far from a machine attempting to think creatively.

CANs are GANs that can think creatively

CANs are based on Generative Adversarial Networks (GANs), which were introduced by Ian Goodfellow and his colleagues in 2014. To understand CANs, you first need to understand GANs.

GANs are a type of neural network that actually consists of two neural networks — a generator and a discriminator.

The discriminator’s job is to take images as input and determine whether they are real or fake (i.e. created by humans or by the generator).

The generator’s job is to generate new images that trick the discriminator into thinking they are real.

Fed a mixture of fake images made by the generator and real images, the discriminator learns to recognize patterns that help it classify each image as real or fake. At the same time, the generator gets feedback about which images trick the discriminator best and how it can alter its strategy to trick it even better.

The competition between the discriminator and generator pushes them both to become better at their job, and the results the generator outputs can look surprisingly real when GANs are trained correctly.
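To make the two competing objectives concrete, here is a minimal sketch of the losses each network tries to minimize. It is not code from the paper: the function names and the example probabilities are made up for illustration, and the generator loss uses the common "non-saturating" form (score my fakes as real), which is an assumption about how the GAN is trained.

```python
import math

def bce(p, label):
    """Binary cross-entropy for one predicted probability p and a 0/1 label."""
    eps = 1e-12  # guards against log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(p_real_images, p_fake_images):
    """Mean loss when real images should score 1 and generated images 0."""
    losses = [bce(p, 1) for p in p_real_images]
    losses += [bce(p, 0) for p in p_fake_images]
    return sum(losses) / len(losses)

def generator_loss(p_fake_images):
    """Mean loss when the generator wants its images scored as real (1)."""
    return sum(bce(p, 1) for p in p_fake_images) / len(p_fake_images)

# Hypothetical discriminator outputs: probability that each image is real.
p_on_real = [0.9, 0.8]
p_on_fake = [0.2, 0.1]

print(discriminator_loss(p_on_real, p_on_fake))  # small: the discriminator is winning
print(generator_loss(p_on_fake))                 # large: the generator is being caught
```

In real training, each network's weights are nudged to lower its own loss, so an improvement on one side raises the loss on the other. That tug of war is the competition described above.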

CANs are architected in almost the same way as GANs, but with one key addition that allows the generator to “think” creatively…

The discriminator still tries to learn how to classify each image as real or fake, but it also learns how to classify images into one of 25 artistic styles (e.g. cubism, abstract, renaissance, realism).

The generator still tries to trick the discriminator into thinking the images it’s generating are real, but it also tries to make it difficult for the discriminator to classify its images into one of the 25 artistic styles.
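One way to picture the extra style signal is as a second loss term on each side. The sketch below is an illustration under stated assumptions, not necessarily the exact formulation in the Rutgers paper: the function names are hypothetical, and "hard to classify" is modeled as the cross-entropy between a uniform target and the discriminator's predicted style distribution, which is smallest when all 25 styles look equally likely.

```python
import math

NUM_STYLES = 25  # number of style classes mentioned in the article

def log_safe(p):
    """Natural log with the input clamped away from zero."""
    return math.log(min(max(p, 1e-12), 1.0))

def style_classification_loss(style_probs, true_style):
    """Discriminator side: cross-entropy against a real image's labeled style."""
    return -log_safe(style_probs[true_style])

def style_ambiguity_loss(style_probs):
    """Generator side: cross-entropy between a uniform target and the
    predicted style distribution. Lowest when every style is equally likely."""
    return -sum(log_safe(p) for p in style_probs) / len(style_probs)

def can_generator_loss(p_real, style_probs):
    """Fool the real/fake output AND keep the style output undecided."""
    return -log_safe(p_real) + style_ambiguity_loss(style_probs)

# Two hypothetical generated images that fool the real/fake output equally well.
uniform = [1.0 / NUM_STYLES] * NUM_STYLES          # no style stands out
confident = [0.76] + [0.01] * (NUM_STYLES - 1)     # clearly one known style

print(can_generator_loss(0.8, uniform) < can_generator_loss(0.8, confident))  # True
```

With the real/fake score held fixed, the image that cannot be pinned to a single style gets the lower generator loss, which is the extra pressure a CAN adds on top of a plain GAN.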

Why do CANs work?

In order to understand why adding the classification of images into styles of art enables the generator to think creatively, we need a concrete definition of creativity that a machine can mimic.

CANs are motivated by the theory that creativity is perceived by viewers when a given piece of art is unique, but not too far out there. I think the researchers at Rutgers described this dynamic well in their paper.

“Too little arousal potential is considered boring, and too much activates the aversion system, which results in negative response.”

Let’s circle back and think about how this relates to the CAN architecture. Because the generator is rewarded for producing images that the discriminator can’t easily classify into one of the artistic styles, it is pushed to generate images that are unique (creative). At the same time, the generator still needs to trick the discriminator into thinking the images are real, so it can’t generate images that are too far out there and obviously different.

In this way, CANs simulate this definition of how we view creativity in art.

Art viewers can hardly tell the difference

The table above compares how a sample of human viewers rated four sets of artwork. The DCGAN images were created by a standard GAN (no classification of images by artistic style to enable it to think creatively). These images look like real art, but they closely mimic defined art styles. The network that made them isn’t thinking creatively.

The CAN set consists of images generated by a Creative Adversarial Network.

Both the Abstract Expressionist and Art Basel 2016 datasets are collections of modern artwork. The Abstract Expressionist dataset is made up of images created between 1945 and 2007; the Art Basel 2016 dataset is a collection of images unveiled at Art Basel 2016, a flagship contemporary art show.

Impressively, the images from the CAN dataset ranked highest in Novelty, Surprising, Ambiguity, and Complexity. They were also significantly better than the DCGAN’s images at deceiving viewers into thinking they were created by a human.

Furthermore, the creators of CANs argue that some or most of the remaining difference between the CAN and Abstract Expressionist image sets is a result of the CAN thinking creatively. The images in the Abstract Expressionist dataset are familiar to art viewers since they are older and fit into defined art styles. Contrast that with the Art Basel 2016 image set, which humans actually found toughest to label correctly as real or fake.

You could argue that humans found the Art Basel 2016 images tougher to label because they are more creative by our definition. By extension, it’s possible that CANs should ideally create images that are more difficult for humans to classify as real or fake than images from familiar art styles.

Either way, CANs are a huge leap for machines thinking creatively and participating in the arts.

If you want to read the entire CAN research paper from Rutgers, you can find it here.

Call to action

I’m currently on a quest to discover how to create side projects that generate passive income. If you want to stay up-to-date with my writing, ideas, and projects, you can find more info and links to all of my social channels on my personal website:

zackthoutt.com
