The idea is pretty simple — make a neural network generate stunning visual effects in sync with the music. This is totally inspired by NVIDIA’s face generator. For those of you who are not aware, NVIDIA released a great research paper and the code for their Generative Adversarial Network (GAN) in 2017. It was the first GAN able to generate high-resolution, high-fidelity images of celebrities. Here’s how it looks.
I won’t explain here what a GAN is or how it works — there are plenty of great tutorials and explanations on the web.
I made a few attempts at this. First, I built my own clone of NVIDIA’s network with MXNet as the neural framework, using a few modern techniques that were unavailable at the time (like the relativistic GAN). It didn’t work as well as I expected. If you’d like to check it out, head over to my GitHub.
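For the curious, here is a minimal sketch of the relativistic average GAN objective mentioned above, written in plain NumPy. This is not my training code — just the loss functions, assuming `c_real` and `c_fake` are the critic’s raw outputs (logits) on real and generated batches.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def relativistic_d_loss(c_real, c_fake):
    """Discriminator loss for the relativistic average GAN:
    real logits should exceed the average fake logit, and vice versa."""
    return (softplus(-(c_real - c_fake.mean())).mean()
            + softplus(c_fake - c_real.mean()).mean())

def relativistic_g_loss(c_real, c_fake):
    # The generator optimizes the symmetric objective.
    return (softplus(c_real - c_fake.mean()).mean()
            + softplus(-(c_fake - c_real.mean())).mean())
```

The intuition: instead of asking “is this sample real?”, the critic asks “is this sample more realistic than the average of the other class?”, which tends to stabilize training.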
Anyway, the easy way is the way to go, right? So I came back to the original NVIDIA code, and instead of celebrity faces I just dropped in part of the WikiArt archives — I used abstract expressionist paintings from the 80s and 90s as the training set. That made sense in terms of ease of reproduction — there’s no need for photorealism or anything like it. I added some random cropping, flipping, and color adjustments to augment the dataset of about 2,500 images, and left it to train for about 5 days.
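The augmentations boil down to a few lines. Here is a rough sketch in plain NumPy — the crop size and jitter strength are made-up values for illustration, not the ones from my actual pipeline:

```python
import numpy as np

def augment(img, crop=224, rng=np.random):
    """img: HxWx3 float array in [0, 1]. Returns a randomly
    cropped, flipped, and color-jittered copy."""
    h, w, _ = img.shape
    # Random crop to crop x crop.
    top = rng.randint(0, h - crop + 1)
    left = rng.randint(0, w - crop + 1)
    out = img[top:top + crop, left:left + crop]
    # Random horizontal flip.
    if rng.rand() < 0.5:
        out = out[:, ::-1]
    # Simple per-channel color jitter (+/- 5% per channel).
    out = out * (1 + 0.1 * (rng.rand(3) - 0.5))
    return np.clip(out, 0.0, 1.0)

img = np.random.rand(256, 256, 3)
print(augment(img).shape)  # (224, 224, 3)
```

Cheap tricks like these effectively multiply a small dataset many times over, which matters a lot when you only have ~2,500 images.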
The result came out pretty well; the next question was how to sync the crazy art with the music. There are definitely lots of better solutions out there, but mine works too. I just applied a fast Fourier transform, extracted 512 frequencies, and used them as the latent vector input to the neural net. Here’s the result.