_Featured_: interpolation, t-SNE projection (with gifs & examples!)

![](https://hackernoon.com/hn-images/1*vEZE5VcjUr5RUbt_OWfR_w.gif)

In the “_Deep Learning bits_” series, we will **not** see how to use deep learning to solve complex problems end-to-end, as we do in [**_A.I. Odyssey_**](https://medium.com/@juliendespois/talk-to-you-computer-with-you-eyes-and-deep-learning-a-i-odyssey-part-2-7d3405ab8be1). We will rather look at different techniques, along with some **examples and applications.** Don’t forget to check out [_Deep Learning bits #1_](https://hackernoon.com/autoencoders-deep-learning-bits-1-11731e200694)!

> **_If you like Artificial Intelligence, make sure to_** [**_subscribe to the newsletter_**](http://eepurl.com/cATXvT) **_to receive updates on articles and much more!_**

### Introduction

[Last time](https://hackernoon.com/autoencoders-deep-learning-bits-1-11731e200694), we saw what autoencoders are and how they work. Today, we will see how they can help us **visualize the data** in some _very_ cool ways. For that, we will work on images, using the Convolutional Autoencoder architecture (_CAE_).

#### What’s the latent space again?

Here’s a quick reminder: an autoencoder is made of two components. The **_encoder_** brings the data from a high-dimensional input down to a **_bottleneck_** layer, where the number of neurons is the smallest. Then, the **_decoder_** takes this encoded input and converts it back to the original input shape — in our case an image. The **_latent space_** is the space in which the data lies in the bottleneck layer.

![](https://hackernoon.com/hn-images/1*op0VO_QK4vMtCnXtmigDhA.png)

Convolutional Encoder-Decoder architecture

The latent space contains a **compressed** representation of the image, which is **the only information** the decoder is allowed to use to try to reconstruct the input **as faithfully as possible**.
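Such an encoder-decoder pair is only a few lines in Keras. Here’s a minimal sketch with a 32-unit bottleneck; the layer sizes are illustrative, not the exact architecture used in the post’s repository:

```python
# Minimal convolutional autoencoder sketch in Keras.
# Layer sizes are illustrative, not the post's exact architecture.
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(28, 28, 1))

# Encoder: shrink the image down to a 32-dimensional latent vector
x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inputs)  # 14x14
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)       # 7x7
latent = layers.Dense(32, activation="relu")(layers.Flatten()(x))               # bottleneck

# Decoder: rebuild the 28x28 image from the latent vector alone
x = layers.Dense(7 * 7 * 32, activation="relu")(latent)
x = layers.Reshape((7, 7, 32))(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)

autoencoder = Model(inputs, outputs)
encoder = Model(inputs, latent)  # reused later to get latent representations
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
```

Keeping a separate `encoder` model is what lets us grab the latent vectors directly for the visualizations below.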
To perform well, the network has to learn to extract the **most relevant** features in the bottleneck.

_Let’s see what we can do!_

### The dataset

We’ll change from the datasets of last time. Instead of looking at [my eyes](https://hackernoon.com/talk-to-you-computer-with-you-eyes-and-deep-learning-a-i-odyssey-part-2-7d3405ab8be1#.scd7s8ej4) or [blue squares](https://hackernoon.com/autoencoders-deep-learning-bits-1-11731e200694#.6qgkt12jm), we will work on probably the _most famous dataset in computer vision:_ the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset of _handwritten digits_. I usually prefer to work with **less conventional** datasets just for diversity, but MNIST is **really convenient** for what we will do today.

**_Note:_** Although MNIST visualizations are _pretty common_ on the internet, the images in this post are 100% generated **from the code,** so you can use these techniques with your own models.

![](https://hackernoon.com/hn-images/1*vuZioaSR5BJSGqoghottfw.png)

MNIST is a labelled dataset of 28x28 images of handwritten digits

### Baseline — Performance of the autoencoder

To understand what kind of features the encoder is capable of extracting from the inputs, we can first look at **reconstructions of images.** If this **sounds familiar**, that’s normal: we already did it last time. However, this step is **necessary** because it sets the baseline for our _expectations_ of the model.

**_Note:_** For this post, the bottleneck layer has only **32 units**, which is some _really_, _really_ brutal dimensionality reduction. If it were an image, it **wouldn’t even be 6x6** pixels.

![](https://hackernoon.com/hn-images/1*oHdV8gmgw0wnCqGrOydzWA.png)

Each digit is displayed next to its blurry reconstruction

We can see that the autoencoder **successfully** reconstructs the digits. The **reconstruction is blurry** because the input is **compressed** at the bottleneck layer.
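To get an intuition for why squeezing an image through so few values must blur it, here’s a hand-made compression that is **not** an autoencoder, just an illustration: average the image over coarse blocks and blow it back up. Even this 7×7 grid (49 values) is more generous than our 32-unit bottleneck.

```python
import numpy as np

def crude_compress(img, block=4):
    """Average 4x4 pixel blocks (28x28 -> 7x7 grid), then upsample back.

    NOT an autoencoder -- just an illustration of why a tiny
    representation blurs the output: a 7x7 grid holds 49 values,
    already more than our 32-unit bottleneck.
    """
    h, w = img.shape
    small = img.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, block, axis=0), block, axis=1)

# A sharp fake "digit": a bright vertical stroke, deliberately misaligned
# with the 4x4 blocks so that averaging smears its edges.
img = np.zeros((28, 28))
img[4:24, 13:17] = 1.0

recon = crude_compress(img)
print(recon.shape)  # (28, 28)
print(recon.max())  # 0.75 -- the crisp 1.0 stroke has been smeared
```

The real autoencoder is far smarter than block averaging, because it can spend its 32 values on learned digit features rather than on raw pixel averages, but the same information bottleneck is what causes the blur.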
The reason we need to look at _validation samples_ is to make sure we are not _overfitting_ the training set.

**_Bonus_**: _Here’s the training process animation_

![](https://hackernoon.com/hn-images/1*4zP8u3RUYJVwbEj74xkFyQ.gif)

Reconstruction of **training** (left) and **validation** (right) samples at each step

### t-SNE visualization

#### What’s t-SNE?

The first thing we want to do when working with a dataset is to **visualize** the data in a _meaningful_ way. In our case, the **image** _(or pixel)_ **space** has 784 dimensions (_28×28×1_), and we clearly _cannot_ plot that. The challenge is to squeeze all this dimensionality into something we can grasp, in _2D_ or _3D_.

Here comes [t-SNE](http://jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf), an algorithm that maps a **high dimensional space** to a **2D or 3D space** while trying to **preserve local neighborhoods**: points that are close in the original space stay close in the projection. We will use this technique to plot embeddings of our dataset, _first_ directly from the **image space**, and _then_ from the **smaller latent space**.

**_Note:_** _t-SNE is better for visualization than its cousins_ [_PCA_](http://www.cs.cmu.edu/~elaw/papers/pca.pdf) _and_ [_ICA_](http://www2.hawaii.edu/~kyungim/papers/baek_cvprip02.pdf)_._

#### Projecting the pixel space

Let’s start by plotting the t-SNE embedding of our dataset (from image space) and see what it looks like.

![](https://hackernoon.com/hn-images/1*IPmyksaiYuo0fflQSUB4BQ.png)

t-SNE projection of **image space** representations from the validation set

We can already see that some numbers are _roughly_ **clustered** together. That’s because the dataset is really simple\*, and we can use simple _heuristics_ on pixels to classify the samples.
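Producing such a projection with scikit-learn takes a few lines. In this sketch, random vectors stand in for the flattened MNIST images; in the real setup `X` would be `x_val.reshape(-1, 784)` for the pixel-space plot, or the encoder’s 32-dimensional outputs for the latent-space plot:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in data: 200 samples of 784 dimensions, like flattened 28x28 images.
# In the post's setup this would be x_val.reshape(-1, 784) (pixel space)
# or encoder.predict(x_val), shape (n, 32) (latent space).
X = rng.random((200, 784))

# Map 784 dimensions down to 2; perplexity must stay below the sample count.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(X)
print(embedding.shape)  # (200, 2)
```

Each row of `embedding` is a 2D point, so the plot is just a scatter, e.g. `plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10")` to color points by digit.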
Notice how there’s no clear cluster for the digits **8, 5, 7 and 3**: they are all made of roughly the **same pixels**, and only minor changes differentiate them.

_\*On more complex data, such as_ [_RGB images_](https://www.cs.toronto.edu/~kriz/cifar.html)_, the_ **_only clusters_** _would be of images of the_ **_same general color_**_._

#### Projecting the latent space

We know that the _latent space_ contains **a simpler representation** of our images than the pixel space, so we can hope that t-SNE will give us an interesting **2D projection of the latent space**.

![](https://hackernoon.com/hn-images/1*_RLj3E4Lt8cZzlwtmcbqlA.png)

t-SNE projection of **latent space** representations from the validation set

Although _not perfect_, the projection shows **denser** clusters, which means that in the latent space, the same digits lie close to one another. We can see that the digits **8, 7, 5 and 3** are now easier to distinguish, and appear in _small_ clusters.

### Interpolation

Now that we know what **level of detail** the model is capable of extracting, we can _probe_ the structure of the latent space. To do that, we will compare how **interpolation** looks in the _image space_ versus the _latent space_.

#### Linear interpolation in image space

We start off by taking **two images from the dataset** and linearly interpolating between them. Effectively, this _blends_ the images in a kind of **ghostly** way.

![](https://hackernoon.com/hn-images/1*PCEH9lP4BvXxZ1U4ikqgCQ.png)

![](https://hackernoon.com/hn-images/1*AmMmPP2KTzw_MEbdRAS3hw.png)

![](https://hackernoon.com/hn-images/1*kT5ZbcJuLejn67HsWVADcQ.png)

Interpolation in **pixel space**

The reason for this messy transition is the **structure of the pixel space itself.** It’s simply not possible to go smoothly from one image to another in the image space.
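In code, linear interpolation is the same one-liner whether the vectors are raw pixels or latent codes; only the space it runs in changes the result. A sketch with stand-in vectors (the `encoder`/`decoder` names in the comment refer to whatever model you trained):

```python
import numpy as np

def lerp(a, b, t):
    """Linear interpolation: t=0 returns a, t=1 returns b."""
    return (1.0 - t) * a + t * b

# Two stand-in flattened "images" (any vectors work: raw pixels or latent codes).
a = np.zeros(784)
b = np.ones(784)

# Pixel-space version: blending the images gives the ghostly overlay above.
frames = [lerp(a, b, t) for t in np.linspace(0.0, 1.0, 10)]

# Latent-space version (sketch, using your model's encoder/decoder):
#   z_a, z_b = encoder.predict(img_a), encoder.predict(img_b)
#   frames = [decoder.predict(lerp(z_a, z_b, t)) for t in np.linspace(0, 1, 10)]
print(frames[0][0], frames[-1][0])  # 0.0 1.0
```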
This is why blending the image of _an empty glass_ and the image of a _full glass_ will not give the image of a _half-full glass_.

#### Linear interpolation in latent space

Now, let’s do the same in the latent space. We take the same start and end images and **feed them to the encoder** to obtain their _latent space representations._ We then interpolate between the two latent vectors, and feed these to the **decoder**.

![](https://hackernoon.com/hn-images/1*bOdlSOh6MfMJb0s-7oIW9Q.png)

![](https://hackernoon.com/hn-images/1*uNpKu6L8ekkbSdiJLeTRZA.png)

![](https://hackernoon.com/hn-images/1*M6o7VPlxQL4IsxBLUgb37w.png)

Interpolation in **latent space**

The result is much **more convincing**. Instead of having a _fading overlay_ of the two digits, we clearly see the shape slowly _transform_ from one to the other. This shows how well the latent space **understands the structure** of the images.

**_Bonus:_** here are a few animations of the interpolation in both spaces

![](https://hackernoon.com/hn-images/1*o3UpMhqHRO4xQiQxThrNRQ.gif)

![](https://hackernoon.com/hn-images/1*ai5BR1i9qnHPfrKWBy01wQ.gif)

![](https://hackernoon.com/hn-images/1*vR5kbTi1Icqg8_g6dSE43A.gif)

![](https://hackernoon.com/hn-images/1*vRiXZd55FUkEzjVKql-2qw.gif)

Linear interpolation in **image space** (left) and **latent space** (right)

### More techniques & examples

#### Interpolation examples

On **richer** datasets, and with **better** models, we can get _incredible_ visuals.

![](https://hackernoon.com/hn-images/1*N-sWs5uTF1IYh0DkZlhmyg.jpeg)

3-way **latent space** interpolation for **faces**

![](https://hackernoon.com/hn-images/1*vEZE5VcjUr5RUbt_OWfR_w.gif)

Interpolation of [**3D shapes**](http://3dgan.csail.mit.edu)

#### Latent space arithmetic

We can also do **arithmetic** in the latent space. This means that **instead of interpolating, we can add or subtract** latent space representations.

_For example with faces, man with
glasses - man without glasses + woman without glasses = woman with glasses._ This technique gives mind-blowing results.

![](https://hackernoon.com/hn-images/1*yMFJ-7fokU0Xkx89pSFfew.gif)

Arithmetic on [**3D shapes**](http://3dgan.csail.mit.edu)

**_Note:_** I’ve put a function for that in the code, but it looks terrible on MNIST.

### Conclusions

In this post, we have seen several techniques to visualize the **learned** features _embedded_ in the latent space of an autoencoder neural network. These visualizations help us understand _what_ the network is learning. From there, we can exploit the latent space for **_clustering_**, **_compression_**, and many other applications.

> **_If you like Artificial Intelligence, make sure to_** [**_subscribe to the newsletter_**](http://eepurl.com/cATXvT) **_to receive updates on articles and much more!_**

You can play with the code over there:

[**GitHub - despoisj/LatentSpaceVisualization**: Visualization techniques for the latent space of a convolutional autoencoder in Keras](https://github.com/despoisj/LatentSpaceVisualization)

Thanks for reading this post, stay tuned for more!