Let’s continue our journey to build a travel recommendation engine. You can find part 1 of the series on my blog. After reading this post, you will know how to turn a model trained for classification into one that extracts image feature vectors. Then we’ll walk through how to compute the similarity between two images from their feature vectors. Finally, we will generate a travel recommendation by picking the most similar image.
For the best learning experience, I suggest opening the Colab Notebook while reading this tutorial.
The engine we are going to build is a content-based recommendation engine. If a user likes a destination photo, the system will show them an image of a similar travel destination.
In our previous post, the model was built to classify an input image into one of 365 place/scene categories.
We are going to remove the last 4 layers responsible for place logits generation and only keep the “feature extractor” part of the network.
In Keras we can pop out the last 4 layers like this.
The last line above saves the model weights to a file for later use. Next time, we only need to define the model without the 4 classifier layers and initialize the network with the saved weights.
If you run model.predict again, you will notice that the output is no longer a vector of 365 floating point numbers. Instead, it is now a vector of 4096 floating point numbers. This is the feature vector, an abstract representation of the input image.
Our recommendation engine takes a query image liked by a user and recommends a similar place.
The similarity between two images is computed by measuring the distance between their two feature vectors.
You can picture this as measuring the distance between two points in a 4096-dimensional space. The smaller the distance, the more similar the two images are to each other.
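To make the idea of distance concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance, which is only one possible choice of metric, and it uses made-up 4-dimensional vectors in place of the real 4096-dimensional ones. The function name `euclidean_distance` and the sample vectors are illustrative, not taken from the notebook.

```python
import math


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy vectors standing in for real feature vectors (assumed values).
canyon_a = [0.9, 0.1, 0.0, 0.3]
canyon_b = [0.8, 0.2, 0.0, 0.4]
beach = [0.0, 0.9, 0.7, 0.1]

# The two canyon vectors are close together; the beach vector is far away.
print(euclidean_distance(canyon_a, canyon_b))
print(euclidean_distance(canyon_a, beach))
```

Cosine distance is a common alternative when only the direction of the vectors matters and not their length.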
We could compute the feature vectors of all known images at runtime and compare them with the query image’s feature vector. But this would be inefficient, since we would be computing the same values again and again. A faster alternative is to pre-compute those feature vectors and store them in memory. At runtime, we only need to compute the query image’s feature vector, and only if it has not been computed before. That saves a lot of time, especially when you have lots of images to compare against.
Here is the function to compute an image’s feature vector by calling the feature extractor model’s predict function.
And we have another function to pre-compute the feature vectors of all known images and store them in memory. This only needs to be done once.
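The pre-compute-and-cache idea can be sketched as follows. This is not the notebook's code: the names `precompute_feature_vectors` and `get_feature_vector`, the stand-in extractor, and the file paths are all hypothetical, chosen so the example runs without a model. In practice, the `extract_features` argument would be the function that calls the feature extractor model’s predict function.

```python
def precompute_feature_vectors(image_paths, extract_features):
    """Run the extractor once per known image and keep the results in a dict."""
    return {path: extract_features(path) for path in image_paths}


def get_feature_vector(path, cache, extract_features):
    """Return the cached vector for path, computing it only on a cache miss."""
    if path not in cache:
        cache[path] = extract_features(path)
    return cache[path]


def fake_extract_features(path):
    # Stand-in for the real extractor (assumption): returns a tiny vector
    # derived from the path so the sketch runs without loading any model.
    return [float(len(path)), float(path.count("a"))]


# One-off pre-computation over the known images (hypothetical paths).
known_vectors = precompute_feature_vectors(
    ["images/canyon1.jpg", "images/beach1.jpg"], fake_extract_features
)

# The first call computes and stores the vector; a repeat call reuses it.
query_vector = get_feature_vector("images/query.jpg", known_vectors, fake_extract_features)
```

Keeping the vectors in a plain dictionary keyed by file path is enough for a small image set; a larger collection would likely call for saving them to disk.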
Here is the function we execute during runtime to search and display the most similar image to a new query image.
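A search of this kind boils down to a nearest-neighbour lookup over the stored vectors. The sketch below is a simplified illustration and not the notebook's `get_similar_photo`: the function name `find_most_similar`, the dictionary of toy vectors, and the choice of Euclidean distance (via `math.dist`, available from Python 3.8) are assumptions.

```python
import math


def find_most_similar(query_vector, known_vectors):
    """Return (name, distance) for the stored vector closest to the query."""
    if not known_vectors:
        raise ValueError("no known feature vectors to compare against")
    best_name = min(
        known_vectors,
        key=lambda name: math.dist(query_vector, known_vectors[name]),
    )
    return best_name, math.dist(query_vector, known_vectors[best_name])


# Toy pre-computed vectors keyed by image name (assumed values).
known_vectors = {
    "canyon": [0.9, 0.1, 0.0],
    "beach": [0.0, 0.9, 0.7],
    "forest": [0.2, 0.3, 0.9],
}

name, distance = find_most_similar([0.85, 0.15, 0.05], known_vectors)
print(name, distance)
```

This scans every stored vector on each query, which is fine for a few thousand images; a larger catalogue would probably need an approximate nearest-neighbour index instead.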
Let’s give it a try by running the following line.
get_similar_photo("images/canyon2.jpg")[0].name
And here is the result, our model recommends a similar photo to our queried image.
If you play with the recommendation engine, you may notice that it generates wrong recommendations once in a while.
There are two reasons:
1. The model was trained for classification, so the feature extractor part of the network was optimized for classifying images into 365 classes, not for telling similar images apart.
2. The model was trained on image datasets distributed among 365 classes of places. The training set might not have enough images of a particular type of beach, or of one place in different seasons.
One solution to the first problem is to use a siamese network with triplet loss, which is popular in face verification tasks. The model is then trained to identify whether two images are from the same place. You can check out the video introduction to this concept on Coursera; I found it very helpful.
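For readers unfamiliar with triplet loss, here is a minimal sketch of the commonly used formulation on plain Python lists. It assumes squared Euclidean distance between embeddings and a margin of 0.2; both are illustrative choices and neither comes from this post. The loss is zero once the anchor is closer to the positive (same place) than to the negative (different place) by at least the margin.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    return max(
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
        0.0,
    )


# Toy 2-dimensional embeddings (assumed values).
anchor = [0.0, 0.0]
same_place = [0.1, 0.0]
other_place = [1.0, 0.0]

# Already well separated, so there is nothing left to penalize.
print(triplet_loss(anchor, same_place, other_place))
# Roles swapped: the "positive" is farther away than the "negative".
print(triplet_loss(anchor, other_place, same_place))
```

During training, this value would be averaged over many triplets and minimized, pushing images of the same place together and images of different places apart.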
The solution to the second problem is to apply transfer learning to our model by “freezing” some of the earlier convolutional layers and training the rest of the model parameters on our custom image datasets. Transfer learning is a great way to leverage the general features learned from large image datasets when training a new image model.
By now you have had a taste of what deep learning can do, along with some hands-on experience building and running a Keras model. The journey to master any technology is not easy, and deep learning is no exception. That is what initially motivated me to create this blog: to share and teach what I have learned along the way to becoming better at applying deep learning to real-life problems. Don’t hesitate to reach out to me personally if you are looking for a solution or simply want to say hello.
Find me on GitHub, LinkedIn, WeChat, Twitter or Facebook.
Originally published at www.dlology.com.