Identifying patterns and extracting features on images using deep learning models
The SPCA takes in 7,000 to 9,000 animals each year in Singapore. Half of them are abandoned pets like cats, dogs, rabbits, and guinea pigs. Identifying each animal's breed takes time, which prolongs its wait to be listed for adoption.
NParks has a group of volunteers who meet regularly for bird-watching activities. At the same time they help collect data on the avian population in Singapore, but not all of them can identify the bird species correctly.
My project goal is to develop an identification tool for these two organisations: to identify animal breeds for the SPCA, and to identify avian species for NParks.
This goal can be translated into an image classification problem for deep learning models. So I explored a simple neural network, and then progressed to a convolutional neural network and transfer learning.
Simple Neural Network
I started with image classification using a simple neural network. The dataset is from pyimagesearch and has three classes: cat, dog, and panda. There are 3,000 images in total, i.e. 1,000 per class.
I set up a simple neural network model with only one dense layer in the middle; it took about four minutes to train. The model achieved 61% accuracy, and I was ready to test it on new images.
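A minimal Keras sketch of a network like this (the input size, hidden-layer width, and optimizer here are assumptions for illustration, not the exact settings used):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Simple neural network: flatten each image into a vector, pass it through
# one hidden dense layer, and output probabilities over the three classes.
# (32x32 RGB input and 1024 hidden units are assumed values.)
model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Flatten(),                        # 32*32*3 = 3072 input features
    layers.Dense(1024, activation="relu"),   # the single hidden dense layer
    layers.Dense(3, activation="softmax"),   # cat / dog / panda probabilities
])
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Calling `model.fit(...)` on the labelled images would then train it; `model.predict(...)` returns a probability for each of the three classes.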
I fed these images to the model, and the simple neural network classified each of them according to the highest predicted probability.
For example, the model was 58% sure that this is a panda. But it has legs, so there is a small chance it could be a cat or a dog as well.
When I got more adventurous with challenging images, however, the simple neural network was unable to make the correct classification. So I moved on to a convolutional neural network (CNN), which is well known to be good at image classification, and it classified them correctly.
Convolutional Neural Network
What is a CNN model then?
CNN stands for Convolutional Neural Network: each image passes through a series of convolution and max-pooling layers for feature extraction. I explored this using the CIFAR-10 dataset, which has 60,000 images divided into 10 classes.
With so many images, the model took almost four hours to train, and achieved an accuracy of 75%.
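A CNN of this kind can be sketched as below; the number of filters and the dense-layer width are assumptions for illustration, not the exact architecture trained.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Convolutional neural network: stacked convolution + max-pooling blocks
# extract features; dense layers at the end perform the classification.
# CIFAR-10 images are 32x32 RGB, with 10 output classes.
model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),             # downsample to 16x16
    layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),             # downsample to 8x8
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),  # the 10 CIFAR-10 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The CIFAR-10 data itself can be loaded with `keras.datasets.cifar10.load_data()`, which returns the train and test splits ready for `model.fit(...)`.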
So I was ready to test the model, using unseen images from a Google search. The CNN model was able to make the correct prediction most of the time: for example, the model was quite sure that this is an airplane, and 72% sure that this is a ship.
It also identified this deer and this horse based on the highest predicted probability.
Incidentally, there is some chance that this horse could be a deer or a frog, because of certain features picked up by the model. To improve classification accuracy, I needed more data: I needed to train the model on a larger dataset.
Next I explored a huge dataset of over a million images. However, such a model would take a long time to train with my limited resources, so I turned to transfer learning to avoid reinventing the wheel. I used the VGG16 pre-trained model developed by the University of Oxford, which covers 1,000 classes ranging from animals to objects and food. Oxford spent a lot of GPU processing power, time, and resources training this model.
Let’s test the model by feeding it images I downloaded from a Google search (so I know the answers).
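With Keras, loading the pre-trained VGG16 and classifying one downloaded image looks roughly like this (the file name passed to `classify` is a placeholder for whichever image is being tested):

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import (
    VGG16, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

# Load VGG16 with its pre-trained ImageNet weights
# (downloaded automatically on first use).
model = VGG16(weights="imagenet")

def classify(img_path, top=3):
    """Return the top predicted ImageNet labels for one image file."""
    img = image.load_img(img_path, target_size=(224, 224))  # VGG16 expects 224x224 input
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    preds = model.predict(x)
    return decode_predictions(preds, top=top)[0]  # list of (id, label, probability)
```

For example, `classify("basset.jpg")` (a hypothetical downloaded photo) would return the model's top three labels with their probabilities.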
These are quite similar images, but the model was able to classify them by breed. Notice that the Hush Puppies dog (a Basset Hound) on the left has more distinct features, so the model was also more certain in its classification.
How about these cats? Oxford has already trained the VGG16 model on many cat breeds, and the model had no problem classifying them.
How about these birds? Birds seem to have very distinct features, and the model was able to identify their species with very high certainty.
However, not all these birds are found in Singapore.
My next step is to collect many images of common birds and animals found in Singapore and train the model on them, so as to append to the model’s “knowledge database”. This would help improve the classification tool for the two organisations (SPCA and NParks).
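One common way to do this (an assumed approach, not the project's final design) is to freeze VGG16's convolutional base and train a new classification head on images of local species; the head sizes and class count below are hypothetical:

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications.vgg16 import VGG16

# Reuse VGG16's pre-trained feature extractor, but drop its 1000-class
# top and freeze its weights so only the new head is trained.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

num_local_classes = 20  # hypothetical count of local bird/animal classes
model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(num_local_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Training this head on labelled local images is far cheaper than training VGG16 from scratch, since the frozen base already knows how to extract general visual features.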
Image classification can be done with neural network models. Identifying patterns and extracting features in images is what deep learning models do, and they do it very well.
“The model is as intelligent as you train it to be”