Quick recap:
- I was able to reach ~75% accuracy with a resnet32 model trained on ~80 images.
- I wanted to push this result further using data augmentation, more data, and maybe a better choice of model.
Taking it forward from part 1, I decided to do the following:
Step 1: Try to identify ways to add more images belonging to either class. On second thought, I decided against that, as I wanted this to remain a limited-data problem.
Step 2: Improve data augmentation. This was my key tactic. I changed the image size to 512 x 512 and zoomed the images to 2.0 times their original size. The intuition: the images the model failed to identify correctly contained a high level of fine detail, and that detail is exactly what we want the model to train on. A higher zoom should give the model a better chance of getting those details right. If you look at the images below, all of them are confusing in themselves, which is why the model tagged wildfires as "others"; a higher zoom should allow the DL algorithm to look into the details of an image.
Step 3: Change the model architecture. I moved from resnet32 to resnet50. The intuition: this is a complicated problem with a limited number of training samples, so we need more layers to account for higher levels of abstraction.
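To make the zoom intuition from Step 2 concrete, here is a minimal sketch in plain Python of which part of an image survives a zoom. This is not the augmentation code used in this project: the function name `center_zoom_box` is hypothetical, and it assumes a fixed zoom around the image centre, whereas a real augmentation library may pick the zoom amount and centre at random.

```python
from typing import Tuple


def center_zoom_box(size: int, zoom: float) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the square region that fills the
    frame when a size x size image is zoomed by `zoom` around its centre."""
    if zoom < 1.0:
        raise ValueError("zoom must be >= 1.0")
    visible = round(size / zoom)    # side length of the region still visible
    offset = (size - visible) // 2  # equal margin cropped from each side
    return (offset, offset, offset + visible, offset + visible)


# Assumed settings from the text: 512 x 512 images, 2.0x zoom.
box = center_zoom_box(512, 2.0)
left, top, right, bottom = box
kept_fraction = ((right - left) * (bottom - top)) / (512 * 512)
print(box, kept_fraction)  # (128, 128, 384, 384) 0.25
```

Under these assumptions, the central 256 x 256 region is stretched over the full 512 x 512 frame, so every detail in it gets four times as many pixels. The trade-off is that the outer three quarters of the scene are thrown away for that sample.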
After this, I proceeded with the other steps of model development: running the LR finder, training the final layers, and then retraining the weights of all the layers.
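For readers who have not used an LR finder, here is a rough sketch of the idea on a toy one-parameter problem, in plain Python. It is not the finder used in this project: the quadratic loss, the function names, and the divide-by-ten rule of thumb are all assumptions chosen for illustration.

```python
def loss(w: float) -> float:
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def grad(w: float) -> float:
    """Derivative of the toy loss."""
    return 2.0 * (w - 3.0)


def lr_range_test(start_lr: float = 1e-4, end_lr: float = 10.0, steps: int = 100):
    """Take one gradient step per learning rate, growing the rate
    exponentially, and record (lr, loss after that step)."""
    w = 0.0
    history = []
    for i in range(steps):
        lr = start_lr * (end_lr / start_lr) ** (i / (steps - 1))
        w -= lr * grad(w)
        history.append((lr, loss(w)))
    return history


history = lr_range_test()
best_lr, best_loss = min(history, key=lambda pair: pair[1])
# Rule of thumb (an assumption here): stay well below the rate at which
# the loss bottoms out, because just past that point the loss blows up.
suggested_lr = best_lr / 10
print(f"loss bottoms out near lr={best_lr:.3f}; suggested lr={suggested_lr:.4f}")
```

The recorded loss falls while the learning rate is small, bottoms out, and then explodes once the steps become too large. The finder only tells you where that edge is; how far below it to train is still a judgment call.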
I needed to set the learning rate even lower than what the learning rate finder indicated; otherwise, the model was not able to reach the minimum of the loss function. Once this was done, the final results looked like the ones below.
Voila! I was able to hit an accuracy of ~81.25%. This is an increase of roughly 625 bps (6.25 percentage points) over the last training run using resnet32.
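Why a learning rate lower than the suggested one can help is easy to see on a toy problem. The sketch below uses plain gradient descent on a one-parameter quadratic loss; the loss, the `train` helper, and the three learning rates are assumptions picked for illustration and have nothing to do with the actual resnet50 training run.

```python
def train(lr: float, steps: int = 50, w: float = 0.0) -> float:
    """Run plain gradient descent on loss(w) = (w - 3)^2 and return the final loss."""
    for _ in range(steps):
        w -= lr * 2.0 * (w - 3.0)  # gradient of (w - 3)^2 is 2 * (w - 3)
    return (w - 3.0) ** 2


# Slightly too large, just under the edge, and comfortably small.
for lr in (1.05, 0.95, 0.1):
    print(f"lr={lr}: final loss {train(lr):.3e}")
```

With the largest rate, every step overshoots the minimum by more than the previous error, so the loss grows instead of shrinking. The rate just under the edge does converge, but it bounces from one side of the minimum to the other and ends up far less precise than the small rate after the same number of steps. A rate that is too small has its own cost, since it needs more steps to get anywhere.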
Learnings from the mini project:
- When in doubt, try to simplify things. Try to remove levels of abstraction and think from fundamentals. (A complicated problem may need a complicated solution, for example moving from resnet32 to resnet50.)
- If your training sample may confuse a human, it may confuse a machine or algorithm too. If there is a chance that a human will mistake a snowstorm for a wildfire in a satellite image, then a machine will too. What would a human do? A human would take time to reach a conclusion (low learning rate), and looking further into the details of the image may help (zooming into the image).
Hence, this exercise re-emphasizes that we need to know our data and problem area well before we can develop any successful predictive algorithm.