Among individuals with diabetes, the prevalence of diabetic retinopathy (DR) is approximately 28.5% in the United States and 18% in India. Globally, the number of people with DR is projected to grow from 126.6 million in 2010 to 191.0 million by 2030. This disease is one of the most frequent causes of visual impairment in developed countries and is the leading cause of new cases of blindness in the working-age population.
Altogether, nearly 75 people go blind every day as a consequence of DR even though treatment is available.
Diabetic retinopathy is a complication of diabetes, caused by high blood sugar levels damaging the retina, the lining at the back of the eye that transforms light into images. It can cause blindness if left undiagnosed and untreated. The World Health Organization (WHO) estimates that around 300 million people will suffer from diabetes by the year 2025.
A normal retina
A retina showing signs of diabetic retinopathy.
Regarding decision making, automatic DR screening systems partially follow clinical protocols.
Automated grading of diabetic retinopathy has potential benefits such as increasing efficiency, reproducibility, and coverage of screening programs; reducing barriers to access; and improving patient outcomes by providing early detection and treatment. To maximize the clinical utility of automated grading, an algorithm to detect referable diabetic retinopathy is needed.
We will be developing an automated system for classification of Diabetic Retinopathy leveraging the power of machine learning.
The approach classifies images based on characteristic features extracted by lesion detection and anatomical part recognition algorithms. Treating the specificity of the screening as a matter of efficiency, we show how both sensitivity and specificity can be kept at a high level by combining novel screening features with a decision-making process.
DR Classification Workflow
I obtained the dataset from the UCI Machine Learning Repository. It consists of 1151 instances and 20 attributes. All features represent either a detected lesion, a descriptive feature of an anatomical part, or an image-level descriptor. The task is to separate the instances of patients with diabetic retinopathy from those who are not affected by it. Below is the list of attributes we considered for this patient segmentation; for a detailed description of the features, refer to this paper:
To start with, the dataset had no missing values. The initial parameters were image quality assessment and pre-screening. For image quality assessment, we assumed that the features extracted by the image processing algorithms were of sufficient quality for every patient, with 4 exception cases. Our assumption was that all cases are homogeneous, ignoring the 4 exceptions against the 1147 others.
Pre-screening results showed that 92% of the cases were patients with severe retinal abnormality, while the rest lacked it.
After evaluating the rest of the features, we came to the conclusion that only two sets of features held high importance. The first set consisted of 6 features from microaneurysm detection at confidence levels with alpha ranging from 0.5 to 1.0.
Normal retina vs. retina with microaneurysms detected
Fewer microaneurysms are detected as the confidence level increases.
As the confidence level gradually decreases, we can observe a higher number of microaneurysms detected.
The second set consists of 8 features giving the same detection information for exudates, at confidence levels with alpha ranging from 0.5 to 1.2.
As the confidence level increases, the number of exudates detected decreases.
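The two feature groups described above can be pulled out of a record with a few lines of Python. This is a minimal sketch: the column positions are an assumption based on the order in which the UCI description lists the attributes, and the sample record is invented for illustration rather than taken from the dataset.

```python
# Column layout ASSUMED from the UCI dataset description (0-based indices).
QUALITY, PRE_SCREENING = 0, 1
MA_COLUMNS = range(2, 8)        # 6 microaneurysm counts, one per confidence level
EXUDATE_COLUMNS = range(8, 16)  # 8 exudate features, one per confidence level


def split_feature_groups(row):
    """Return the microaneurysm and exudate feature groups of one record."""
    ma = [row[i] for i in MA_COLUMNS]
    exudates = [row[i] for i in EXUDATE_COLUMNS]
    return ma, exudates


def is_non_increasing(values):
    """True if every value is less than or equal to the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Hypothetical 20-attribute record, invented for illustration only.
row = [1, 1,
       30, 28, 27, 24, 20, 15,
       40.0, 12.5, 4.0, 1.0, 0.2, 0.1, 0.0, 0.0,
       0.5, 0.1, 0, 1]

ma, exudates = split_feature_groups(row)
print(len(ma), len(exudates))       # 6 8
print(is_non_increasing(ma))        # True
print(is_non_increasing(exudates))  # True
```

The `is_non_increasing` check mirrors the observation that fewer lesions are detected as the confidence level rises, so it can double as a quick sanity check on each record.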
I tried and tested a number of baseline classifiers for this purpose, namely Naive Bayes, Random Forest, Support Vector Machines and Decision Tree (CART). These classifiers failed to give satisfactory results.
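A baseline comparison of this kind could be sketched with scikit-learn as follows. The data here is synthetic (generated with `make_classification` to have 1151 samples and 19 input features, assuming the 20 attributes are 19 features plus the class label), and the hyperparameters are library defaults rather than the settings used in my experiments, so the printed scores say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real features (ASSUMPTION: 19 inputs + 1 label).
X, y = make_classification(n_samples=1151, n_features=19,
                           n_informative=8, random_state=0)

baselines = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
    "Decision Tree (CART)": DecisionTreeClassifier(random_state=0),
}

# 5-fold cross-validated accuracy for each baseline.
baseline_scores = {}
for name, model in baselines.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    baseline_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```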
For medical decision support, ensemble methods have been successfully applied in several fields. Experimenting with the parameters of AdaBoost and Gradient Boosting Machine produced a sudden increase in accuracy, but we judge model performance by sensitivity and specificity, where no significant improvement was observed.
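Because accuracy alone can hide a weak sensitivity or specificity, it helps to score all three in one pass. The sketch below does this for the two boosting models using scikit-learn's `cross_validate`; sensitivity is computed as recall on class 1 and specificity as recall on class 0. The data is synthetic and the `n_estimators` values are placeholders, not the tuned parameters from my experiments.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the real features (ASSUMPTION: 19 inputs + 1 label).
X, y = make_classification(n_samples=1151, n_features=19,
                           n_informative=8, random_state=0)

# Sensitivity = recall of the positive class, specificity = recall of the negative class.
scoring = {
    "accuracy": "accuracy",
    "sensitivity": make_scorer(recall_score, pos_label=1),
    "specificity": make_scorer(recall_score, pos_label=0),
}

ensembles = {
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
}

ensemble_results = {}
for name, model in ensembles.items():
    cv = cross_validate(model, X, y, cv=5, scoring=scoring)
    ensemble_results[name] = {m: float(cv[f"test_{m}"].mean()) for m in scoring}
    print(name, ensemble_results[name])
```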
Finally, using a Multilayer Perceptron neural network, we were able to achieve a decent accuracy of 78.56%. Our model performance is currently 74.3% in terms of sensitivity and 82.6% in terms of specificity.
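For reference, all three figures come from the same confusion matrix. The sketch below shows the arithmetic on a tiny invented example (label 1 means DR present); the labels and predictions are hypothetical and do not reproduce the numbers reported above.

```python
def screening_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for binary labels (1 = DR)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # DR correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # DR missed
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels and predictions, invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = screening_metrics(y_true, y_pred)
print(metrics)  # sensitivity 0.75, specificity about 0.833, accuracy 0.8
```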
General Architecture of Multilayer Perceptron
I have used L-BFGS as the solver because it converges faster and finds better solutions on small datasets. It does not support mini-batch learning.
The Multilayer Perceptron is sensitive to feature scaling, so I have used standard scaling for best results.
Different weight initializations can lead to different validation accuracy, because an MLP with hidden layers has a non-convex loss function.
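The three points above (L-BFGS solver, standard scaling, sensitivity to initialization) can be put together in a short scikit-learn sketch. The hidden layer size, `max_iter`, the 75/25 split and the synthetic data are all assumptions made for illustration; they are not the configuration behind the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real features (ASSUMPTION: 19 inputs + 1 label).
X, y = make_classification(n_samples=1151, n_features=19,
                           n_informative=8, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Train the same architecture from several random initializations.
val_accuracy = {}
for seed in (0, 1, 2):
    model = make_pipeline(
        StandardScaler(),  # fitted on the training split only
        MLPClassifier(hidden_layer_sizes=(16,),  # hypothetical layer size
                      solver="lbfgs",            # full-batch quasi-Newton solver
                      max_iter=500,
                      random_state=seed),
    )
    model.fit(X_train, y_train)
    val_accuracy[seed] = model.score(X_val, y_val)
    print(f"seed={seed}: validation accuracy = {val_accuracy[seed]:.3f}")
```

Wrapping the scaler and the network in one pipeline keeps the scaling statistics from leaking out of the training split, and looping over `random_state` makes the effect of the initialization visible.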
The ROC curve shows the tradeoff between sensitivity and specificity across different threshold settings of the classifier, which is useful for understanding its performance.
ROC Curve
The above ROC curve shows the tradeoff between sensitivity and specificity, which are our primary parameters. The area under the curve (AUC) is 0.2154.
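To make the construction of the curve concrete, here is a minimal pure-Python sketch that sweeps a threshold over classifier scores, records the true positive rate (sensitivity) against the false positive rate (1 minus specificity), and integrates the result with the trapezoidal rule. The labels and scores are a small invented example, not output from my model.

```python
def roc_curve_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Sweep the threshold from the highest score down to the lowest.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and scores, invented for illustration.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = area_under_curve(roc_curve_points(y_true, scores))
print(auc)  # 0.75

# Scoring the same cases with the ranking reversed mirrors the AUC around 0.5.
flipped = [1 - s for s in scores]
auc_flipped = area_under_curve(roc_curve_points(y_true, flipped))
print(auc_flipped)  # 0.25
```

With this convention, 0.5 corresponds to a random ranking and 1.0 to a perfect one. Reversing the scores turns 0.75 into 0.25, which is why an AUC below 0.5 usually points to an inverted score or label convention rather than to a classifier with no signal.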
A highly sensitive test means that there are few false negative results, and thus fewer cases of the disease are missed.
A highly specific test means that there are few false positive results. It may not be feasible to use a test with low specificity for screening, since many people without the disease will screen positive, and potentially receive unnecessary diagnostic procedures.
Unlike the state-of-the-art methods, I have used image-level, lesion-specific and anatomical components at the same time. My approach has been validated on the publicly available Messidor dataset.
The sensitivity/specificity results (74%/83%) we have achieved come close to, though remain below, the recommendations of the British Diabetic Association (BDA) (80%/95%) for DR screening (BDA, 1997).
These results strengthen the idea that an MLP can be used efficiently as a classifier for detecting eye-related diseases in fundus images. Even with such results and progress, our network won't give the desired results when the exudate area in a particular section of the fundus exceeds that of the optic disc. Given these limitations, further work should derive more features and develop more efficient systems. A more appropriate approach could be to use a Convolutional Neural Network on the Messidor image dataset instead of applying image processing algorithms.
If you have followed the blog this far and are interested in the code, please visit this link. Recommendations and improvements in any form are highly appreciated. Thanks for reading!