In December 1895, Wilhelm Röntgen revealed the bones of his wife’s hand in the first X-ray photograph. “I have seen my death,” she said. This breakthrough had an enormous influence on 20th-century medicine, and the latest advances in Deep Learning open up new possibilities in this field.
Deep Learning has found great success in computer vision and other areas. And now it is actively transforming the world of medicine. AI helps doctors make more accurate diagnoses faster.
Today we would like to share our thoughts and investigations into a very promising direction: Human in the loop AI for medical image analysis within a single environment, Supervisely.
Our platform lets you manage and annotate data, train neural networks (NNs), apply them for automatic pre-annotation, and then deploy them as an API.
Challenges with medical images
IBM researchers estimate that medical images, as the largest and fastest-growing data source in the healthcare industry, account for at least 90 percent of all medical data.
Challenge 1: data privacy
Medical data is personal and not easy to access. Because of data privacy concerns, most public health centers are reluctant to share it.
Challenge 2: size of annotated data
The annotation process is hard to outsource because only expert physicians can analyze medical images. This limitation leads to high costs and a lack of annotated data.
Challenge 3: quality of annotation tools
Annotation tools that can be used to extract insights from medical images are still limited and in most cases not publicly available, so most of the analysis has to be done manually.
Challenge 4 (consequence of 1 and 2): segmentation challenge
Datasets for segmentation tasks are typically extremely small compared to large public datasets of common images (COCO, PASCAL VOC and so on). With so little data it is difficult to train very deep neural network architectures. Objects of interest can vary in size, shape and position. Combined with “soft” boundaries, this creates additional problems.
We are going to tackle Challenge 3 and Challenge 4: give the industry an end-to-end solution that makes human experts more efficient and automates routine tasks with powerful AI technologies.
We realize that there is still a lot of work ahead: adding more convenient annotation tools and supporting the DICOM format, three-dimensional images, sequences of images and so on. But these are only technical issues; the first steps have already been taken and the results are promising.
We are passionate about accelerating medicine and happy to be part of the global research community that is driving the deep learning revolution in healthcare.
There could be no more important application of this new capability [deep learning] than improving patient care
— Jensen Huang, NVIDIA CEO and co-founder
Case study: blood vessel segmentation in retina images
There are a lot of Deep Learning medical applications in imaging: tumor detection, tracking tumor development, blood flow quantification and visualization, dental radiology and much more.
Because we are not doctors, we looked for data we could more or less understand. That’s why we decided to research blood vessel segmentation. Let’s take a look at one of the most popular public datasets in this field: STARE (STructured Analysis of the Retina).
The dataset contains 28 annotated images with a resolution of 999 × 960. We consider the case where only 6 annotated images are available for training. The other images are used for the final quality evaluation. All training images are below:
This scenario is pretty close to the real world: a medical doctor annotates a few images, then a neural network is trained on this data and applied to other images for pre-segmentation. Then the doctor just corrects the NN predictions.
This approach is called Human in the loop AI. It aims to significantly increase the efficiency of the human expert.
P.S. Thanks to Supervisely, the entire study took 2 hours without haste ☕.
Step 1: training data augmentation
We had only 6 annotated images. To train the NN we have to automatically increase the size of the dataset. Supervisely has a special module for augmentations: DTL (Data Transformation Language). It lets you configure the entire augmentation process in a simple JSON-based format and run it in a few clicks.
In this use case we applied horizontal/vertical flips and relatively big random crops. We got 264 training examples from only 6 annotated images. Here is the visualization of the computational graph that we applied to our data:
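As a rough code-level counterpart to that graph, here is a minimal NumPy sketch of flips combined with random crops. It is not DTL and not our actual configuration: the function name `augment`, the crop fraction of 0.8, and the split into 4 flip variants × 11 crops per variant are all assumptions chosen for illustration (4 × 11 = 44 examples per image is simply one combination that gives 264 examples from 6 images).

```python
import numpy as np

def augment(image, mask, crops_per_variant=11, crop_frac=0.8, seed=0):
    """Return (image, mask) pairs built from flips plus random crops.

    All parameter values are illustrative assumptions, not the settings
    used in the experiment described in the post.
    """
    rng = np.random.default_rng(seed)
    # Four flip variants: original, horizontal, vertical, both.
    # The same flip is applied to the image and to its mask.
    variants = [
        (image, mask),
        (image[:, ::-1], mask[:, ::-1]),
        (image[::-1, :], mask[::-1, :]),
        (image[::-1, ::-1], mask[::-1, ::-1]),
    ]
    height, width = mask.shape
    crop_h, crop_w = int(height * crop_frac), int(width * crop_frac)
    pairs = []
    for img, msk in variants:
        for _ in range(crops_per_variant):
            # Pick a random top-left corner so the crop stays inside the image.
            top = int(rng.integers(0, height - crop_h + 1))
            left = int(rng.integers(0, width - crop_w + 1))
            pairs.append((img[top:top + crop_h, left:left + crop_w],
                          msk[top:top + crop_h, left:left + crop_w]))
    return pairs

# Six dummy "annotated images" stand in for the real training set.
dataset = []
for i in range(6):
    dummy_image = np.zeros((96, 100, 3), dtype=np.uint8)
    dummy_mask = np.zeros((96, 100), dtype=np.uint8)
    dataset.extend(augment(dummy_image, dummy_mask, seed=i))
print(len(dataset))  # 6 images * 4 flips * 11 crops = 264
```

The important detail is that every geometric transform is applied identically to the image and its vessel mask; otherwise the annotation no longer matches the pixels.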
Step 2: train the neural network
Supervisely provides several state-of-the-art neural networks for semantic segmentation. One of them is our custom UNet-like architecture. We chose it because our training dataset is small and the network is accurate and fast to train. We also use a combination of Binary Cross Entropy and Dice losses to deal with class imbalance: vessel pixels cover only a few percent of the image area, while the rest is background.
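To make the loss choice more concrete, here is a minimal NumPy sketch of a combined Binary Cross Entropy + Dice loss. It only illustrates the idea and is not the code used inside Supervisely: the function name `bce_dice_loss`, the equal weighting of the two terms (`dice_weight=1.0`), and the smoothing constant `eps` are assumptions.

```python
import numpy as np

def bce_dice_loss(probs, target, dice_weight=1.0, eps=1e-7):
    """Binary cross entropy plus (1 - soft Dice) for a binary mask.

    probs  : predicted vessel probabilities in [0, 1]
    target : ground-truth mask with values 0 or 1
    dice_weight and eps are illustrative assumptions.
    """
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)
    # Pixel-wise cross entropy, averaged over all pixels.
    bce = -np.mean(target * np.log(probs) + (1.0 - target) * np.log(1.0 - probs))
    # Soft Dice coefficient: overlap relative to the total "vessel mass".
    intersection = np.sum(probs * target)
    dice = (2.0 * intersection + eps) / (np.sum(probs) + np.sum(target) + eps)
    return bce + dice_weight * (1.0 - dice)

# A 100 x 100 mask where 3% of the pixels are vessels.
target = np.zeros((100, 100))
target[:3, :] = 1.0

# A "lazy" prediction that says "almost certainly background" everywhere.
lazy = np.full_like(target, 0.01)
print(bce_dice_loss(lazy, target, dice_weight=0.0))  # BCE term only
print(bce_dice_loss(lazy, target))                   # BCE + Dice
```

With heavily imbalanced masks, a prediction that ignores the vessels can still get a fairly low cross entropy because most pixels really are background. The Dice term measures overlap with the vessel pixels only, so it keeps the loss high until the vessels are actually found.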
We trained the NN for 50 epochs. It is interesting to visualize the neural network's predictions during training: we take an unseen image and apply the NN after each epoch. Here you can see how our NN becomes smarter over time.
Supervisely supports multi-GPU training. Each epoch takes around 20 seconds on four GPUs, so the total training time is around 17 minutes (50 epochs × 20 seconds).
Step 3: automatic pre-segmentation
We applied the NN to new images. Let’s compare predictions with ground truth.
As you can see from this comparison, all relatively thick vessels are segmented, and there is no noise. This means the human only has to draw a few hairline vessels with the “polyline” tool.
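The comparison here is visual, and we do not report a numeric score. If you want to put numbers on such a comparison, one common option is to compute overlap metrics between the predicted mask and the ground truth. The sketch below is an assumption about how that could be done, not part of our experiment; the function name `overlap_scores` and the toy masks are made up for illustration.

```python
import numpy as np

def overlap_scores(pred, truth):
    """Compare a binary prediction with a binary ground-truth mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.logical_and(pred, truth).sum())    # vessel pixels found
    fp = int(np.logical_and(pred, ~truth).sum())   # "noise" pixels
    fn = int(np.logical_and(~pred, truth).sum())   # missed vessel pixels
    union = tp + fp + fn
    return {
        "dice": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
        "iou": tp / union if union else 1.0,
        "precision": tp / (tp + fp) if (tp + fp) else 1.0,
        "recall": tp / (tp + fn) if (tp + fn) else 1.0,
    }

# Toy example: the prediction has no false positives but misses one pixel.
truth = [[1, 1, 0, 0],
         [0, 1, 0, 0]]
pred = [[1, 1, 0, 0],
        [0, 0, 0, 0]]
print(overlap_scores(pred, truth))
```

In these terms, "no noise" corresponds to high precision, while the hairline vessels that still need to be drawn by hand show up as lower recall.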
As far as we understand, real clinical data has a much higher resolution than the public data used in this experiment. We think this is crucial for the quality of hairline vessel segmentation: the resolution of publicly available images is not enough. Look at this example: do you see the vessels that are annotated by doctors?
Step 4: manual correction
As you can see from the images above, the quality of automatic pre-annotation is pretty good. It is much easier and faster to correct NN predictions than to annotate from scratch manually.
We were not lazy and measured the time needed for manual annotation from scratch versus correction of NN predictions. Manual annotation from scratch: 36 minutes / image. Correction of NN predictions: 4 minutes / image.
The conclusion is obvious: correcting predictions is about nine times faster.
Deep Learning has huge potential in medical image analysis. AI is changing the way doctors diagnose illnesses.
The main difference between a doctor and a deep learning algorithm is that the doctor has to sleep. A neural network can process millions of images and can be continuously improved.
The Human in the loop approach and automatic segmentation with Supervisely will let us create large datasets faster. All steps are done without coding, which means that users with no ML background have access to state-of-the-art AI. As a result, the ML community will be able to build more services that help doctors provide better and quicker treatment.
Let’s make the future together.