Index to “Interviews with ML Heroes”
Alexandre is an MD, a Radiologist, and a Computer Engineer. He is also a Deep Learning Practitioner and a Kaggle Competition Expert (Ranked #72). He is actively working on the application of Deep Learning in the Medical Domain.
About the Series:
I have very recently started making some progress with my Self-Taught Machine Learning Journey. But to be honest, it wouldn’t be possible at all without the amazing community online and the great people that have helped me.
In this Series of Blog Posts, I talk with People that have really inspired me and whom I look up to as my role-models.
The motivation behind doing this is that you might see some patterns and, hopefully, learn from the amazing people that I have had the chance of learning from.
Sanyam Bhutani: Hello Alexandre, Thank you so much for taking the time to do this interview.
Dr. Alexandre Cadrin-Chenevert: Hi Sanyam! Thank you very much for your kind invitation.
Sanyam Bhutani: Today you’re working as a radiologist and are actively working on ML Research focused on your domain.
Can you tell us when Deep Learning first came into the picture? What got you interested in Deep Learning at first?
Dr. Alexandre Cadrin-Chenevert: Back in 2016, after 7 years of clinical practice in radiology, I read some information about a startup company based in San Francisco named Enlitic. They were using deep learning algorithms to improve the quality and accessibility of medical imaging services. I got even more curious when I realized that the founder Jeremy Howard was working with Rachel Thomas on the first iteration of a free online deep learning course named fast.ai.
During my early clinical years in a public and well-funded healthcare system in Canada, I observed a significant divergence between the growing needs and the resources allocated for healthcare services in the aging population. That was particularly true in radiology. In this context, the opportunity offered by deep learning applied to computer vision was absolutely compelling in my field. From a worldwide perspective, this opportunity was even more promising as a way to balance the accessibility and universality of healthcare services.
Almost everything in my life reasonably suggested that I stay with a high-level perspective on deep learning. But, instead of approaching this subject on the surface, I decided to wake up the young software developer that I was almost 20 years ago and I dove deeply into the field, mostly on the computer vision side. Online courses (fast.ai, Coursera) and open science (arXiv, Kaggle, open access articles) were essential to make this self-learning possible after work hours.
Sanyam Bhutani: You’re working on an amazing intersection of Radiology, Healthcare, and Deep Learning.
Could you tell us more about your Research?
Dr. Alexandre Cadrin-Chenevert: The research community around deep learning in radiology is mainly formed by the union of 3 different research communities: clinical radiology, medical imaging analysis, and deep learning communities. Learning algorithms force all these groups to collaborate to develop useful and meaningful solutions to clinical problems.
The clinical radiology community is currently discovering tons of supervised learning classification applications, mostly in subspecialty silos. The medical imaging analysis community has a strong background in segmentation tasks and is massively moving to fully convolutional encoder-decoder networks, with U-Net as a very popular and effective instance. The deep learning community in computer vision is definitely more focused on unsupervised learning algorithms.
Consequently, my main research interests are at the intersection of these research communities.
First, I have a strong interest in creating publicly available and strongly labeled datasets. Radiologists can lead this path by defining useful clinical tasks and curating large datasets available to the research community. For example, I have total respect for Ron Summers and his team at NIH, who created the ChestXray public dataset with more than 112,000 labeled images. Stanford is also paving the way with different large labeled datasets (bone age, upper extremity radiographs). There are many more initiatives of this kind, including the recent RSNA Pneumonia detection dataset. Of course, ethical requirements, like strong de-identification and data security, are challenging to meet.
I also have a special interest in object detection tasks. Convolutional neural networks are basically mapping spatial information into semantic information. I consider the main computer vision tasks (classification, detection, and segmentation) as different operating points in this spatial-to-semantic continuum. In my opinion, for supervised learning algorithms in medical imaging, the object detection task is the clinically useful sweet spot between spatial and semantic information. Classification tasks, even if very useful, lack some of the objective spatial interpretability needed by radiologists. There are many known solutions to the interpretability issue with classification networks, but most of them are subjective. Object detection with a bounding box offers implicit and objectively measurable interpretability, which is a great benefit. On the other hand, the segmentation task offers the best spatial interpretability, but it needs a lot more resources to label images and comes with a tradeoff on semantic information.
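To make the "objectively measurable" point concrete: a predicted bounding box is commonly scored against an annotated box with intersection over union (IoU). The following is a minimal Python sketch and not code from the interview. The box format `(x_min, y_min, x_max, y_max)` and the example coordinates are assumptions chosen for illustration.

```python
from typing import Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; inverted or empty boxes count as zero."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    # The overlap region is bounded by the innermost edges of the two boxes.
    overlap = (
        max(pred[0], truth[0]),
        max(pred[1], truth[1]),
        min(pred[2], truth[2]),
        min(pred[3], truth[3]),
    )
    inter = box_area(overlap)
    union = box_area(pred) + box_area(truth) - inter
    return inter / union if union > 0 else 0.0


predicted = (10.0, 10.0, 50.0, 50.0)  # hypothetical model output
annotated = (30.0, 30.0, 70.0, 70.0)  # hypothetical radiologist annotation
print(round(iou(predicted, annotated), 4))
```

A score of 1.0 means the two boxes coincide and 0.0 means they do not overlap, so a reader can check a localization against a number rather than against a visual impression.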
Finally, I also have an interest in unsupervised learning algorithms in medical imaging. Normal anatomy in humans has a relatively low statistical variance. On the other hand, pathology frequently has high variance with very different visual representations. The cost of strong labeling in medical imaging is still relatively high, which is a limiting factor for implementing some supervised learning algorithms. Consequently, there is definitely a significant place for anomaly detection with unsupervised learning algorithms.
Sanyam Bhutani: As a Domain Expert, what are your thoughts on using an ML Algorithm in an application where a Human Life is under consideration?
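The low-variance argument can be illustrated with a deliberately simple stand-in for anomaly detection: describe what "normal" looks like using unlabeled normal cases only, then flag anything that falls far outside that range. The Python sketch below is an assumption-laden toy and not a method described in the interview. It uses one hypothetical scalar score per image and a z-score threshold, whereas a real system would more likely work on learned image representations.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def fit_normal_model(normal_scores: Sequence[float]) -> Tuple[float, float]:
    """Summarize normal cases by their mean and sample standard deviation."""
    return mean(normal_scores), stdev(normal_scores)


def anomaly_flags(
    scores: Sequence[float],
    center: float,
    spread: float,
    threshold: float = 3.0,
) -> List[bool]:
    """Flag every score lying more than `threshold` deviations from the center."""
    if spread <= 0:
        raise ValueError("spread must be positive")
    return [abs(score - center) / spread > threshold for score in scores]


# Hypothetical per-image scores from normal studies only (no pathology labels).
normal_scores = [0.98, 1.02, 1.00, 0.99, 1.01, 1.03, 0.97]
center, spread = fit_normal_model(normal_scores)

# Hypothetical new studies: the first is close to normal, the others are not.
new_scores = [1.01, 1.40, 0.55]
print(anomaly_flags(new_scores, center, spread))
```

No pathology labels are used at any point. The tighter the normal cases cluster, the smaller the spread and the easier it is for an unusual case to stand out.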
How can we ensure that we’re building the right tools or what steps should we take to ensure the Technology works in a positive manner?
Dr. Alexandre Cadrin-Chenevert: For the available resources, most societies want the best health outcome for their population.
When accessibility is not a huge problem, you basically want to improve the quality of health services to improve the outcome. There is an incredible potential to improve the quality of health services by combining the work of a medical expert with the output of an ML algorithm. For example, in radiology just lowering the variability of human interpretation can be a significant beneficial step.
When accessibility is the main bottleneck, then ML algorithms can potentially help to triage the population into different levels of health priority. This ultimately can improve accessibility and outcome of your population with the same amount of human and financial resources.
Of course, there is a significant statistical trap with machine learning algorithms. And this trap is even wider with deep learning algorithms. Data scientists usually present performance metrics of a trained algorithm based on an unseen test dataset. This is the usual proof that an algorithm's performance generalizes. But in healthcare, in clinical practice, there are tons of non-controlled and hidden variables in the data that can be very different from the test data distribution. These variables are related to genetics, gender, age, risk factors, symptoms, types of acquisition machines and many more. This induces a bias between the test and the clinical data. And this bias is frequently not directly measurable. But this bias in clinical data can generate very large variations in the performance of the ML algorithm.
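A small worked example shows how a hidden variable can move a headline metric even when the model itself is unchanged. The Python sketch below treats overall accuracy as a weighted average of per-subgroup accuracies. The subgroup names, accuracies, and case mixes are invented for illustration and do not come from the interview or from any real study.

```python
from typing import Dict


def overall_accuracy(
    accuracy_by_group: Dict[str, float],
    share_by_group: Dict[str, float],
) -> float:
    """Overall accuracy as the case-mix-weighted average of subgroup accuracies."""
    if abs(sum(share_by_group.values()) - 1.0) > 1e-9:
        raise ValueError("subgroup shares must sum to 1")
    return sum(
        accuracy_by_group[group] * share
        for group, share in share_by_group.items()
    )


# Invented numbers: the same model is stronger on images from one scanner type.
accuracy_by_scanner = {"scanner_a": 0.95, "scanner_b": 0.80}

test_mix = {"scanner_a": 0.9, "scanner_b": 0.1}    # mix in the published test set
clinic_mix = {"scanner_a": 0.3, "scanner_b": 0.7}  # mix at a hypothetical clinic

print(round(overall_accuracy(accuracy_by_scanner, test_mix), 3))
print(round(overall_accuracy(accuracy_by_scanner, clinic_mix), 3))
```

With these made-up numbers the reported figure is about 0.935, while the same model would reach only about 0.845 at the clinic, purely because of a different scanner mix. In practice the subgroup variable is often unknown, which is why the gap is hard to measure directly.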
So, I really recommend that anyone who considers using an ML/DL algorithm in clinical practice strongly validate the reported performance on their own data. This validation process can be very time consuming and difficult, because most clinical institutions are not organized to do this kind of validation. But I am definitely convinced of this need; I would recommend explicit local validation even if an algorithm is already cleared by national regulators.
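In its simplest form, local validation means scoring the algorithm's outputs against locally confirmed labels and comparing the result with the reported figures. The Python sketch below computes sensitivity and specificity from binary labels and predictions. The label lists and the reported sensitivity are hypothetical placeholders, and a real validation would need far more cases plus confidence intervals.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int],
    predictions: Sequence[int],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = disease."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical local cases: confirmed labels and the algorithm's outputs.
local_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
local_predictions = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

reported_sensitivity = 0.90  # placeholder for a vendor-reported figure

sensitivity, specificity = sensitivity_specificity(local_labels, local_predictions)
print(f"local sensitivity: {sensitivity:.2f} (reported: {reported_sensitivity:.2f})")
print(f"local specificity: {specificity:.2f}")
```

A large gap between the local and reported numbers is a signal to investigate before relying on the algorithm, whatever its regulatory status.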
Sanyam Bhutani: Congratulations on your recent Kaggle win in the RSNA Pneumonia challenge.
Could you tell us about your Kaggle journey?
What are your thoughts about using Kaggle as a testbed to enhance your DL skillset?
Dr. Alexandre Cadrin-Chenevert: Thanks Sanyam. My teammate Ian Pan and I learned so much during this intense Kaggle challenge.
Participating in Kaggle competitions was of paramount importance in my learning curve of deep learning. Maturation of almost any learning process is based on the transformation of knowledge into skills, or concepts into actions. Each competition is an opportunity to evolve in this direction. But, naturally, the tradeoff is the time you invest in these competitions. So my own approach was to participate in a limited number of computer vision competitions, selected to efficiently capture most of the potential benefit.
It is an ideal scenario to apply the needed iterative process of trying experiments by yourself to solve the problem, and then coming back to the forums to learn from the kernels and the forum posts. If you discover an interesting kernel, then try to deeply understand what it is doing and why it was written that way. Some of these kernels are hidden gems and you need to meticulously unpack them to discover the value.
Sanyam Bhutani: I’m also honored to be a fellow student of yours in the fast.ai community.
Could you share your fast.ai experience? What are your thoughts about using the new library (v3) in Medical research?
Dr. Alexandre Cadrin-Chenevert: It is fascinating to see how the fast.ai students are tightly bound together. We are like a big international family. The democratization of deep learning is an important social mission and we are all part of it.
Since 2016, the different versions of the fast.ai courses changed significantly and the fast.ai library was written from scratch. In my opinion, that blog post from Jeremy was historic in the transformation.
But the fundamental concept was always to allow learning with a top-down approach, which is a specific characteristic of the course and the library. It allows usage of very powerful deep learning tools and techniques with a minimum amount of code. Consequently, beyond the teaching benefit, the fast.ai library is an amazing fast-prototyping library to rapidly train deep learning models with a reasonable amount of computing resources. This high-level concept of fast.ai over PyTorch is similar to Keras over TensorFlow.
In medical research, the number of applications of deep learning is huge. But for all these potential applications, it is not always possible to bring together an experienced team of data scientists with an experienced medical research team. The fast.ai course and library can definitely help to lower the bar for doing medical research. That is why I recommend the course to all the tech-savvy radiologists and physicians who are interested in learning more about deep learning. I hear many stories of radiologists who are learning Python to follow fast.ai and turn these concepts into actions. And if someone has an interesting idea for a medical application but no coding experience, I still recommend following the course with a friend who codes. Just a pair formed by a computer scientist and a physician is enough to start medical research, with the fast.ai course and library as a communication catalyst.
Sanyam Bhutani: What developments are you most excited about in the ML and Medical Research intersection?
Dr. Alexandre Cadrin-Chenevert: Beyond the predictive tasks of machine learning, I’m really excited by new statistical correlations not directly visible by the human eye that are found by machine learning algorithms. These new correlations in the data will push us to find new causal links to better understand and explain the observations.
Merging genetic and imaging databases is particularly promising and will most likely allow us in a not so distant future to find tons of new anatomical, physiological and pathological correlations to better understand the evolution of the human body.
Sanyam Bhutani: What is your best advice for a non-domain expert looking to apply ML to medical studies?
Dr. Alexandre Cadrin-Chenevert: Don’t work in silos. Applying machine learning to a medical problem is a multi-disciplinary challenge. Communicate and share your ideas with clinical experts. They will add invaluable insights and will save you precious time.
Sanyam Bhutani: How do you stay up to date with the cutting edge?
Dr. Alexandre Cadrin-Chenevert: The quantity of research in deep learning and reinforcement learning is growing exponentially. So it is not humanly possible to follow everything in all sub-areas of research. Choosing a specific interest is therefore the first step to keep some control.
Then, you need to find some structured sources to be informed of the most important advances related to your interest. In my case, I’m mostly using arxiv-sanity, Twitter, Kaggle, best conference paper awards, fast.ai forums, and medical journals to keep pace. Eventually, your own filters will become sharper at separating the important information from the background noise.
Sanyam Bhutani: Before we conclude, any advice for beginners who, even though they are excited about the field, feel overwhelmed to even get started with Deep Learning?
Dr. Alexandre Cadrin-Chenevert: To conclude, I’ll cite this guy, named Steve Jobs, who is a lot better than me at giving advice:
“You’ve got to find what you love. And that is as true for your work as it is for your lovers. Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking. Don’t settle. As with all matters of the heart, you’ll know when you find it. And, like any great relationship, it just gets better and better as the years roll on. So keep looking until you find it. Don’t settle.”
TL;DR Don’t settle.
Sanyam Bhutani: Thank you so much for doing this interview.
If you’re interested in reading about Deep Learning and Computer Vision news, you can check out my newsletter here.