Researchers Develop AI to Spot Early Signs of Cerebral Palsy in Infants

Written by yuliabusygina | Published 2025/11/18
Tech Story Tags: medical-ai | computer-vision-applications | medical-image-analysis | medical-imaging | brain-science | neuroscience-and-ai | infant-brain-development | hackernoon-top-story

TL;DR: Researchers at Saint Petersburg State Pediatric Medical University and Yandex Cloud developed an AI solution for assessing infant brain development from MRI scans. The solution acts as a decision-support tool, reducing MRI analysis time from several days to just minutes.

Brain MRIs play a vital role in diagnosing serious health conditions in infants, from tumors to neurodegenerative diseases. Detecting abnormal brain development during infancy can guide early interventions that may prevent or reduce the impact of conditions like cerebral palsy. However, scanning such young patients carries risks, as the procedure requires general anesthesia. That's why doctors need tools that speed up diagnosis, reduce risks, and make timely, informed decisions.


Researchers at the Saint Petersburg State Pediatric Medical University partnered with the Yandex School of Data Analysis (SDA) and the Yandex Cloud Center for Technologies and Society to develop an AI solution for assessing infant brain development from MRI scans. For suspected cases of cerebral palsy and other central nervous system disorders, the solution acts as a decision-support tool, reducing MRI analysis time from several days to just minutes.


My name is Yulia Busygina, and I'm the project lead at Yandex Cloud. Together with Professor Alexander Pozdnyakov, I'll take you behind the scenes and share how we designed the AI solution, trained the model, and tested it in real-world scenarios. To learn more about the project, check out our GitHub.

Why MRI scans are critical for infants

An infant's brain develops at an incredible pace, changing almost week by week during the first year of life. But it’s not just about getting bigger — the brain is also going through critical processes collectively known as cerebral development. One of the most important of these is myelination.

Myelination is the formation of a lipid-rich sheath around nerve fibers, which increases lipid content and reduces water. This process begins in the fifth month of fetal development and continues at full pace until about age two. In the central nervous system, myelin is found mainly in white matter, where it acts as an electrical insulator.

Healthy myelination allows rapid, reliable communication between neurons in later years. If the brain develops more slowly than expected, it can lead to developmental delays. The human brain is a complex system that requires careful attention from the very first days of life. Disorders can arise if brain growth is either too slow or too fast.

Moreover, the complexity goes beyond growth rate. In some conditions, the brain's volume remains unchanged while its tissue density shifts. Whether myelination is abnormally slow or excessively fast (hypermyelination), it can create conditions that lead to neurological disorders.

— Alexander Pozdnyakov, MD, Professor and Head of the Department of Medical Biophysics, Saint Petersburg State Pediatric Medical University.

Infants with abnormally slow myelination have a higher risk of developing cerebral palsy. Cerebral palsy is one of the leading causes of childhood disability, affecting 2–3 out of every 1,000 newborns. Monitoring cerebral maturation in the first six months of life can be crucial for timely intervention. For patients at risk, acting quickly with the right therapies and rehabilitation can prevent damage and halt cell death.

Some patients present with conditions that are poorly understood and difficult to classify. But even in these cases, we can anticipate risks and safely intervene in brain development.

Such interventions may include medication or brain-stimulation techniques to accelerate maturation when needed, or slow it to normal levels when it's abnormally fast.

— Alexander Pozdnyakov, MD, Professor and Head of the Department of Medical Biophysics, Saint Petersburg State Pediatric Medical University.

On MRI scans, myelinated white matter stands out clearly from areas where myelination is incomplete. In patients under 12 months, however, distinguishing white matter from gray matter is often difficult. This is important because gray matter forms the cerebral cortex, the brain's hub for cognitive processes.

When radiologists analyze MRI scans during this stage of brain development, they face two main challenges:

  1. Differentiating between white matter and gray matter.

  2. Determining the volume of gray matter and white matter.

Through radiologists’ analysis, clinicians can study how nerve cells move through white matter toward the cortex, creating the brain's neural pathways. Observing these changes over time reveals whether the cortex is thinning or thickening and if the white matter is fully developed.

For infants, an MRI is ordered by the attending physician only when there are serious clinical indications. These may include birth-related nervous system injury, brain trauma, seizures, or suspected epilepsy. Because patients under six years of age require general anesthesia to stay still during the procedure, MRIs are performed on young children only when truly necessary.

Here's how the procedure usually goes:

  1. The medical team places the patient under general anesthesia, positions them in the MRI scanner, and captures the images. The procedure typically takes about 30 minutes but can last up to 40–50 minutes.

  2. These images are then processed. A specialist can calculate the volumes of white matter and gray matter using a three-axis formula. Clinical guidelines define the timeframe for this analysis, which can take up to 72 hours in complex cases.

If it's a follow-up scan, the analysis takes even longer because the data must be evaluated against previous results from different time points.
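The article doesn't spell out the three-axis formula, but a common radiological shortcut is to approximate a structure as an ellipsoid from its three measured diameters. A minimal sketch, assuming that approximation (the function name and measurements are illustrative, not from the project):

```python
import math

def ellipsoid_volume_ml(a_mm: float, b_mm: float, c_mm: float) -> float:
    """Approximate a structure as an ellipsoid from its three measured
    diameters (in mm) and return the volume in milliliters.

    V = (pi / 6) * a * b * c  -- a standard radiological approximation.
    """
    volume_mm3 = math.pi / 6.0 * a_mm * b_mm * c_mm
    return volume_mm3 / 1000.0  # 1 mL = 1000 mm^3

# Example: a structure measuring 40 x 30 x 25 mm across its three axes
print(round(ellipsoid_volume_ml(40, 30, 25), 1))
```

Doing this by hand for every structure across dozens of slices, and again for every follow-up study, is exactly the repetitive work that takes hours.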

How AI can help

Existing methods for assessing brain myelination in children under one year old often involve subjective factors. Experienced radiologists can usually determine from the images whether the white matter volume is sufficient. In straightforward cases, 30 minutes to 2 hours of review directly at the MRI console is adequate, and AI is not needed.

The task becomes much more challenging when radiologists need to compare several studies over time. Even a single brain MRI involves reviewing multiple images (at least 22 slices). In complex cases, analyzing more than a thousand images may be necessary, making it impossible to review everything quickly.

Computer vision can help radiologists by flagging areas where changes in the contours of white matter and gray matter are most likely. This could also serve as an invaluable training aid for junior doctors and residents. For early-age scanning, such a solution can:

  • Speed up the analysis process.

  • Optimize follow-up schedules to ensure scans are performed only as often as needed, avoiding unnecessary anesthesia.

  • Enhance radiologists’ capacity to examine more patients.

At first glance, it might seem like we could simply reuse available open-source datasets and pretrained models for this purpose.

After all, similar problems have already been tackled using AI in machine learning competitions. For example, the 2019 MICCAI Grand Challenge focused on segmenting MRI images of infant brains under six months. Developers from around the world attempted to solve the challenge using the iSeg-2019 dataset.

However, annotated data was scarce. Segmentation masks identify which areas of an image correspond to gray matter or white matter, and the iSeg-2019 dataset included only 15 annotated images, while the university's six-year archive contained MRI scans from 1,500 patients with no annotations at all.

This meant our first step was preparing the data.

How to turn MRI scans into a dataset for machine learning

The Yandex Cloud team came up with a cloud-based application architecture, helped select the right tools, and assisted with configuring and testing the final web service. Guided by mentor Arseniy Zemerov, students from the Yandex School of Data Analysis handled the core ML tasks: choosing the neural network architecture, running experiments, and training the model on the annotated data. The most complex task — data annotation — was a true team effort, with expert radiologists from the Saint Petersburg State Pediatric Medical University providing critical expertise.

Here's a high-level overview of the data pipeline.


Let's look at the first stage.

Loading the raw data. MRI scans are stored in a picture archiving and communication system (PACS) designed for managing medical images in DICOM format. This system archives and processes anonymized scans, which form the core of the model's training dataset. To deploy this system, we set up a virtual machine in Yandex Compute Cloud. We uploaded anonymized MRI studies of children under 12 months from the university's archive, together with the iSeg-2019 data.

Each study is a collection of MRI images captured in different modes: T1, T2, FLAIR, and DWI. These modes highlight different tissue characteristics, helping clinicians better differentiate between various conditions (for details, see this article). To support this, the system stores additional metadata and treats multiple MRI slices as a single, unified study. Because we work only with an anonymized dataset, no personal data is stored on the server.
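In practice, DICOM headers would be read with a library such as pydicom; the sketch below uses plain dicts with real DICOM attribute names (`StudyInstanceUID`, `PatientName`) to show the two ingest steps, stripping identifying fields and grouping slices into unified studies. The records themselves are made up for illustration:

```python
from collections import defaultdict

# Stand-ins for DICOM headers; in production these fields would be read
# from the files themselves (e.g., with pydicom).
slices = [
    {"StudyInstanceUID": "1.2.840.1", "Modality": "T1", "PatientName": "DOE^J"},
    {"StudyInstanceUID": "1.2.840.1", "Modality": "T2", "PatientName": "DOE^J"},
    {"StudyInstanceUID": "1.2.840.2", "Modality": "T1", "PatientName": "ROE^A"},
]

def anonymize_and_group(records):
    """Strip personally identifiable fields, then group slices by study UID
    so multiple MRI slices are treated as a single, unified study."""
    studies = defaultdict(list)
    for rec in records:
        clean = {k: v for k, v in rec.items() if k != "PatientName"}
        studies[clean["StudyInstanceUID"]].append(clean)
    return dict(studies)

studies = anonymize_and_group(slices)
print(len(studies))  # two distinct studies
```

A real pipeline would scrub many more tags than `PatientName` (dates, institution, IDs), but the grouping logic is the same.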

Annotating the collected data. For patients over a year old, radiological brain images can be annotated using automated tools such as the open-source 3D Slicer, which calculates white matter and gray matter volumes. However, these methods are not effective for younger patients. On MRI scans of newborns, even seasoned radiologists may find it hard to distinguish white matter from gray matter, making annotation a meticulous, pixel-by-pixel task.

Initially, we planned for expert radiologists to annotate the raw data from scratch, but the process proved far too time-consuming: a single study with just a few slices could take eight hours or more. With this manual approach, the experts annotated about 30 studies, each containing three slices.

To accelerate the process, our ML specialists proposed performing pre-annotation using an open-source model called Baby Intensity-Based Segmentation Network (BIBSNet). The network is based on the nnU-Net framework and is designed to identify white matter and gray matter.

After reviewing the pre-annotation results, we found there was still plenty of room to improve many of the metrics. The inference time for a single volume was about 2.5 minutes. To accelerate the process, the team scaled up the computations:

  • The BIBSNet Docker container was adapted for parallel execution.

  • The container was deployed on 20 virtual machines, each processing data independently.

This cut the pre-annotation time for the entire dataset, making it possible to assess the algorithm's performance on it. According to our expert radiologists, pre-annotations were useful in 40% of cases, and that alone helped reduce the manual workload. Our ML specialists also benchmarked BIBSNet’s performance in segmenting gray matter (GM) and white matter (WM) on T1-weighted (sagittal) and T2-weighted (axial) MRI scans.
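The benchmark metric behind comparisons like these is typically intersection-over-union (IoU), computed per tissue class. A minimal sketch over flattened label masks (the toy labels are illustrative):

```python
def iou(pred, target, cls):
    """Intersection-over-Union for one class over flattened label masks."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else 1.0  # class absent in both: perfect

# Labels: 0 = background, 1 = gray matter (GM), 2 = white matter (WM)
pred   = [0, 1, 1, 2, 2, 0, 1, 2]
target = [0, 1, 2, 2, 2, 0, 1, 1]
print(round(iou(pred, target, 1), 3))  # GM overlap
```

On real volumes this runs over millions of voxels per study, which is why the same metric also appears later in the final evaluation table.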

As a result, pre-annotation helped us build an annotated dataset of about 750 slices. This was enough to train and evaluate machine learning models for segmentation and detection. Before running the experiments, we split the dataset into training and validation sets, using the latter to check our metrics.

Inside our model training experiments

Initially, we planned to try a more advanced architecture, the Vision Transformer. However, we soon realized that this architecture was poorly suited for healthcare purposes. The model was prone to hallucinations, which could do more harm than good.

So, we chose a segmenter built from two types of neural networks:

  1. Convolutional neural networks serve as feature extractors (backbones), adapting well to tasks beyond classification (such as segmentation).
  2. Architectures explicitly developed for medical imaging, with the U-Net as the primary choice.

The Yandex School of Data Analysis team aimed to develop a segmentation model that could be as accurate as BIBSNet but deliver a much faster inference time.

To achieve this, the students ran a series of experiments on the iSeg-2019 dataset. For the neural network architectures, they examined U-Net, U-Net++, and DeepLabV3. For the feature extraction backbones, they tested ResNet-50, ResNet-101, ResNeXt-50, ResNeXt-101, and DenseNet-161.
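Assuming the students swept the full grid (the article lists the candidates but not the pairing strategy), the experiment space can be enumerated like this:

```python
from itertools import product

architectures = ["U-Net", "U-Net++", "DeepLabV3"]
backbones = ["ResNet-50", "ResNet-101", "ResNeXt-50",
             "ResNeXt-101", "DenseNet-161"]

# Pair every architecture with every backbone: 3 x 5 = 15 candidate runs.
experiments = [f"{arch} + {bb}" for arch, bb in product(architectures, backbones)]
print(len(experiments))
```

The winning run's name in the results table, `unet_resnext50_32x4d_dice_1`, follows the naming convention of common PyTorch segmentation toolkits, where a decoder architecture is paired with a pretrained encoder backbone exactly as in this grid.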

During the experiments, the team tried several approaches:

  1. Training purely on target slices (2D):

    We trained a 2D segmentation model exclusively on manually annotated slices (the initial 30 studies).

    However, the small amount of training data limited the model's capacity to generalize.

  2. Training on combined iSeg-2019 and target data (3D):

    The model was trained on a combined dataset that included the fully annotated iSeg-2019 data and our target data.

    For slices in the target studies without annotations, we applied a mask that zeroed out their contribution to the loss function. This ensured that only annotated slices from the target data were used, preventing errors during training.
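In a real training loop this masking happens per pixel inside the loss tensor; the per-slice sketch below captures the idea in plain Python (the numbers are illustrative, not project data):

```python
def masked_mean_loss(per_slice_losses, annotated):
    """Average the loss only over annotated slices.

    Slices without ground-truth masks get weight 0, so they contribute
    nothing to the training signal.
    """
    weights = [1.0 if a else 0.0 for a in annotated]
    total = sum(w * l for w, l in zip(weights, per_slice_losses))
    denom = sum(weights)
    return total / denom if denom else 0.0

# Four slices, only the first and last are annotated.
losses = [0.8, 0.5, 0.3, 0.2]
flags  = [True, False, False, True]
print(masked_mean_loss(losses, flags))  # 0.5
```

Normalizing by the sum of weights rather than the slice count matters: otherwise a study with many unannotated slices would artificially dilute the loss.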

What we discovered

Here's what our experiments with different feature extraction backbones showed:

Experiments with the network architecture:

The best-performing experiment was training a U-Net with a ResNeXt-50 backbone using Dice loss.

Here's how the model performs on the validation set. Original study from the validation set:

Example output from the algorithm

Final metrics:

| Run | IoU (void) | IoU (GM) | IoU (WM) | IoU (mean) |
| --- | --- | --- | --- | --- |
| unet_resnext50_32x4d_dice_1 | 0.981 | 0.629 | 0.501 | 0.703 |

The inference speed of the trained neural network running on a CPU is about 3 seconds.

How it currently works

Our solution, now available on GitHub, was designed as a web service for radiologists performing MRI scans on infants. They can upload the acquired files to the service right after the procedure. The system anonymizes the uploaded data, removing all personally identifiable information, such as the patient's name, from the records.

The solution automatically identifies gray matter and white matter areas on each MRI slice, providing predictions with confidence scores.

The service is primarily morphometric, meaning it measures tissue volumes. Once processing is complete, users see the model's predicted volumes of gray matter, white matter, and cerebrospinal fluid, along with descriptions of the largest structures.
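The article doesn't publish the morphometry code, but volume estimation from a segmentation mask typically reduces to counting the voxels of each class and multiplying by the physical voxel size. A minimal sketch under that assumption (labels and voxel spacing are illustrative):

```python
def tissue_volume_ml(labels, voxel_dims_mm, cls):
    """Estimate tissue volume by counting voxels of one predicted class.

    labels: flat list of per-voxel class labels from the segmentation model
    voxel_dims_mm: (x, y, z) voxel spacing in millimeters
    """
    x, y, z = voxel_dims_mm
    voxel_mm3 = x * y * z
    count = sum(1 for v in labels if v == cls)
    return count * voxel_mm3 / 1000.0  # mm^3 -> mL

# 1 mm isotropic voxels: 200,000 gray-matter voxels -> 200 mL
labels = [1] * 200_000 + [2] * 150_000 + [0] * 50_000
print(tissue_volume_ml(labels, (1.0, 1.0, 1.0), 1))  # 200.0
```

In real DICOM data the voxel spacing comes from the `PixelSpacing` and `SliceThickness` attributes rather than being assumed isotropic.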

From the summary table, you can select a specific study to view the scan with the white matter and gray matter masks applied.

Our experiments show an accuracy of over 90%. We expect this figure to improve as we expand the dataset and continue fine-tuning the model, which was initially trained on limited data.

Looking ahead, the project's roadmap goes beyond basic segmentation. Our next step is to calculate the GM-to-WM ratio, which can provide clinicians with deeper insights.

The neural network has been tested at the Saint Petersburg State Pediatric Medical University, and the researchers are ready to share their findings with other medical institutions. The solution reduces MRI interpretation time for radiologists from several days to just a few minutes.

Once testing is complete, we plan to release the solution as open source for use in medical institutions and research projects worldwide. The solution also has significant scientific potential. Because infant brain volumes weren't previously measured at scale, no fundamental studies have yet examined changes in brain volume across large cohorts. For various conditions and pathologies, this research can help refine medical care standards.

