We are studying the emerging discipline of Machine Learning Engineering by investigating best practices for developing software systems that include ML components. In this article, we share the research motivation and approach, some initial results, and an invitation to help us by taking our 7-minute online survey on ML Engineering best practices.

(Photo by Franck V. on Unsplash)

Engineering Machines that Learn

Machine Learning is key to the new wave of AI

Artificial Intelligence (AI) is undeniably experiencing a new wave of attention, energy, and sky-high expectations. This wave is driven by the abundance of data that is generated in our connected, digital society, and by the low-barrier availability of enormous computational resources.

Among the various AI techniques, Machine Learning (ML) in particular has come to play a key role.

The current surge of Artificial Intelligence is driven by Machine Learning, as indicated by relative interest in search terms according to Google Trends.

Learning complex behavior from examples

Machine Learning allows us to solve complex problems, not by arduously writing new code, but by letting an existing algorithm learn new behavior from examples. We are now witnessing breakthrough results in image recognition, speech processing, medical diagnostics, securities trading, autonomous driving, product design and manufacturing, and much more.

Does Machine Learning replace programming?

Does the rapid ascent of Machine Learning mean that software systems will no longer need to be programmed? Will we need data scientists instead of software developers?

To those who have experienced software-related project delays, system outages, and indefinitely incomplete feature sets, a world without programmers might seem attractive.

Does Machine Learning require programming?

But not so fast. There are several reasons why Machine Learning will not replace programming, but rather make the software engineering discipline even richer and more complex.

ML algorithms are themselves software that needs to be developed, tested, and maintained.
Using an ML algorithm requires programming, for the tasks of ingesting, cleaning, merging, and enhancing data, for feeding the data into the ML algorithm, for running repeated training experiments to generate, evaluate, and optimize an ML model, and for testing, integrating, deploying, and operating ML models in production systems.
Trained ML models are just one building block in the construction of complex software systems.

So, what is different?

Still, there are specific characteristics of Machine Learning that challenge traditional software development practices. The amount of data to manage is typically much larger for applications that involve Machine Learning components. The development process tends to involve more rapid-cycle experimentation, where alternative solutions are routinely attempted, compared, and discarded. And the level of inherent uncertainty in the final product is higher.

Emergence of the Machine Learning Engineer. Relative interest in search terms according to Google Trends.

ML Engineering

Around the globe, numerous organizations are learning step by step how to develop software systems that include ML components. With an increasing number of people self-identifying as ML Engineers, the discipline of Machine Learning Engineering is emerging. This raises interesting questions:

Is ML Engineering distinct from Software Engineering? Or is one a sub-discipline of the other?
Do established Software Engineering best practices apply equally when building software systems with ML components? Or do these best practices need to be modified or replaced?
Can a canonical set of ML Engineering best practices be identified to guide practitioners and educate newcomers?

Investigating ML engineering practices

To investigate these questions, researchers in the fields of Software Engineering and Machine Learning have teamed up.

We have started with an extensive review of both scientific and popular literature, to identify which practices are described and recommended by practitioners and researchers. These practices range from data management (e.g. how to deal with storage and versioning of large data sets), through model training (e.g. how to run and evaluate training experiments), to operations (e.g. how to deploy and monitor trained models).

Aspects of ML Engineering organized into groups of practices.

Surveying the adoption of ML Engineering practices

We then embedded the identified practices in a survey among representatives of teams that build software with ML components. This survey is currently in progress and open to new participants (see below). At the time of writing, about 200 teams have participated. Early results show that larger teams tend to adopt more engineering practices.

Early results of our global survey on the adoption of engineering practices by Machine Learning teams. Larger teams tend to adopt more practices.

Also, early results tell us that some practices are widely adopted, and can be considered basic, while other practices are only applied by more experienced teams in larger organizations, and can be considered advanced.

An example of a more advanced practice is the use of so-called automated machine learning techniques, where teams are able to do model selection and hyper-parameter optimization in an automated way. Early survey results indicate that these techniques enjoy much stronger adoption in tech companies and (academic) research labs than in non-tech companies and government.

Early results of our global survey on the adoption of engineering practices by Machine Learning teams. Teams in tech companies, universities, and non-commercial research labs tend to make much more use of automated machine learning techniques than teams in non-tech companies and governmental organizations.
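
To make the idea of automated model selection and hyper-parameter optimization concrete, here is a minimal sketch in plain Python. It is not taken from the survey or from any particular AutoML tool: the toy data set, the two candidate models, and all function names (`knn_predict`, `threshold_predict`, `auto_select`) are assumptions made up for illustration. The sketch tries every combination of model and hyper-parameter value and keeps the one with the best validation accuracy.

```python
from itertools import product

# Toy 1-D data, made up for illustration: the label is 1 when x > 5.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
valid = [(2.5, 0), (4.5, 0), (5.5, 1), (7.5, 1)]


def knn_predict(x, data, k):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(data, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def threshold_predict(x, data, threshold):
    """Predict 1 when x exceeds a fixed threshold (training data is unused)."""
    return 1 if x > threshold else 0


def accuracy(predict, params, train_data, valid_data):
    """Fraction of validation points that the model labels correctly."""
    correct = sum(predict(x, train_data, **params) == y for x, y in valid_data)
    return correct / len(valid_data)


# Search space: each candidate model with a grid of hyper-parameter values.
search_space = {
    "knn": (knn_predict, {"k": [1, 3, 5]}),
    "threshold": (threshold_predict, {"threshold": [2.0, 5.0, 8.0]}),
}


def auto_select(space, train_data, valid_data):
    """Try every model/hyper-parameter combination and return the best one."""
    best = None
    for name, (predict, grid) in space.items():
        keys = list(grid)
        for values in product(*(grid[key] for key in keys)):
            params = dict(zip(keys, values))
            score = accuracy(predict, params, train_data, valid_data)
            # Keep the first combination that reaches the highest score.
            if best is None or score > best[0]:
                best = (score, name, params)
    return best


best_score, best_model, best_params = auto_select(search_space, train, valid)
print(best_model, best_params, best_score)
```

Real automated machine learning tools use far smarter search strategies than this exhaustive grid, but the loop has the same shape: a search space, an evaluation metric, and an automated selection step.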

Towards an ML Engineering best-practice catalogue

We are using the results of our survey to organize the best practices into a comprehensive catalogue. In the catalogue, each ML engineering practice is recorded in a uniform structure, much like design patterns and refactorings have been catalogued in the past.

Elements of the structure include the intent and motivation of the practice, its applicability in various contexts, the interdependencies with other practices, and a short and actionable description of how to apply the practice. We also provide references to literature and supporting tools.
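
As a rough sketch of what such a uniform structure could look like as a data type, consider the Python record below. The class name `PracticeEntry`, its field names, and the example entry are assumptions for illustration only. They mirror the elements listed above and are not the actual schema or content of the catalogue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PracticeEntry:
    """One catalogue entry. All field names are illustrative assumptions."""
    name: str
    intent: str
    motivation: str
    applicability: str
    description: str
    related_practices: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """Return a one-line summary: the practice name and its intent."""
        return f"{self.name}: {self.intent}"


# Placeholder entry, written for this sketch and not taken from the catalogue.
example = PracticeEntry(
    name="Version large data sets",
    intent="Make training experiments reproducible",
    motivation="Models depend on the exact data they were trained on",
    applicability="Teams that retrain models on changing data",
    description="Store each data set snapshot under a unique version identifier",
    related_practices=["Run and evaluate training experiments"],
)

print(example.summary())
```

Recording every practice in the same shape makes entries easy to compare, cross-reference, and render in a consistent way.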

Using the survey results, we are also able to quantify the difficulty of each practice. This lets us sort the practices into difficulty levels from basic to advanced, which helps teams prioritize which practices to adopt first.
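
One simple way to approximate such a ranking is sketched below. It assumes that a practice adopted by fewer teams is harder to adopt. The article does not say how difficulty is actually computed, so this proxy, the function names, and the placeholder responses are all assumptions for illustration.

```python
# Hypothetical survey responses: one set of adopted practices per team.
responses = [
    {"practice_a", "practice_b"},
    {"practice_a"},
    {"practice_a", "practice_b", "practice_c"},
    {"practice_a", "practice_b"},
]


def adoption_rates(team_responses):
    """Return the fraction of teams that adopted each practice."""
    counts = {}
    for adopted in team_responses:
        for practice in adopted:
            counts[practice] = counts.get(practice, 0) + 1
    total = len(team_responses)
    return {practice: count / total for practice, count in counts.items()}


def rank_by_difficulty(rates):
    """Order practices from most adopted (assumed basic) to least adopted
    (assumed advanced). Ties are broken alphabetically."""
    return sorted(rates, key=lambda practice: (-rates[practice], practice))


rates = adoption_rates(responses)
ranking = rank_by_difficulty(rates)
print(ranking)
```

A fuller analysis would probably also account for team size, experience, and organization type, since the early results suggest that adoption varies along those dimensions.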

Our ultimate objective is that the resulting catalogue will help the formation and effectiveness of ML Engineering teams, not only in the larger tech companies where ML Engineering already enjoys strong adoption, but also in smaller and non-tech organizations.

Take the survey!

If you are part of a team that builds software that includes Machine Learning components, please help us by taking our survey.

Take the survey: https://se-ml.github.io/survey/

Joost Visser is professor of Software and Data Science at Leiden University. Previously, Joost held various leadership positions at the Software Improvement Group. He is the author of numerous publications on software quality and related topics.

Joint work with Alex Serban, Holger Hoos, and Koen van der Blom. For more information, consult the SE4ML project website. An earlier version of this article was published in Bits & Chips.

On the Relevance of Software Engineering for the Development of ML based Software Systems
