An ever-increasing number of organizations are developing applications that involve machine learning components. The complexity and diversity of these applications call for software engineering techniques to ensure that they are built in a robust and future-proof manner.
Do traditional software engineering practices apply equally to the development of applications with ML components, or do these practices need to be adapted to cater for particular ML characteristics?
To investigate this question, we have reviewed both scientific literature and popular publications to identify software engineering best practices that are recommended or used by teams that develop machine learning applications.
Before diving into the best practices that we identified, let me share some early observations about the literature we collected. For more details, you can take a look at the awesome reading list that we compiled.
Throughout the collected literature, we identified about 40 best practices. These concern, for instance, data versioning, feature testing, model deployment, and model performance measurement.
We are currently writing clear descriptions of these best practices, documenting their literature sources, and organizing them into development process stages.
We are also investigating to what extent these best practices are actually adopted by practitioners. For this purpose, we are engaging with practitioners through interviews as well as an online survey. This will allow us not only to measure adoption, but also to identify common challenges for practitioners and researchers in the field.
Help us
If you are involved in developing a machine learning application, you can help us by taking the survey and inviting your colleagues and friends to do the same.
Take the 7-minute survey!
As our research progresses, we will be sharing our results right here.
Joost Visser is professor of Software and Data Science at Leiden University. Previously, Joost held various leadership positions at the Software Improvement Group. He is the author of numerous publications on software quality and related topics.
Joint work with Alex Serban, Holger Hoos, and Koen van der Blom.
Survey: take me straight to the survey
Read: the awesome list
Project: about our research
Previously published at https://medium.com/@jstvssr/software-engineering-best-practices-for-machine-learning-9e51237e3e1