Approaching AI Model Deployment the More Efficient Way

by Modzy, September 20th, 2021


Organizations using Artificial Intelligence (AI) and Machine Learning (ML) solutions face a challenging problem: deploying these capabilities into production systems. The last component of ModelOps and MLOps pipelines is the production deployment stage. This stage occurs after a model has been trained, has reached a suitable level of performance, and is ready to make predictions against live data. See the previous posts in our ModelOps pipeline series to learn more.


So why is this a challenge? On average, it takes organizations about nine months to deploy models into production, leaving an estimated 50-90 percent of all machine learning models in the metaphorical “AI Valley of Death.”


The skills gap involved in building AI-powered solutions and software systems is one of the major drivers of the AI Valley of Death. Only a small fraction of the code in a production ML system is actual ML code; the rest is supporting infrastructure. As a result, data scientists are left facing the daunting task of building a production system without the skillset to do so. This often leads to quick and cheap solutions, which ultimately generate significant technical debt, costing organizations more in the long run.


To address this monumental need for a solution that makes data scientists’ lives easier, several AI model deployment options have surfaced in the market in recent years from the open-source community, cloud providers, software startups, and more. While these products are sufficient in some scenarios, many have significant gaps that prevent them from offering a robust solution. Namely, these tools fall short because they:


  • Support models only native to their platform
  • Require difficult setup and impose a steep learning curve
  • Do not provide out-of-the-box horizontal scaling
  • Offer limited deployment options as organizations increasingly look for multi-cloud, edge, or on-premises support
  • Lack robust monitoring features, like drift detection, explainability, and more


Modzy is built to help data scientists seamlessly deploy their models to tailored production systems without having to worry about any of these persistent challenges.

Our Solution

Modzy’s model deployment solution provides data scientists the tools to integrate the deployment stage into their existing MLOps pipeline, regardless of the tools, custom implementations, or training platforms they are using. Given the vast number of frameworks and customizations that go into data scientists’ ML processes, why should you have to deviate from the status quo? With Modzy, there are several choices to best fit your own paradigms—including a deployment tool within the Modzy interface and Modzy’s Python SDK.


The only prerequisite to leveraging Modzy’s different deployment options is that all models must be containerized to meet the open-source container specification. This specification comes in a few flavors: the gRPC protocol, the REST protocol, or the Open Model Interface, each giving data scientists the flexibility to choose what fits best for them.
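

As a rough illustration, here is a minimal sketch of what a REST-style model container’s server code could look like, assuming Flask; the route names and payload shape here are placeholders, not the actual container specification:

    # minimal_model_server.py -- illustrative sketch only; the real
    # container specification defines its own routes and schemas.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/status", methods=["GET"])
    def status():
        # Report that the model is loaded and ready for inference.
        return jsonify({"status": "ready"})

    @app.route("/run", methods=["POST"])
    def run():
        # Placeholder inference: echo the input length as a "prediction".
        payload = request.get_json(force=True)
        return jsonify({"prediction": len(payload.get("input", ""))})

    if __name__ == "__main__":
        # Listen on all interfaces so the endpoint is reachable inside a container.
        app.run(host="0.0.0.0", port=8080)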

How Modzy’s Model Deployment Tool Works

After you have containerized your model to meet one of the above specifications, save your container image as either a .tar file or push it to a Docker registry of your choice. At this point, you can begin the deployment process by selecting the “Deploy Model” option within your personal model library.
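

If you prefer to script this step, the sketch below uses the docker-py client to do both; the image tag and registry address are placeholders:

    # Sketch: export a model container image as a .tar file, or push it
    # to a registry. Assumes docker-py (pip install docker) and an image
    # already built and tagged locally as "my-model:1.0" (a placeholder).
    import docker

    client = docker.from_env()
    image = client.images.get("my-model:1.0")

    # Option 1: save the image as a .tar archive.
    with open("my-model.tar", "wb") as archive:
        for chunk in image.save(named=True):
            archive.write(chunk)

    # Option 2: push the image to a Docker registry of your choice.
    image.tag("registry.example.com/my-model", tag="1.0")
    client.images.push("registry.example.com/my-model", tag="1.0")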


Figure 1. Modzy Model Deployment Interface

Once your container image finishes uploading, follow a few simple steps to complete the full deployment:


  • Configure the hardware and memory requirements for your model container (customizable to your infrastructure)
  • Define the input and output specifications for your model (an example follows this list) so users understand how to submit inference API calls to your model and know what to expect as output
  • Document information about your model (technical details, training data, performance metrics, and whatever else you prefer to include)
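

To make the second step concrete, an input/output specification might capture input names, accepted media types, and size limits. The field names in this sketch are hypothetical, since the real fields are configured through the Modzy interface:

    # Hypothetical input/output specification for an image classifier.
    # Field names are illustrative; the actual fields are configured in
    # the Modzy deployment interface.
    model_io_spec = {
        "inputs": [{
            "name": "image",
            "accepted_media_types": ["image/jpeg", "image/png"],
            "max_size_mb": 5,
            "description": "Photo to classify",
        }],
        "outputs": [{
            "name": "results.json",
            "media_type": "application/json",
            "description": "Predicted labels with confidence scores",
        }],
    }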


Now, you can publish your model, begin running production inference, and leverage all the monitoring features Modzy offers.
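

With the model published, a client can drive inference through Modzy’s APIs. A minimal sketch with the Python SDK could look like the following; the model identifier, version, and input name are placeholders, and the method names are assumptions to verify against the SDK documentation:

    # Sketch: submit an inference job to a published model. Assumes the
    # Modzy Python SDK (pip install modzy-sdk); the model ID, version,
    # and input name are placeholders, and the job/result method
    # signatures should be confirmed against the SDK docs.
    from modzy import ApiClient

    client = ApiClient(base_url="https://app.modzy.com/api", api_key="YOUR_API_KEY")

    # Submit a text input and block until the prediction is ready.
    job = client.jobs.submit_text("model-id", "1.0.0", {"input.txt": "sample input"})
    result = client.results.block_until_complete(job, timeout=None)
    print(result)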

How the Python SDK Works

Two popular model training tools leveraged by data scientists are AWS SageMaker and MLflow. Both tools provide users an intuitive set of APIs to train, evaluate, and save ML models. While these tools are commonly leveraged for model development, their model deployment equivalents have not seen the same widespread adoption. Modzy builds automatic containerization and deployment support for both of these frameworks into its Python SDK. To begin using this capability, you need (1) the raw output of a trained SageMaker or MLflow model, and (2) the model artifacts saved in an AWS S3 bucket or Azure Blob.
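

To make these prerequisites concrete, the sketch below trains a toy scikit-learn model, saves it in MLflow’s standard layout, and stages the artifacts in an S3 bucket; the bucket name and paths are placeholders:

    # Sketch: produce MLflow model artifacts and stage them in S3,
    # satisfying the two prerequisites above. Bucket and key names are
    # placeholders. Assumes mlflow, scikit-learn, and boto3 are installed.
    import os
    import boto3
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # (1) Train a model and save it in MLflow's standard layout.
    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    mlflow.sklearn.save_model(model, "example_model")

    # (2) Upload the saved artifacts to an S3 bucket.
    s3 = boto3.client("s3")
    for root, _, files in os.walk("example_model"):
        for name in files:
            local_path = os.path.join(root, name)
            s3.upload_file(local_path, "my-model-bucket", local_path)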


Once you have met these requirements, use Modzy’s Python SDK in your preferred editor (my tip: Jupyter Notebooks work great) and follow a few steps to deploy your model to Modzy (a consolidated sketch follows the list):


  • Complete a YAML file with documentation about your model and provide any additional metadata files for the specific model type that you chose. In the case of image classification, you will need to provide a labels.json file that contains a mapping between numerical classes and human-readable labels.
  • Include credentials required to access your cloud storage blob that contains your model artifacts, specify the path to your weights file(s) and additional model resources, define your model type (e.g., Image Classification), and pass this information to Modzy’s Model Converter through the SDK.
  • Execute the converter and see your model deployed to your environment in just minutes.
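

Pulling these steps together, a deployment script might look like the sketch below. The converter call itself is shown as a commented placeholder, because its exact entry point and parameters come from Modzy’s SDK documentation; every name in that part is hypothetical:

    # Hypothetical end-to-end sketch of the SDK deployment steps above.
    # The converter call and its parameter names are placeholders, not
    # Modzy's documented API.
    import json
    import yaml  # pip install pyyaml

    # Step 1: model documentation, plus the labels.json metadata file
    # required for image classification models.
    with open("model.yaml", "w") as f:
        yaml.safe_dump({
            "name": "Example Image Classifier",
            "version": "1.0.0",
            "description": "Classifier deployed via the Python SDK",
        }, f)
    with open("labels.json", "w") as f:
        json.dump({"0": "cat", "1": "dog"}, f)  # numeric class -> label

    # Steps 2 and 3 (placeholder): hand credentials, the artifact path,
    # and the model type to the Model Converter, then execute it.
    # converter = ModelConverter(                       # hypothetical name
    #     credentials="...",                            # cloud storage access
    #     weights_path="s3://my-model-bucket/example_model/",
    #     model_type="image_classification",
    # )
    # converter.run()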


For data scientists and developers who wish to add a more programmatic approach to their MLOps pipeline, Modzy’s SDK deployment capability unlocks the potential for automation, improved efficiency, and most importantly, a significant improvement in speed to production.


Using Modzy’s AI model deployment solution makes this process easier for all data scientists and all existing MLOps pipelines. It also opens the door to the many features the Modzy platform offers once models are in production: model version management, intuitive APIs for model inferencing, standardized infrastructure management, auto-scaling for resource management, and comprehensive monitoring capabilities including drift detection, explainability, model health checks, and more. Data scientists no longer need to modify their existing training processes to deploy their models into production, and they no longer face the AI Valley of Death. With Modzy, seamless deployment is possible.