
CI/CD for Data Science: Automating Model Testing with Jenkins and Docker

by Bhanu Sekhar Guttikonda, June 20th, 2025
Too Long; Didn't Read

This article explains how to automate machine learning model testing using Jenkins and Docker, streamlining the CI/CD pipeline for efficient, reliable ML deployment. It includes practical code examples and diagrams.

Introduction

CI/CD (Continuous Integration/Continuous Delivery) pipelines are not just for web developers – they are crucial for data science and machine learning projects too. By automating testing and deployment, teams can ensure that models are reliable and reproducible. In this article, we focus on building a CI/CD workflow for a Python-based machine learning model, using Jenkins as the pipeline orchestrator and Docker to containerize the environment. A Git push to the repository can be configured to trigger Jenkins via a webhook, starting the pipeline automatically.
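The webhook itself is configured on the Git hosting side, but the pipeline can also declare its own trigger. As a minimal sketch, SCM polling (a core Jenkins feature) can serve as a fallback when a push webhook is not set up; the polling schedule here is just an illustration:

```groovy
pipeline {
    agent any
    // Poll the repository roughly every 5 minutes as a fallback trigger.
    // With a push webhook from the Git host, builds start immediately
    // instead of waiting for the next poll.
    triggers {
        pollSCM('H/5 * * * *')
    }
    stages {
        stage('Checkout') {
            steps { checkout scm }
        }
    }
}
```

In practice a webhook is preferred, since polling adds latency and load on the SCM server.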

The pipeline starts when a data scientist commits code and a model to the repository. Jenkins, integrated with the version control system, automatically detects the change and begins running the defined pipeline. In each run, Jenkins checks out the code, installs dependencies inside a Docker container, executes the tests, and then builds a Docker image if tests pass. This process ensures that every change is verified in a clean, reproducible environment before it moves forward.

Building a Docker-based Reproducible Environment

To maintain reproducibility, we package the ML code and its dependencies into a Docker image. A simple Dockerfile might look like this:

FROM python:3.8

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code and tests
COPY . .

# By default, run tests
CMD ["pytest", "--maxfail=1", "--disable-warnings", "-q"]

The requirements.txt might include libraries with specific versions. Building this container in the pipeline ensures that all runs use the same Python and library versions. For example, a shell command to build this image locally would be:

docker build -t my-ml-model:latest .

We specify Python 3.8 as the base image and copy the ML project files into it. The CMD at the end runs tests by default, so running docker run my-ml-model:latest would execute all pytest tests inside the container.
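For illustration, a pinned requirements.txt compatible with a Python 3.8 base image might look like the following (these versions are placeholders, not recommendations):

```
tensorflow==2.13.0
numpy==1.24.3
pytest==7.4.0
```

Pinning exact versions is what makes the image reproducible: rebuilding it months later still installs the same libraries.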

Using Docker means that test results are independent of the host machine. Team members and CI servers run code in an identical environment, which makes reproducibility much easier: if a test passes inside Docker on one machine, it should pass the same way everywhere, provided the tests themselves are deterministic. In our workflow, Jenkins will use Docker to build and even run this container as part of the pipeline. For example, Jenkins could build the image and then run a container with a command like:

docker build -t my-ml-model:$BUILD_NUMBER . && \
docker run my-ml-model:$BUILD_NUMBER

Here $BUILD_NUMBER is a Jenkins variable that tags the image with the current build number.

Writing Tests for Your ML Model

In the CI/CD workflow, automated testing is key. We write unit tests for our model. For example, using TensorFlow we might test that the model produces outputs of the correct shape:

# tests/test_model.py
import numpy as np
import tensorflow as tf
from model import get_model

def test_prediction_shape():
    model = get_model()  # your model-building function
    test_input = np.zeros((1, 10))
    output = model(test_input)
    assert output.shape == (1, 1), "Unexpected output shape"

In this snippet, get_model() returns a tf.keras model expecting 10 features. The test verifies that passing a dummy input yields an output of shape (1, 1). We can also test numeric outputs or behavior, for example asserting that output.numpy()[0][0] falls within a small tolerance of an expected value (exact floating-point equality is brittle). To make results deterministic, we can fix random seeds in TensorFlow and NumPy so that the tests are repeatable. Placing this test in a tests/ directory lets pytest find and run it automatically.
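To make the seeding and tolerance ideas concrete, here is a hedged sketch. The get_model here is a NumPy stand-in for the real TensorFlow model, used only to illustrate the testing pattern:

```python
# tests/test_determinism.py -- illustrative sketch; get_model is a toy
# stand-in for your real model-building function.
import numpy as np

def get_model(seed=42):
    # Stand-in for a real model: a fixed linear map whose weights are
    # drawn from a seeded generator, so construction is repeatable.
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(10, 1))
    return lambda x: x @ weights

def test_deterministic_output():
    # Two models built with the same seed must agree exactly.
    a = get_model(seed=42)
    b = get_model(seed=42)
    x = np.ones((1, 10))
    assert np.allclose(a(x), b(x)), "Seeded runs should match"

def test_output_close_to_expected():
    # Numeric checks should use tolerances, not exact equality.
    model = get_model(seed=42)
    out = model(np.zeros((1, 10)))
    assert out.shape == (1, 1), "Unexpected output shape"
    assert np.allclose(out, 0.0, atol=1e-8), "Zero input should give ~zero output"
```

With a real TensorFlow model, the same pattern applies: call tf.random.set_seed() and np.random.seed() at the top of the test, then compare outputs with a tolerance.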

Running the tests locally might look like:

pytest --maxfail=1 --disable-warnings -q

If any test fails, the pipeline should stop, preventing a bad model from being packaged. Jenkins will record these results. This automated testing ensures our model logic is validated on every change, catching errors early.

Integrating Jenkins Pipeline

With a Dockerized environment and tests in place, we define a Jenkins pipeline to run them automatically. In a Jenkinsfile (stored in the repository), we might write a declarative pipeline like this:

pipeline {
    agent any
    environment {
        IMAGE_NAME = "my-ml-model"
    }
    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }
        stage('Test') {
            steps {
                sh 'pytest --maxfail=1 --disable-warnings -q'
            }
        }
        stage('Build Docker') {
            steps {
                sh 'docker build -t $IMAGE_NAME:$BUILD_NUMBER .'
            }
        }
        stage('Push Docker') {
            steps {
                // Assuming Docker Hub login is configured
                sh 'docker push $IMAGE_NAME:$BUILD_NUMBER'
            }
        }
    }
    post {
        always {
            echo 'Pipeline completed.'
        }
    }
}

In this Jenkinsfile, there are four stages: Checkout (retrieve code), Test (run pytest), Build Docker (build the image), and Push Docker (push to a registry). The environment block defines variables like IMAGE_NAME. Notice how we call shell commands with sh. In a real setup, credentials (for Docker Hub or other tools) would be stored securely in Jenkins, often as encrypted secrets. This snippet shows the conceptual flow and the steps needed.
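Note that docker push needs the image name to carry a registry or Docker Hub namespace prefix, plus a login step. A hedged sketch of the push stage using the Jenkins Credentials Binding plugin (the credentialsId 'dockerhub-creds' is an assumption; the namespace comes from the bound username):

```groovy
stage('Push Docker') {
    steps {
        withCredentials([usernamePassword(credentialsId: 'dockerhub-creds',
                                          usernameVariable: 'DOCKER_USER',
                                          passwordVariable: 'DOCKER_PASS')]) {
            sh '''
                echo "$DOCKER_PASS" | docker login -u "$DOCKER_USER" --password-stdin
                # Retag with the registry namespace, then push.
                docker tag $IMAGE_NAME:$BUILD_NUMBER $DOCKER_USER/$IMAGE_NAME:$BUILD_NUMBER
                docker push $DOCKER_USER/$IMAGE_NAME:$BUILD_NUMBER
            '''
        }
    }
}
```

Using --password-stdin keeps the secret out of the process list, and the bound variables are masked in Jenkins logs.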

Listing the pipeline steps clearly:

  • Checkout code: Jenkins pulls the latest code from Git (triggered by a commit or webhook).
  • Run tests: Execute pytest in the Jenkins workspace (with dependencies installed).
  • Build image: If tests pass, build a Docker image containing the code and model.
  • Push image: Optionally, push this image to a Docker registry for deployment.

Finally, each stage can be visualized as a simple flow:

[ Git Commit ] -> [ Jenkins CI ] -> [ Tests & Validation ] -> [ Docker Build ] -> [ Registry/Deploy ]

This automated flow means that any code change triggers the pipeline to run. It catches errors early and ensures that only validated models proceed to deployment.

Conclusion

By combining Jenkins and Docker, we create a robust CI/CD pipeline for data science. Docker ensures that the exact environment (operating system, Python version, and libraries) is consistent and reproducible. Jenkins orchestrates the workflow, automatically running tests and building images on each commit. This makes model development more reliable: every change is tested in isolation and results are repeatable. Teams working with Python and TensorFlow benefit from this automation because it reduces manual steps and human error. For example, merging a new data processing feature or tuning hyperparameters will run through the same tests, keeping everyone in sync.

With CI/CD in place, teams can deliver models faster and with greater confidence. Each code change triggers the same validated sequence of tests and builds, so errors are caught immediately. Jenkins provides logs and dashboards for each run, making it easy to see what succeeded or failed. This approach scales as the project grows, helping to maintain a consistent and trustworthy deployment process. In the end, data scientists can focus more on improving model performance and less on deployment details.
