Docker changed the software development game by packaging an application and its dependencies together. It largely eliminated the pain of onboarding developers and deploying applications. But there was an upside to the old way where an oil-covered sysadmin manually built an environment for an application. As long as nothing was touched, it would run until the server gave out.
Today, it's quite common for an application to be built and deployed many, many times a day. If dependencies for your application are unstable, you have to be careful. Say your app depends on something like Joe's gcsfuse library for "mounting and accessing Cloud Storage buckets as local file systems."
This might be pretty crucial to the functioning of your application. It lives in a vendor repo on the internet and might get installed in your Dockerfile something like this:
FROM python:3.9-buster

RUN set -e; \
    apt-get update -y && apt-get install -y \
      tini \
      lsb-release; \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s`; \
    echo "deb http://packages.eat.at.joes.com/apt $gcsFuseRepo main" | \
      tee /etc/apt/sources.list.d/gcsfuse.list; \
    curl https://packages.eat.at.joes.com/apt/doc/apt-key.gpg | \
      apt-key add -; \
    apt-get update; \
    apt-get install -y gcsfuse \
    && apt-get clean
Towards the bottom of the Dockerfile, you might install your app/language-specific dependencies and then specify the entry point into your container.
RUN pip install --no-cache-dir -r requirements.txt
RUN chmod +x /app/cloud_driver.sh
CMD ["/app/cloud_driver.sh"]
As long as you don't edit the top part of the file, Docker reuses the cached layers that live on your local file system. And as long as the cache is intact, Docker will never pull from https://packages.eat.at.joes.com.
But if the cache is cleared or you run the build process on another machine, that repository must be available. If it's not, YOU ARE DEAD IN THE WATER!
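You can simulate this failure mode locally before it bites you. This is a sketch that assumes Docker is installed and the Dockerfile above is in the current directory; `my-app:test` is a placeholder tag:

```shell
# Force Docker to rebuild every layer from scratch, ignoring the local cache.
# If the vendor repo is unreachable, this build fails -- exactly what a fresh
# CI runner or a new teammate's machine would hit.
docker build --no-cache -t my-app:test .
```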
In fact, I wrote this up after Google's gcsfuse repository went offline. A bunch of folks complained on GitHub, and Google eventually provided a workaround.
Still, my team was blocked for several days, and someone had to edit the Dockerfile to implement the fix. If we had been dealing with eat.at.joes.com, who knows what might have happened.
Read on if it's crucial that you can always build your containerized application.
My preferred solution is to create two Docker images: one that installs the unstable dependencies, and another containing your app, which uses the first as a base.
So:
FROM python:3.9-buster

RUN set -e; \
    apt-get update -y && apt-get install -y \
      tini \
      lsb-release; \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s`; \
    echo "deb http://packages.cloud.google.com/apt $gcsFuseRepo main" | \
      tee /etc/apt/sources.list.d/gcsfuse.list; \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
      apt-key add -; \
    apt-get update; \
    apt-get install -y gcsfuse \
    && apt-get clean
Which you then push to a container repository…
docker build -t my-intermediary-image:latest .
docker push my-intermediary-image:latest
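In practice, you'll usually qualify the tag with your registry's hostname so the push lands where you expect; `registry.example.com` below is a placeholder for whatever registry you use:

```shell
# Hypothetical registry host -- substitute your own.
docker build -t registry.example.com/my-intermediary-image:latest .
docker push registry.example.com/my-intermediary-image:latest
```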
Base your application Dockerfile on the intermediate one:
FROM my-intermediary-image:latest
RUN pip install --no-cache-dir -r requirements.txt
RUN chmod +x /app/cloud_driver.sh
CMD ["/app/cloud_driver.sh"]
my-intermediary-image is now effectively immutable, and you'll be able to build your application almost anywhere.
You can build and tag intermediate containers whenever there's a new and better version of gcsfuse. If you don't like it, you can simply build your app with a previous version.
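One way to get that fall-back behavior is to tag each intermediate build with the version of gcsfuse it contains. A sketch, with illustrative version numbers:

```shell
# Tag the intermediate image with the gcsfuse version it contains
# (the version number here is illustrative).
docker build -t my-intermediary-image:gcsfuse-0.41.0 .
docker push my-intermediary-image:gcsfuse-0.41.0
```

Your application Dockerfile can then pin a known-good version with `FROM my-intermediary-image:gcsfuse-0.41.0`, and rolling back is a one-line change.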
Yes, CI/CD providers can cache your Docker layers between builds. But it's often a premium feature, and in all cases you'll have to make sure the vendor-specific tooling is set up to support it.
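If you do go the layer-caching route, a sketch using BuildKit's registry cache backend looks like this (assumes docker buildx is available and you control a registry; the refs are placeholders):

```shell
# Pull cached layers from, and push fresh ones to, a registry you control.
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/my-app:buildcache \
  --cache-to type=registry,ref=registry.example.com/my-app:buildcache,mode=max \
  -t registry.example.com/my-app:latest .
```

Note that even with this in place, a cache miss still means pulling from the vendor repo, so it mitigates the problem rather than eliminating it.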
So go check your CI/CD setup! And if it's crucial for you to be able to build your app anywhere, use an intermediate image.
There are other solutions, like hosting the third-party code in an artifact repository such as AWS S3, Nexus, or Artifactory. If you're already using such services, they might be a good solution for you, though they will complicate your Dockerfile.
At an operating system level, you could mirror or proxy the repo using tools like apt-mirror or apt-cacher-ng.
These methods require you to come up with a method for updating the third-party code. In many cases, you'll also have to implement versioning if you want the fall-back behavior you get out of the box with the intermediate container.
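For the apt-mirror route, a minimal /etc/apt/mirror.list might look like this. This is a sketch: the base path is apt-mirror's default, and the repo line is an assumption based on the gcsfuse example above (buster suite, main component):

```
set base_path /var/spool/apt-mirror
deb http://packages.cloud.google.com/apt gcsfuse-buster main
clean http://packages.cloud.google.com/apt
```

You'd then point the sources.list entry in your Dockerfile at your mirror's host instead of packages.cloud.google.com.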