What is the bare minimum you need to build, test, and run a Java application in a Docker container?

The recipe: Create a separate Docker image for each step and optimize the way you are running it.

Introduction

I started working with Java in 1998, and for a long time, it was my main programming language. It was a long love-hate relationship.

During my work career, I wrote a lot of code in Java. Despite that fact, I don’t think Java is usually the right choice for writing microservices running in Docker containers.

But, sometimes you have to work with Java. Maybe Java is your favorite language and you do not want to learn a new one, or you have legacy code that you need to maintain, or your company decided on Java and you have no other option.

Whatever reason you have to marry Java with Docker, you had better do it properly.

In this post, I will show you how to create an effective Java-Docker build pipeline to consistently produce small, efficient, and secure Docker images.

Be careful

There are plenty of “Docker for Java developers” tutorials out there that unintentionally encourage some Docker bad practices.

For example:

- Spark and Docker tutorial
- Introducing Docker for Java Developers
- Using Java with Docker Engine
- and others …

For the current demo project, the first two tutorials took around 15 minutes to build (first build) and produced images of 1.3GB each.

Do yourself a favor and do not follow these tutorials!

What should you know about Docker?

Developers new to Docker are often tempted to think of it as just another VM. Instead, think of Docker as a “child process”. The files and packages needed for an entire VM are different from those needed by just another process running on a dev machine. Docker is even better than a child process because it allows better isolation and environmental control.

If you’re new to Docker, I suggest reading the article Understanding Docker. Docker isn’t so complex that a developer can’t understand how it works.

Dockerizing Java application

What files need to be included in a Java Application’s Docker image?

Since Docker containers are just isolated processes, your Java Docker image should only contain the files required to run your application.

What are these files?

It starts with a Java Runtime Environment (JRE). JRE is a software package that has everything required to run a Java program. It includes an implementation of the Java Virtual Machine (JVM) with an implementation of the Java Class Library.

I recommend using OpenJDK JRE. OpenJDK is licensed under GPL with Classpath Exception. The Classpath Exception part is important. This license allows using OpenJDK with any software of any license, not just the GPL. In particular, you can use OpenJDK in proprietary software without disclosing your code.

Before using Oracle’s JDK/JRE, please read the following post: “Running Java on Docker? You’re Breaking the Law.”

Since it’s rare for Java applications to be developed using only the standard library, you most likely need to also add 3rd party Java libraries. Then add the application’s compiled bytecode as plain Java Class files or packaged into JAR archives. And, if you are using native code, you will need to add corresponding native libraries/packages too.

Choosing a base Docker image for Java Application

In order to choose the base Docker image, you need to answer the following questions:

- What native packages do you need for your Java application?
- Should you choose Ubuntu or Debian as your base image?
- What is your strategy for patching security holes, including packages you are not using at all?
- Do you mind paying extra (money and time) for network traffic and storage of unused files?

Some might say: “but, if all your images share the same Docker layers, you only download them just once, right?”

That’s true in theory, but reality is often very different.

Usually, you have lots of different images: some you built lately, others a long time ago, others you pull from DockerHub. All these images do not share the same base image or version. You need to invest a lot of time to align these images to share the same base image and then keep these images up-to-date.

Some might say: “but, who cares about image size? we download them just once and run forever”.

Docker image size is actually very important. The size has an impact on …

- network latency: need to transfer Docker image over the web
- storage: need to store all these bits somewhere
- service availability and elasticity: when using a Docker scheduler, like Kubernetes, Swarm, DC/OS or other (scheduler can move containers between hosts)
- security: do you really, I mean really need the libpng package with all its CVE vulnerabilities for your Java application?
- development agility: small Docker images == faster build time and faster deployment

If you are not careful, Java Docker images tend to grow to enormous sizes. I’ve seen 3GB Java images, where the real code and required JAR libraries only take around 150MB.

Consider using the Alpine Linux image, which is only a 5MB image, as a base Docker image. Lots of “Official Docker images” have an Alpine-based flavor.

Note: Many, but not all Linux packages have versions compiled with the musl libc C runtime library. Sometimes you want to use a package that is compiled with glibc (GNU C runtime library). The frolvlad/alpine-glibc image is based on the Alpine Linux image and contains glibc to enable proprietary projects compiled against glibc (e.g. OracleJDK, Anaconda) to work on Alpine.

Choosing the right Java Application server

Frequently, you also need to expose some kind of interface to reach your Java application that runs in a Docker container.

When you deploy Java applications with Docker containers, the default Java deployment model changes.
Originally, Java server-side deployment assumed that you have already pre-configured a Java Web Server (Tomcat, WebLogic, JBoss, or other) and that you deploy a WAR (Web Archive) packaged Java application to this server and run it together with other applications deployed on the same server.

Lots of tools are developed around this concept, allowing you to update running applications without stopping the Java Application server, route traffic to the new application, resolve possible class loading conflicts and more.

With Docker-based deployments, you do not need these tools anymore; you don’t even need the fat “enterprise-ready” Java Application servers. The only thing that you need is a stable and scalable network server that can serve your API over HTTP/TCP or another protocol of your choice. Search Google for “embedded Java server” and take the one that you like most.

For this demo, I forked Spring Boot’s REST example and modified it a bit. The demo uses Spring Boot with an embedded Tomcat server. Here is my fork on GitHub repository (blog branch).

Building a Java Application Docker image

In order to run this demo, I need to create a Docker image with JRE, the compiled and packaged Java application, and all 3rd party libraries.

Here is the Dockerfile I used to build my Docker image. This demo Docker image is based on slim Alpine Linux with OpenJDK JRE and contains the application WAR file with all dependencies embedded into it. It’s just the bare minimum required to run the demo application.

    # Base Alpine Linux based image with OpenJDK JRE only
    FROM openjdk:8-jre-alpine

    # copy application WAR (with libraries inside)
    COPY target/spring-boot-*.war /app.war

    # specify default command
    CMD ["/usr/bin/java", "-jar", "-Dspring.profiles.active=test", "/app.war"]

To build the Docker image, run the following command:

    $ docker build -t blog/sbdemo:latest .
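
The COPY instruction in the Dockerfile above combines a wildcard source with a single-file destination, so the build is expected to fail when target/ contains more than one matching WAR (for example, after a version bump without a clean). A minimal pre-build check could look like the sketch below. The function name find_single_war and its directory argument are illustrative assumptions, not part of the demo project.

```shell
#!/bin/sh
# Hypothetical helper (not part of the demo project): print the path of the
# single WAR matching <dir>/spring-boot-*.war, or fail if there are 0 or 2+.
find_single_war() {
    dir=${1:-target}
    # Expand the glob into the function's positional parameters.
    set -- "$dir"/spring-boot-*.war
    # An unmatched glob stays literal, so also check that the file exists.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one spring-boot-*.war in $dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}
```

One possible use is to guard the build: `find_single_war target >/dev/null && docker build -t blog/sbdemo:latest .`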

Running the docker history command on the created Docker image will let you see all layers that make up this image:

- 4.8MB Alpine Linux Layer
- 103MB OpenJDK JRE Layer
- 61.8MB Application WAR file

    $ docker history blog/sbdemo:latest

    IMAGE          CREATED             CREATED BY                                      SIZE      COMMENT
    16d5236aa7c8   About an hour ago   /bin/sh -c #(nop) CMD ["/usr/bin/java" "-...   0 B
    e1bbd125efc4   About an hour ago   /bin/sh -c #(nop) COPY file:1af38329f6f390...   61.8 MB
    d85b17c6762e   2 months ago        /bin/sh -c set -x && apk add --no-cache ...     103 MB
    <missing>      2 months ago        /bin/sh -c #(nop) ENV JAVA_ALPINE_VERSION...    0 B
    <missing>      2 months ago        /bin/sh -c #(nop) ENV JAVA_VERSION=8u111        0 B
    <missing>      2 months ago        /bin/sh -c #(nop) ENV PATH=/usr/local/sbi...    0 B
    <missing>      2 months ago        /bin/sh -c #(nop) ENV JAVA_HOME=/usr/lib/...    0 B
    <missing>      2 months ago        /bin/sh -c { echo '#!/bin/sh'; echo 's...       87 B
    <missing>      2 months ago        /bin/sh -c #(nop) ENV LANG=C.UTF-8              0 B
    <missing>      2 months ago        /bin/sh -c #(nop) ADD file:eeed5f514a35d18...   4.8 MB

Running the Java Application Docker container

In order to run the demo application, run the following command:

    $ docker run -d --name demo-default -p 8090:8090 -p 8091:8091 blog/sbdemo:latest

Let’s check that the application is up and running (I’m using the httpie tool here):

    $ http http://localhost:8091/info

    HTTP/1.1 200 OK
    Content-Type: application/json
    Date: Thu, 09 Mar 2017 14:43:28 GMT
    Server: Apache-Coyote/1.1
    Transfer-Encoding: chunked

    {
        "build": {
            "artifact": "${project.artifactId}",
            "description": "boot-example default description",
            "name": "spring-boot-rest-example",
            "version": "0.1"
        }
    }

Setting Docker container memory constraints

One thing you need to know about Java process memory allocation is that in reality it consumes more physical memory than specified with the -Xmx JVM option. The -Xmx option specifies only the maximum Java heap size. But the Java process is a regular Linux process, and what matters is how much actual physical memory this process is consuming. Or in other words: what is the Resident Set Size (RSS) value for a running Java process?

Theoretically, in the case of a Java application, the required RSS size can be calculated by:

    RSS = Heap size + MetaSpace + OffHeap size

where OffHeap consists of thread stacks, direct buffers, mapped files (libraries and jars) and the JVM code itself.

There is a very good post on this topic: Analyzing java memory usage in a Docker container by Mikhail Krestjaninoff.

When using the --memory option in docker run, make sure the limit is larger (at least twice) than what you specify for -Xmx.

Offtopic: Using OOM Killer instead of GC

There is an interesting JDK Enhancement Proposal (JEP) by Aleksey Shipilev: Epsilon GC (http://openjdk.java.net/jeps/8174901). This JEP proposes to develop a GC that only handles memory allocation, but does not implement any actual memory reclamation mechanism.

This GC, combined with --restart (Docker restart policy), should theoretically allow supporting “Extremely short lived jobs” implemented in Java.

For ultra-performance-sensitive applications, where developers are conscious about memory allocations or want to create completely garbage-free applications, a GC cycle may be considered an implementation bug that wastes cycles for no good reason. In such a use case, it could be better to allow the OOM Killer (Out of Memory) to kill the process and use the Docker restart policy to restart the process.

Anyway, at the time of writing Epsilon GC is not available yet, so it’s just an interesting theoretical use case for the moment.

Building Java applications with Builder container

As you can probably see, in the previous step, I did not explain how I’ve created the application WAR file.

Of course, there is a Maven project file pom.xml which most Java developers should be familiar with. But, in order to actually build, you need to install the same Java Build tools (JDK and Maven) on every machine where you are building the application.
You need to have the same versions, use the same repositories and share the same configurations. While it’s possible, managing different projects that rely on different tools, versions, configurations, and development environments can quickly become a nightmare.

What if you also want to run a build on a clean machine that does not have Java or Maven installed? What should you do?

Java Builder Container

Docker can help here too. With Docker, you can create and share portable development and build environments. The idea is to create a special Builder Docker image that contains all the tools you need to properly build your Java application, e.g.: JDK, Ant, Maven, Gradle, SBT or others.

To create a really useful Builder Docker image, you need to know well how your Java Build tools work and how docker build invalidates the build cache. Without proper design, you will end up with ineffective and slow builds.

Running Maven in Docker

While most of these tools were created nearly a generation ago, they are still very popular and widely used by Java developers.

Java development life is hard to imagine without some extra build tools. There are multiple Java build tools out there, but most of them share similar concepts and serve the same targets: resolve cumbersome package dependencies, and run different build tasks, such as compile, lint, test, package, and deploy.

In this post, I will use Maven, but the same approach can be applied to Gradle, SBT, and other less popular Java Build tools.

It’s important to learn how your Java Build tool works and how it can be tuned. Apply this knowledge when creating a Builder Docker image and in the way you are running a Builder Docker container.

Maven uses the project level pom.xml file to resolve project dependencies. It downloads missing JAR files from private and public Maven repositories, and caches these files for future builds. Thus, the next time you run your build, it won’t download anything if your dependencies have not changed.

Official Maven Docker image: should you use it?

The Maven team provides official Docker images. There are multiple images (under different tags) that allow you to select an image that can answer your needs. Take a deeper look at the Dockerfile files and mvn-entrypoint.sh shell scripts when selecting the Maven image to use.

There are two flavors of official Maven Docker images: regular images (JDK version, Maven version, and Linux distro) and onbuild images.

What is the official Maven image good for?

The official Maven image does a good job containerizing the Maven tool itself. The image contains a specific JDK and Maven version. Using such an image, you can run a Maven build on any machine without installing a JDK and Maven.

Example: running mvn clean install on a local folder

    $ docker run -it --rm --name my-maven-project \
        -v "$PWD":/usr/src/app -w /usr/src/app \
        maven:3.2-jdk-7 mvn clean install

The Maven local repository, for official Maven images, is placed inside a Docker data volume. That means all downloaded dependencies are not part of the image and will disappear once the Maven container is destroyed. If you do not want to download dependencies on every build, mount the Maven repository Docker volume to some persistent storage (at least a local folder on the Docker host).

Example: running mvn clean install on a local folder with a properly mounted Maven local repository

    $ docker run -it --rm --name my-maven-project \
        -v "$PWD":/usr/src/app -v "$HOME"/.m2:/root/.m2 \
        -w /usr/src/app maven:3.2-jdk-7 mvn clean install

Now, let’s take a look at onbuild Maven Docker images.

What is Maven onbuild image?

The onbuild Maven Docker image exists to “simplify” a developer’s life, allowing them to skip writing a Dockerfile.
Actually, a developer should write a Dockerfile, but it’s usually enough to have a single line in it:

    FROM maven:<version>-onbuild

Looking into the onbuild Dockerfile on the GitHub repository …

    FROM maven:<version>

    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app

    ONBUILD ADD . /usr/src/app
    ONBUILD RUN mvn install

… you can see several Dockerfile commands with the ONBUILD prefix. The ONBUILD prefix tells Docker to postpone the execution of these build commands until building a new image that inherits from the current image.

In our example, two build commands will be executed when you build the application Dockerfile created FROM: maven:<version>-onbuild

- Add the current folder (all files, if you are not using .dockerignore) to the new Docker image
- Run the mvn install Maven target

The onbuild Maven Docker image is not as useful as the previous image.

First of all, it copies everything from the current repository, so do not use it without a properly configured .dockerignore file.

Then, think: what kind of image are you trying to build?

The new image, created from the onbuild Maven Docker image, includes JDK, Maven, application code (and potentially all files from the current directory), and all files produced by the Maven install phase (compiled, tested and packaged app; plus lots of build junk files you do not really need).

So, this Docker image contains everything, but, for some strange reason, does not contain a local Maven repository. I have no idea why the Maven team created this image.

Recommendation: Do not use Maven onbuild images!

If you just want to use the Maven tool, use a non-onbuild image.

If you want to create a proper Builder image, I will show you how to do this later in this post.

Where to keep Maven cache?

The official Maven Docker image chooses to keep the Maven cache folder outside of the container, exposing it as a Docker data volume, using the VOLUME root/.m2 command in the Dockerfile. A Docker data volume is a directory within one or more containers that bypasses the Docker Union File System; in simple words: it’s not part of the Docker image.

What you should know about Docker data volumes:

- Volumes are initialized when a container is created.
- Data volumes can be shared and reused among containers.
- Changes to a data volume are made directly to the mounted endpoint (usually some directory on the host, but it can be some storage device too).
- Changes to a data volume will not be included when you update an image or persist a Docker container.
- Data volumes persist even if the container itself is deleted.

So, in order to reuse the Maven cache between different builds, mount the Maven cache data volume to some persistent storage (for example, a local directory on the Docker host).

    $ docker run -it --rm --volume "$PWD"/pom.xml:/usr/src/app/pom.xml \
        --volume "$HOME"/.m2:/root/.m2 maven:3-jdk-8-alpine mvn install

The command above runs the official Maven Docker image (Maven 3 and OpenJDK 8), mounts the project pom.xml file into the working directory and the "$HOME"/.m2 folder as the Maven cache data volume.

Maven running inside this Docker container will download all required JAR files into the host’s local folder $HOME/.m2. The next time you create a new Maven Docker container for the same pom.xml file and the same cache mount, Maven will reuse the cache and will download only missing or updated JAR files.

Maven Builder Docker image

First, let’s try to formulate what the Builder Docker image is and what it should contain.

Builder is a Docker image that contains everything you need to create a reproducible build on any machine and at any point in time.

So, what should it contain?

- Linux shell and some tools: I prefer Alpine Linux
- JDK (version): for the javac compiler
- Maven (version): Java build tool
- Application source code and pom.xml file/s: it’s the application code SNAPSHOT at a specific point in time; just code, no need to include a .git repository or other files
- Project dependencies (Maven local repository): all POM and JAR files you need to build and test the Java application, at any time, even offline, even if a library disappears from the web

The Builder image captures code, dependencies, and tools at a specific point in time and stores them inside a Docker image. The Builder container can be used to create the application “binaries” on any machine, at any time and even without an internet connection (or with a poor connection).

Here is the sample Dockerfile for my demo Builder:

    FROM openjdk:8-jdk-alpine

    # ----
    # Install Maven
    RUN apk add --no-cache curl tar bash

    ARG MAVEN_VERSION=3.3.9
    ARG USER_HOME_DIR="/root"

    RUN mkdir -p /usr/share/maven && \
      curl -fsSL http://apache.osuosl.org/maven/maven-3/$MAVEN_VERSION/binaries/apache-maven-$MAVEN_VERSION-bin.tar.gz | tar -xzC /usr/share/maven --strip-components=1 && \
      ln -s /usr/share/maven/bin/mvn /usr/bin/mvn

    ENV MAVEN_HOME /usr/share/maven
    ENV MAVEN_CONFIG "$USER_HOME_DIR/.m2"

    # speed up Maven JVM a bit
    ENV MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1"

    ENTRYPOINT ["/usr/bin/mvn"]

    # ----
    # Install project dependencies and keep sources

    # make source folder
    RUN mkdir -p /usr/src/app
    WORKDIR /usr/src/app

    # install maven dependency packages (keep in image)
    COPY pom.xml /usr/src/app
    RUN mvn -T 1C install && rm -rf target

    # copy other source files (keep in image)
    COPY src /usr/src/app/src

Let’s go over this Dockerfile and I will try to explain the reasoning behind each command.

- FROM: openjdk:8-jdk-alpine - select and freeze the JDK version: OpenJDK 8 and Linux Alpine
- Install Maven
  - ARG ... - Use build arguments to allow overriding the Maven version and local repository location (MAVEN_VERSION and USER_HOME_DIR) with docker build --build-arg ...
  - RUN mkdir -p ... curl ... tar ... - Download and install (untar and ln -s) Apache Maven
  - Speed up Maven JVM a bit: MAVEN_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=1", read the following post
- RUN mvn -T 1C install && rm -rf target - Download project dependencies:
  - Copy the project pom.xml file, run the mvn install command and remove build artifacts (as far as I know, there is no Maven command that will let you download without installing)
  - This Docker image layer will be rebuilt only when the project’s pom.xml file changes
- COPY src /usr/src/app/src - copy project source files (source, tests, and resources)

Note: if you are using the Maven Surefire plugin and want to have all dependencies for the offline build, make sure to lock down the Surefire test provider.

When you build a new Builder version, I suggest you use the --cache-from option, passing the previous Builder image to it. This will allow you to reuse any unmodified Docker layer and avoid unnecessary downloads most of the time (if pom.xml did not change and you did not decide to upgrade Maven or JDK).

    $ # pull latest (or specific version) builder image
    $ docker pull myrep/mvn-builder:latest
    $ # build new builder
    $ docker build -t myrep/mvn-builder:latest --cache-from myrep/mvn-builder:latest .
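
Since the dependency layer depends on the content of pom.xml, it can be handy to know in advance whether a change is about to trigger a fresh dependency download. The sketch below is only a rough local approximation: it stores a checksum of pom.xml in a stamp file and reports whether it differs from the previous run. The function name pom_changed and the stamp file are assumptions made for illustration, and Docker’s own cache check is more involved than a single checksum.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when the checksum of file $1 differs from the one
# recorded in stamp file $2 (and record the new one); exit 1 when unchanged.
pom_changed() {
    pom=$1
    stamp=$2
    # cksum is POSIX; reading from stdin keeps the file name out of the output.
    new=$(cksum < "$pom") || return 2
    old=
    [ -f "$stamp" ] && old=$(cat "$stamp")
    if [ "$new" = "$old" ]; then
        return 1
    fi
    printf '%s\n' "$new" > "$stamp"
    return 0
}
```

For example, `pom_changed pom.xml .pom.cksum && echo "dependency layer will probably be rebuilt"` could run right before the docker build command.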

Use Builder container to run tests

    $ # run tests - test results are saved into $PWD/target/surefire-reports
    $ docker run -it --rm -v "$PWD"/target:/usr/src/app/target myrep/mvn-builder -T 1C -o test

Use Builder container to create application WAR

    $ # create application WAR file (skip tests)
    $ docker run -it --rm -v "$PWD"/target:/usr/src/app/target myrep/mvn-builder package -T 1C -o -Dmaven.test.skip=true

Take a look at the images below:

    REPOSITORY        TAG      IMAGE ID       CREATED          SIZE
    sbdemo/run        latest   6f432638aa60   7 minutes ago    143 MB
    sbdemo/tutorial   1        669333d13d71   12 minutes ago   1.28 GB
    sbdemo/tutorial   2        38634e4d9d5e   3 hours ago      1.26 GB
    sbdemo/builder    mvn      2d325a403c5f   5 days ago       263 MB

- sbdemo/run:latest - Docker image for demo runtime: Alpine, OpenJDK JRE only, demo WAR
- sbdemo/builder:mvn - Builder Docker image: Alpine, OpenJDK 8, Maven 3, code, dependencies
- sbdemo/tutorial:1 - Docker image created following the first tutorial (just for reference)
- sbdemo/tutorial:2 - Docker image created following the second tutorial (just for reference)

Bonus: Build flow automation

In this section, I will show how to use a Docker build flow automation service to automate and orchestrate all steps from this post.

Build Pipeline Steps

I’m going to use the Codefresh.io Docker CI/CD service (the company I’m working for) to create a Builder Docker image for Maven, run tests, create the application WAR, build a Docker image for the application and deploy it to DockerHub.
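
Before moving the flow to a CI service, the same sequence can be tried locally. The sketch below only prints one Docker command per step, reusing the commands shown earlier in this post, so the flow can be reviewed before anything is executed. The function name pipeline_commands, the Dockerfile.build file name for the Builder, and the myrep repository prefix are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical dry run: print (do not execute) one Docker command per step.
# $1 is the repository prefix, $2 is the tag of the application image.
pipeline_commands() {
    repo=${1:-myrep}
    tag=${2:-latest}
    builder="$repo/mvn-builder:latest"
    app="$repo/sbdemo:$tag"
    # 1. create the Builder image, reusing layers from the previous Builder
    printf '%s\n' "docker build -f Dockerfile.build -t $builder --cache-from $builder ."
    # 2. run tests inside the Builder container (offline)
    printf '%s\n' "docker run --rm -v \"\$PWD\"/target:/usr/src/app/target $builder -T 1C -o test"
    # 3. create the application WAR (skip tests)
    printf '%s\n' "docker run --rm -v \"\$PWD\"/target:/usr/src/app/target $builder package -T 1C -o -Dmaven.test.skip=true"
    # 4. build the application image from the runtime Dockerfile
    printf '%s\n' "docker build -t $app ."
    # 5. push the application image to the registry
    printf '%s\n' "docker push $app"
}
```

Calling `pipeline_commands myrep 1.0` prints the five commands in order; they can then be run one by one once they look right.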

The Codefresh automation flow YAML (also called pipeline) is pretty straightforward:

- it contains an ordered list of steps
- each step can be of type:
  - build - for the docker build command
  - push - for docker push
  - composition - for creating an environment, specified with docker-compose
  - freestyle (default if not specified) - for the docker run command
- the /codefresh/volume/ data volume (git clone and files generated by steps) is mounted into each step
- the current working directory for each step is set to /codefresh/volume/ by default (can be changed)

For a detailed description and other examples, take a look at the documentation.

For my demo flow I’ve created the following automation steps:

- mvn_builder - create the Maven Builder Docker image
- mvn_test - execute tests in the Builder container, place test results into the /codefresh/volume/target/surefire-reports/ data volume folder
- mvn_package - create the application WAR file, place the created file into the /codefresh/volume/target/ data volume folder
- build_image - build the application Docker image with JRE and the application WAR file
- push_image - tag and push the application Docker image to DockerHub

Here is the full Codefresh YAML:

    version: '1.0'

    steps:

      mvn_builder:
        type: build
        description: create Maven builder image
        dockerfile: Dockerfile.build
        image_name: <put_your_repo_here>/mvn-builder

      mvn_test:
        description: run unit tests
        image: ${{mvn_builder}}
        commands:
          - mvn -T 1C -o test

      mvn_package:
        description: package application and dependencies into WAR
        image: ${{mvn_builder}}
        commands:
          - mvn package -T 1C -o -Dmaven.test.skip=true

      build_image:
        type: build
        description: create Docker image with application WAR
        dockerfile: Dockerfile
        working_directory: ${{main_clone}}/target
        image_name: <put_your_repo_here>/sbdemo

      push_image:
        type: push
        description: push application image to DockerHub
        candidate: '${{build_image}}'
        tag: '${{CF_BRANCH}}'
        credentials:
          # set docker registry credentials in project configuration
          username: '${{DOCKER_USER}}'
          password: '${{DOCKER_PASS}}'

I hope you find this post useful. I look forward to your comments and any questions you have.

Originally published at codefresh.io/blog/java_docker_pipeline.