What are the problems with typical base images?
When we package our application into containers, we have to find a suitable base image. Normally, we would take an official image containing the runtime our application needs. For example, typical base images for a Java application are Amazon Corretto, Eclipse Temurin or the Bellsoft Liberica JRE. They are good choices since they are well maintained and battle-tested. Nevertheless, they come with some disadvantages. They are usually based on well-known distributions and therefore contain more tools than are actually necessary to just run the application. For example, let’s look at the Bellsoft Liberica JRE image.
docker sbom bellsoft/liberica-openjre-debian:19.0.2
Syft v0.43.0
✔ Loaded image
✔ Parsed image
✔ Cataloged packages [121 packages]
NAME VERSION TYPE
...
apt 2.2.4 deb
bash 5.1-2+deb11u1 deb
curl 7.74.0-1.3+deb11u3 deb
grep 3.6-1 deb
sed 4.7-1 deb
tar 1.34+dfsg-1 deb
...
As the name suggests, the image is based on Debian and ships with the packages shown above. They may help with debugging but are not required to run the application, in particular a shell and a package manager. Alpine-based images look better, but even they contain a package manager (apk) and a shell.
docker sbom bellsoft/liberica-openjre-alpine:19.0.2
Syft v0.43.0
✔ Loaded image
✔ Parsed image
✔ Cataloged packages [15 packages]
NAME VERSION TYPE
...
apk-tools 2.12.9-r3 apk
busybox 1.35.0-r17 apk
...
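To spot such extras quickly, the SBOM listing can be filtered for well-known shells and package managers. The following is only a minimal sketch: the function name `flag_extras` and the list of package names are assumptions chosen for illustration, and the list is by no means complete.

```shell
# flag_extras: read a `docker sbom` package table on stdin and print the
# names of packages that are shells or package managers.
# The name list is an illustrative assumption, not an exhaustive catalogue.
flag_extras() {
  awk '
    # $1 is the NAME column; print it when it matches one of the known tools.
    $1 ~ /^(apt|dpkg|apk-tools|bash|dash|busybox|busybox-binsh)$/ { print $1 }
  '
}

# Usage (requires Docker with the sbom plugin):
#   docker sbom bellsoft/liberica-openjre-alpine:19.0.2 | flag_extras
```

An empty result does not prove that an image is free of debugging tools; it only means that none of the listed package names showed up.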
These additional tools bring problems with them:
- They make the image bigger than really needed. Google describes why size matters from a Kubernetes perspective. Even if base images are cached and reused, the initial download size can have an impact. The sizes of some images are shown here.
amazoncorretto 19.0.2-al2 ... 38 hours ago 509MB
eclipse-temurin 19.0.2_7-jre-jammy ... 2 days ago 269MB
bellsoft/liberica-openjre-debian 19.0.2 ... 8 days ago 253MB
bellsoft/liberica-openjre-alpine 19.0.2 ... 8 days ago 133MB
- Every tool can bring additional vulnerabilities and therefore increases the attack surface. The Liberica Debian image, for example, contains known vulnerabilities in curl.
docker scan bellsoft/liberica-openjre-debian:19.0.2
...
✗ High severity vulnerability found in curl/libcurl4
Description: Cleartext Transmission of Sensitive Information
Info: https://security.snyk.io/vuln/SNYK-DEBIAN11-CURL-3066040
Introduced through: curl@7.74.0-1.3+deb11u3
From: curl@7.74.0-1.3+deb11u3 > curl/libcurl4@7.74.0-1.3+deb11u3
From: curl@7.74.0-1.3+deb11u3
✗ High severity vulnerability found in curl/libcurl4
Description: Cleartext Transmission of Sensitive Information
Info: https://security.snyk.io/vuln/SNYK-DEBIAN11-CURL-3179181
Introduced through: curl@7.74.0-1.3+deb11u3
From: curl@7.74.0-1.3+deb11u3 > curl/libcurl4@7.74.0-1.3+deb11u3
From: curl@7.74.0-1.3+deb11u3
✗ Critical severity vulnerability found in curl/libcurl4
Description: Exposure of Resource to Wrong Sphere
Info: https://security.snyk.io/vuln/SNYK-DEBIAN11-CURL-3065656
Introduced through: curl@7.74.0-1.3+deb11u3
From: curl@7.74.0-1.3+deb11u3 > curl/libcurl4@7.74.0-1.3+deb11u3
From: curl@7.74.0-1.3+deb11u3
Fixed in: 7.74.0-1.3+deb11u5
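For longer scan reports, a summary per severity level is easier to read than the raw listing. The sketch below assumes the report format shown above, where each finding starts with a line containing "<Level> severity vulnerability found in"; the function name `count_by_severity` is made up for this example.

```shell
# count_by_severity: read `docker scan` output on stdin and print one line
# per severity level with the number of findings, e.g. "High 2".
count_by_severity() {
  awk '
    / severity vulnerability found in / {
      # The severity level is the word right before "severity".
      for (i = 2; i <= NF; i++)
        if ($i == "severity") { count[$(i - 1)]++; break }
    }
    END { for (level in count) print level, count[level] }
  ' | sort
}

# Usage (requires Docker with the scan plugin):
#   docker scan bellsoft/liberica-openjre-debian:19.0.2 | count_by_severity
```

Applied to the three findings listed above, this would report two High and one Critical entry.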
What are distroless images?
Distroless images try to address these problems by containing only the tools and libraries really needed to run our application. Probably the best-known distroless images are those from the Google distroless project. It provides base images for the most common programming languages like Python, Java or Node.js. It also provides some core images as the basis for all of its distroless images. For a deep dive into what they do and why they are needed, I recommend a look at Ivan Velichko’s blog.
The project focuses on the LTS versions of the various language platforms and is a good starting point.
Another option is the Chainguard distroless images. Their main focus is on secure and reproducible builds. They can be used similarly to the Google distroless images. Here we can see the full list of packages that are part of the Chainguard JRE image.
docker sbom cgr.dev/chainguard/jre
Syft v0.43.0
✔ Loaded image
✔ Parsed image
✔ Cataloged packages [15 packages]
NAME VERSION TYPE
bzip2 1.0.8-r4 apk
ca-certificates-bundle 20220614-r4 apk
expat 2.5.0-r2 apk
fontconfig 2.14.1-r1 apk
freetype 2.12.1-r1 apk
glibc 2.36-r6 apk
glibc-locale-en 2.36-r6 apk
glibc-locale-posix 2.36-r6 apk
jrt-fs 17.0.6-internal java-archive
libbrotlicommon1 1.0.9-r1 apk
libbrotlidec1 1.0.9-r1 apk
libpng 1.6.39-r1 apk
openjdk-17-jre 17.0.6-r0 apk
wolfi-baselayout 20221118-r1 apk
zlib 1.2.13-r3 apk
So there is no shell, no package manager and no other debug tools. How many vulnerabilities are in this image?
docker scan cgr.dev/chainguard/jre
Testing cgr.dev/chainguard/jre...
Package manager: apk
Project name: docker-image|cgr.dev/chainguard/jre
Docker image: cgr.dev/chainguard/jre
Platform: linux/arm64
✔ Tested 15 dependencies for known vulnerabilities, no vulnerable paths found.
Note that we currently do not have vulnerability information for Wolfi 20221118, which we detected in your image.
So from a security point of view they seem to be a better fit. But interestingly enough, they are not the smallest possible images.
amazoncorretto 19.0.2-al2 ... 38 hours ago 509MB
eclipse-temurin 19.0.2_7-jre-jammy ... 2 days ago 269MB
bellsoft/liberica-openjre-debian 19.0.2 ... 8 days ago 253MB
bellsoft/liberica-openjre-alpine 19.0.2 ... 8 days ago 133MB
cgr.dev/chainguard/jre latest ... 9 hours ago 175MB
gcr.io/distroless/java17-debian11 latest ... 3 days ago 226MB
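To compare such a listing at a glance, each size can be expressed relative to the smallest image. This is a rough sketch under a few assumptions: the helper name `sizes_vs_smallest` is invented here, the input has the repository in the first column, the tag in the second and the size in the last, and all sizes are given in MB (other units are simply skipped).

```shell
# sizes_vs_smallest: read `docker images`-style lines on stdin and print
# "<repository>:<tag> <size>MB <factor>x", where factor is the size relative
# to the smallest image in the input. Only sizes given in MB are considered.
sizes_vs_smallest() {
  awk '
    $NF ~ /^[0-9.]+MB$/ {
      n++
      name[n] = $1 ":" $2
      size[n] = $NF + 0          # numeric prefix of e.g. "133MB" is 133
      if (n == 1 || size[n] < min) min = size[n]
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%s %gMB %.2fx\n", name[i], size[i], size[i] / min
    }
  '
}

# Usage (requires Docker):
#   docker images | sizes_vs_smallest
```

The output keeps the input order, so the listing can still be piped through `sort` if a ranking is preferred.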
Both the Chainguard and the Google distroless JRE images are bigger than the Alpine-based one. So there is room for improvement. But before we check how to solve that, let’s talk about how we could debug these images.
How to debug images based on distroless?
Although these images are generally preferable, they come with a downside. Sometimes, especially during development, we would still like to have a way to look inside a running container. Shell access is often the quickest way to analyze issues, but how do we get it without sacrificing the improved security? We could use a different base image for development than for the final deployment, but this would go against the idea of pushing one identical artifact through all development and deployment stages. In the end, we want to run the same image in production that we work with during development, right?
Are there better ways to debug a distroless image?
Probably one of the most interesting tools out there is cdebug, also created by Ivan Velichko, which provides a way to debug a distroless container without changing the original image. If, for example, we have a running container called my-app based on a distroless image, we can open a debug shell session like this:
cdebug exec -it my-app
It already supports most of the common container runtimes like Docker, containerd and, in a limited way, Kubernetes, so it looks quite promising and is worth a shot. To understand how cdebug works, have a look at the official documentation on GitHub.
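When the same workflow has to cover both regular and distroless containers, the two approaches can be combined in a small wrapper. The following is a sketch under assumptions, not an official recipe: the function name `debug_shell` is hypothetical, both `docker` and `cdebug` are expected on the PATH, and the container is assumed to run on a local Docker daemon.

```shell
# debug_shell: open an interactive shell in a running container.
# If the container ships its own /bin/sh, plain `docker exec` is used;
# otherwise (e.g. for distroless images) we fall back to cdebug.
debug_shell() {
  # Probe for a shell without attaching a terminal; output is discarded.
  if docker exec "$1" /bin/sh -c 'exit 0' >/dev/null 2>&1; then
    docker exec -it "$1" /bin/sh
  else
    cdebug exec -it "$1"
  fi
}

# Usage:
#   debug_shell my-app
```

The probe only checks for `/bin/sh`; a container that ships a shell in another location would still end up in the cdebug branch.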
How can I build my own distroless image?
What if we need a distroless image for a language version not supported by Chainguard or Google? Or what if the provided images are still too big? Both projects focus on LTS versions, so there is, for example, no distroless image for the latest Java version (19.0.2). So how can we build our own distroless image?
We could use the tooling used to build the Google distroless images but it is based on Bazel which is not everyone’s cup of tea (especially not mine). The easier way is to use the tooling used by the Chainguard project.
It is based on apko which can create an image based on a list of apk packages without the need for a Dockerfile.
Let’s try to create our own JRE 19 image based on Bellsoft Liberica. The Bellsoft Liberica JRE is available as an APK package to start with. To build a distroless image with apko, we need a configuration file describing which packages should be part of the final image and some additional configuration parameters.
contents:
  repositories:
    - https://dl-cdn.alpinelinux.org/alpine/edge/main
    - https://dl-cdn.alpinelinux.org/alpine/edge/community
    - https://apk.bell-sw.com/main
  packages:
    - bellsoft-java19-runtime-lite
entrypoint:
  command: java -jar
environment:
  PATH: /usr/lib/jvm/bellsoft-java19-runtime-lite/bin
- Repositories list all places to look for defined packages and their dependencies. Here we add the Bellsoft APK repository, as well as the main and community Alpine repositories.
- Packages list the APK packages that will be part of the final image. Apko will automatically resolve all dependencies needed by this package.
- Entrypoint defines the command of the image, similar to the ENTRYPOINT instruction in a Dockerfile.
- Environment defines the environment variables for the final image. Here we set PATH to the Bellsoft Liberica bin folder to make java available.
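The four configuration sections described above can also be generated for other Java versions. The helper below is a sketch only: the function name `write_apko_config` is made up, and the package naming pattern `bellsoft-java<version>-runtime-lite` is taken from the Java 19 example, so it is an assumption for any other version and should be checked against the Bellsoft repository.

```shell
# write_apko_config: print an apko configuration for a Liberica
# "runtime-lite" JRE of the given major version on stdout.
# Assumption: packages follow the pattern bellsoft-java<version>-runtime-lite.
write_apko_config() {
  version=$1
  cat <<EOF
contents:
  repositories:
    - https://dl-cdn.alpinelinux.org/alpine/edge/main
    - https://dl-cdn.alpinelinux.org/alpine/edge/community
    - https://apk.bell-sw.com/main
  packages:
    - bellsoft-java${version}-runtime-lite
entrypoint:
  command: java -jar
environment:
  PATH: /usr/lib/jvm/bellsoft-java${version}-runtime-lite/bin
EOF
}

# Usage:
#   write_apko_config 19 > jre.yaml
```

Note that the nesting matters: repositories and packages both belong under contents, while entrypoint and environment are top-level keys.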
This is all we need to create our own distroless image. We can run apko with the following command:
docker run -v "$PWD":/work cgr.dev/chainguard/apko build \
-k https://apk.bell-sw.com/[email protected] \
jre.yaml myjre:19 myjre.tar
The -k option extends the system keyring (which already contains the keys for the Alpine repositories) with the key for the Bellsoft repository; without it we would get build errors. The created tar file can be loaded into Docker with
docker load < myjre.tar
So let’s have a look at the final image size:
amazoncorretto 19.0.2-al2 ... 38 hours ago 509MB
eclipse-temurin 19.0.2_7-jre-jammy ... 2 days ago 269MB
bellsoft/liberica-openjre-debian 19.0.2 ... 8 days ago 253MB
bellsoft/liberica-openjre-alpine 19.0.2 ... 8 days ago 133MB
cgr.dev/chainguard/jre latest ... 9 hours ago 175MB
gcr.io/distroless/java17-debian11 latest ... 3 days ago 226MB
myjre 19 ... 53 years ago 78.2MB
At 78.2MB, the final image is roughly 40% smaller than the smallest JRE image we had so far (133MB). Not so bad. Is it working?
docker run myjre:19
openjdk 19.0.2 2023-01-17
OpenJDK Runtime Environment (build 19.0.2+9)
OpenJDK 64-Bit Server VM (build 19.0.2+9, mixed mode)
Additionally let’s check the installed packages.
docker sbom myjre:19
Syft v0.43.0
✔ Loaded image
✔ Parsed image
✔ Cataloged packages [7 packages]
NAME VERSION TYPE
bellsoft-java19-runtime-lite 19.0.2_p9-r0 apk
busybox 1.36.0-r3 apk
busybox-binsh 1.36.0-r3 apk
java-common 0.5-r0 apk
jrt-fs 19.0.2 java-archive
musl 1.2.3-r4 apk
zlib 1.2.13-r0 apk
It still contains a shell as part of busybox, which seems to be a dependency we can’t avoid. Currently there is no way to exclude direct or transitive dependencies. As a final step, let’s have a last look at the vulnerabilities:
docker scan myjre:19
Testing myjre:19...
Package manager: apk
Project name: docker-image|myjre
Docker image: myjre:19
Platform: linux/arm64
✔ Tested 6 dependencies for known vulnerabilities, no vulnerable paths found.
Note that we do not currently have vulnerability data for your image.
So the result already looks quite promising, and apko turns out to be an easy way to generate our own distroless image. apko also provides more features (e.g. multi-platform builds, integration of our own apk packages via melange, SBOM support and multi-process images) which we do not cover here. They can further simplify integration into our development process and are probably worth a dedicated article.
So is there a reason not to use distroless images?
Actually, no. I think that with the tools now in place to improve debugging and to build our own images, there are no excuses anymore: the security and performance benefits outweigh possible inconveniences during development.