What are the problems with typical base images?
When we package our application into containers, we have to find a suitable base image. Normally, we would take an official image containing the runtime our application needs. For example, typical base images for a Java application are Amazon Corretto, Eclipse Temurin or the Bellsoft Liberica JRE. They are good choices since they are well maintained and battle tested. Nevertheless, they come with some disadvantages. They are usually based on well-known distributions and therefore contain more tools than are actually necessary to just run the application. For example, let’s look at the Bellsoft Liberica JRE image.
As the name already suggests, it is based on Debian and comes with some packages, shown above, which may help with debugging but are not required to run the application, in particular a shell and package managers. Alpine-based images look better, but even they contain a package manager (APK) and a shell.
These additional tools bring problems with them:
- They make the image bigger than really needed. Google describes why size matters from a Kubernetes perspective. Even if base images are cached and reused, the initial download size can have an impact. Some images and their sizes can be seen here.
- Every tool can bring additional vulnerabilities and therefore increases the attack surface. This can be seen e.g. in the Liberica image which contains vulnerabilities for curl.
What are distroless images?
Distroless images try to address these problems by containing only the tools and libraries really needed to run our application. Probably the best-known distroless images are those from the Google distroless project. It provides base images for the most common programming languages like Python, Java or Node.js. It also provides some core images on which all their distroless images are built. For a deep dive into what they do and why they are needed, I recommend a look at Ivan Velichko’s blog.
This project focuses on the LTS versions of the various language platforms and is a good starting point.
Another option is the Chainguard distroless images. Their main focus is on secure and reproducible builds. They can be used similarly to the Google distroless images. Here we can see a full list of the packages that are part of the Chainguard JRE image.
So no shell, no package managers or other debug tools. How many vulnerabilities are in this image?
So from a security point of view they seem to be a better fit. But interestingly enough, they are not the smallest possible images.
Both the Chainguard and Google distroless JRE images are bigger than the Alpine-based one. So there is room for improvement. But before we check how to solve that, let’s talk about how we could debug these images.
How to debug images based on distroless?
Although these images are generally preferable, they come with a downside. Sometimes, especially during development, we would still like to have a way to look inside a running container. Shell access is often the quickest way to analyze issues, but how do we get it without sacrificing the improved security? We could use a different base image for development than for the final deployment, but this would go against the idea of pushing one identical artifact through all development and deployment stages. In the end, we want to run the same image in production that we work with during development, right?
Are there better ways to debug a distroless image?
Probably one of the most interesting tools out there is cdebug, also created by Ivan Velichko, which provides a means to debug a distroless container without changing the original image. If we have, for example, a running container called my-app based on a distroless image, we can open up a debug shell session like this:
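A minimal sketch of such a session, assuming cdebug is installed locally and my-app runs under Docker. The `exec -it` form follows the usage shown in the cdebug README; the guard around it is only there so the snippet fails gracefully when the tool is missing.

```shell
# Name of the running container to inspect (taken from the text above).
CONTAINER=my-app

if command -v cdebug >/dev/null 2>&1; then
  # Start an interactive debug shell alongside the target container.
  # The image of the target container itself is not modified.
  cdebug exec -it "$CONTAINER"
else
  echo "cdebug is not installed; would run: cdebug exec -it $CONTAINER"
fi
```

The debug tooling comes from a separate helper image, so the distroless image that gets deployed stays unchanged.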
It already supports most of the common runtime platforms such as Docker, containerd and, in a limited way, Kubernetes, so it looks quite promising and is worth a shot. To understand how cdebug works, have a look at the official documentation on GitHub.
How can I build my own distroless image?
What if we need a distroless image for a language version not supported by Chainguard or Google? Or if the provided images are still too big? They focus on LTS versions, so there is, for example, no distroless image for the latest Java version (19.0.2). So how can we build our own distroless image?
We could use the tooling used to build the Google distroless images but it is based on Bazel which is not everyone’s cup of tea (especially not mine). The easier way is to use the tooling used by the Chainguard project.
It is based on apko which can create an image based on a list of apk packages without the need for a Dockerfile.
Let’s try to create our own JRE 19 image based on Bellsoft Liberica. The Bellsoft Liberica JRE is available as an APK package to start with. To build a distroless image with apko, we need a configuration file describing which packages should be part of the final image and some additional configuration parameters.
- Repositories list all places to look for defined packages and their dependencies. Here we add the Bellsoft APK repository, as well as the main and community Alpine repositories.
- Packages list the APK packages that will be part of the final image. Apko will automatically resolve all dependencies needed by this package.
- Entrypoint defines the command in the image similarly to the Dockerfile ENTRYPOINT.
- Environment defines the environment variables for the final image. Here we set it to the Bellsoft Liberica bin folder to make java available.
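The four sections described above could look roughly like the following configuration, written to apko.yaml from the shell. This is a sketch under stated assumptions, not the exact file: the Bellsoft repository URL, the package name bellsoft-java19-runtime, the Alpine branch (edge) and the JRE install path /usr/lib/jvm/jre are illustrative and should be checked against the Bellsoft and Alpine documentation.

```shell
# Write a minimal apko configuration to apko.yaml in the current directory.
# ASSUMPTIONS: repository URLs, package name and JRE path are placeholders
# chosen for illustration; verify them before building.
cat > apko.yaml <<'EOF'
contents:
  # Where apko looks for packages and their dependencies.
  repositories:
    - https://dl-cdn.alpinelinux.org/alpine/edge/main
    - https://dl-cdn.alpinelinux.org/alpine/edge/community
    - https://apk.bell-sw.com/main
  # Packages to install; dependencies are resolved automatically.
  packages:
    - bellsoft-java19-runtime

# Comparable to ENTRYPOINT in a Dockerfile (hypothetical install path).
entrypoint:
  command: /usr/lib/jvm/jre/bin/java

# Put the JRE bin folder on the PATH so that java can be found.
environment:
  PATH: /usr/lib/jvm/jre/bin:/usr/sbin:/sbin:/usr/bin:/bin
EOF
```

Since everything is declared in this one file, no Dockerfile is involved at any point.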
This is all we need to create our own distroless image. We can run apko with the following command:
The -k option appends the key for the Bellsoft repository to the system keyring (which already contains the keys for the Alpine repositories); otherwise we would get build errors. The created tar file can then be loaded into Docker with docker load.
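A minimal sketch of both steps, the build and the load. The file names apko.yaml and output.tar, the tag my-jre:19 and the key file bellsoft.rsa.pub are assumptions made for illustration; the key is assumed to have been downloaded from the Bellsoft repository beforehand.

```shell
CONFIG=apko.yaml        # apko configuration file (assumed name)
TAG=my-jre:19           # tag for the resulting image (assumed name)
ARCHIVE=output.tar      # tar file that apko writes (assumed name)
KEY=bellsoft.rsa.pub    # Bellsoft public key, downloaded beforehand (assumed name)

if command -v apko >/dev/null 2>&1 && [ -f "$CONFIG" ] && [ -f "$KEY" ]; then
  # -k (--keyring-append) adds the extra key to the default keyring.
  apko build -k "$KEY" "$CONFIG" "$TAG" "$ARCHIVE"
  # Import the generated archive into the local Docker image store.
  docker load < "$ARCHIVE"
else
  echo "skipping build: apko, $CONFIG or $KEY is missing"
fi
```

Depending on the apko version, the tag that shows up after loading may carry an architecture suffix, so it is worth checking the output of docker load.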
So let’s have a look at the final image size
The final image is roughly half the size of the smallest JRE image we had so far. Not so bad. Is it working?
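The size check and a quick smoke test can be reproduced roughly like this, assuming the image was loaded under the hypothetical tag my-jre:19 and its entrypoint is the java binary:

```shell
TAG=my-jre:19   # assumed tag; must match what docker load reported

if command -v docker >/dev/null 2>&1 && docker image inspect "$TAG" >/dev/null 2>&1; then
  # List the image together with its size.
  docker images "$TAG"
  # Arguments after the tag are appended to the entrypoint,
  # so this effectively runs "java -version" inside the container.
  docker run --rm "$TAG" -version
else
  echo "image $TAG not found locally; build and load it first"
fi
```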
Additionally let’s check the installed packages.
It still contains a shell as part of busybox, which seems to be a dependency we can’t avoid. Currently there is no way to exclude direct or transitive dependencies. As a final step, let’s have a last look at the vulnerabilities:
So the result already looks quite promising, and apko is an easy way to generate our own distroless image. Additionally, apko provides more features (e.g. multi-platform builds, integration of our own apk packages via melange, SBOM support and multi-process images) which we do not address here. They can further simplify integration into our development process and are probably worth a dedicated article.
So is there a reason not to use distroless images?
Actually, no. I think with the tools now in place to improve the debugging capabilities and to build your own images, there are no excuses anymore as the security and performance aspects outweigh possible inconveniences during development.