In this blog post I will demonstrate how this can be done quite easily by introducing an intermediate Prometheus instance within the Docker swarm and combining a couple of Prometheus features (mainly dns_sd_configs
and cross service federation) to collect and fetch the required metrics data.
Within a Docker swarm cluster an application runs as a service. To the outer world (everything outside the swarm cluster) the service looks like one instance that can be accessed via a published port. Inside the swarm, there are usually multiple instances (a.k.a. replicas) of this service running. The overlaying Docker network routes requests to the published service port to one of the running replicas. As a caller you don’t notice that your request was routed to a service instance and especially not to which of them.
If you want to have a Prometheus server, running outside of the Docker swarm, to scrape the metrics of your service, the easiest way is to just let it call the metrics endpoint of the published service and everything is fine, right? Well, if your service runs in replicated mode with multiple instances, you won’t get the values you expect. A call to the service actually ends up in the Docker network load balancer, which forwards the scrape request to one (!) of the running instances. So, the data you get are the metrics of one of the service instances (and you don’t know which one). As Prometheus scrapes the service metrics periodically, and every scrape request is routed independently from the previous ones, chances are that the next scrape request is routed to and answered by a different service instance returning the metrics of this instance, and so on. So, worst case is that Prometheus gets a different set of metrics on every scrape request. The resulting data does not give you any coherent picture of your service.
If Prometheus would know about the multiple service instances and could scrape them individually, it would add an instance
label to the metrics and by this store distinct time series for every metric and instance. Unfortunately, Docker swarm is quite good in hiding those details from Prometheus, at least to the outside of the swarm. Hence, if you run Prometheus itself as a service within the Docker swarm, you can use its dns_sd_configs
feature together with the Docker swarm DNS service discovery to scrape all instances individually. In combination with Prometheus' cross service federation feature you can then scrape those service instance metrics from a Prometheus server outside of the swarm.
In this blog post I will setup a local Docker swarm cluster running a sample service to demonstrate how this looks like.
Setting up a Docker swarm with a sample service
First, I init swarm mode for my local Docker instance (can be deactivated again with docker swarm leave --force
)
I’m running Docker Desktop for Mac, so I don’t need any additional options here. See the docker swarm tutorial for details on how to setup a local swarm in other environments.
One important detail (which unfortunately seems to be not described in the Docker swarm documentation) is, that the Docker swarm DNS service discovery does not work with the default ingress overlay network (it took me quite a while to figure this out until I found this answer in the Docker forum). So I will create a custom overlay network first.
As a sample service I use a Docker image containing a very basic Spring Boot application with the Actuator and Micrometer Prometheus plugins enabled.
Listing all the Docker services running in my swarm I can see my sample-service
running with three instances.
The port 8080 of my Spring Boot application is published, so I can also access the actuator metrics endpoint
As my Docker swarm only consists of a single manager node (my local machine), I can see the running Docker containers of all three replicas
To see the DNS service discovery at work I connect to one of the containers running inside the Docker swarm. I have to install the dnsutils
package to be able to use nslookup
.
Looking up the service name itself I get one single virtual IP address
To resolve the virtual IP addresses of all service replicas running in my Docker swarm I have to lookup the tasks.<service name>
domain name (see Docker overlay network documentation)
This DNS service discovery feature is exactly what can be used by a Prometheus instance running within the Docker swarm to scrape all those service instances (I will refer to this instance as swarm-prometheus
in the remaining text).
Scraping the service instances within the swarm
To setup the swarm-prometheus
service I build a Docker image based on the latest official Prometheus image and add my own configuration file.
The interesting part of the configuration file is the swarm-service
scrape job I added. I use a dns_sd_config
(see documentation for details) to lookup the scrape targets by executing a DNS query. I need to execute a type A DNS query and as the query only returns the IP addresses of the service instance I have to tell Prometheus the port the instances are listening on along with the path to the metrics endpoint.
After building the image I create the swarm-prometheus
service
When I open the Prometheus web UI and navigate to “Status -> Targets” I can see that my configuration works as expected
Executing a basic query for one of the metrics written by the sample application I get three resulting time series, one for each of my instances. The instance
label, that was added by the prometheus scrape job, contains the IP and port of the according service instance.
At this point, I have the metrics of all of my service instances gathered in the swarm-prometheus
. As a next step I want to get them into a Prometheus server running outside of the swarm (which I will refer to as host-prometheus
from here on).
Using federate to scrape the metrics from another Prometheus
Prometheus provides a /federate
endpoint that can be used to scrape selected sets of time series from another Prometheus instance (see documentation for details). The endpoint expects one or more instant vector selectors to specify the requested time series.
I want to call the /federate
endpoint of the swarm-prometheus
and query for all time series that are collected by my swarm-service
scrape job (I use curl
with -G
and --data-urlencode
options to be able to use the unencoded parameter values)
The only thing I have to do to get all this metrics into my host-prometheus
is to add an appropriate scrape job that requests that /federate
endpoint.
As I will run the host-prometheus
in Docker, connected to the same network as my swarm, I can just use the swarm-prometheus
service name as a host name. In a real-world environment I would probably have to find another way to access the swarm-prometheus
service, e.g. using the IP address of one of the docker swarm nodes together with the published port.
The activated honor_labels
flag ensures that Prometheus keeps the job
and instance
labels that are already included in the scraped metrics and does not overwrite them with its own values (see scrape_config
documentation for details).
After building and running the host-prometheus
I can check the targets status page again to see if the scrape job runs successfully
Now I can executed the same Prometheus query as before in my host-prometheus
web UI and I get the three resulting time series.
So, that was already it. Just by setting up an intermediate Prometheus instance within the docker swarm and combining a couple of existing features, it’s quite easy to get the metrics of all swarm service instances into a Prometheus server, even if it has to run outside the swarm.
Some fine tuning
After implementing the above setup in my current project I came up with some improvements that I think are worth sharing, too.
If you run several different Spring Boot services in your docker swarm, all listening on the default port 8080, setting up a dedicated swarm-prometheus
scrape job for each service is quite redundant. The only thing that would have to be changed for each service is the requested domain name (tasks.<service name>
). And, as you might have noticed, it’s possible to provide a list with multiple domain names in the dns_sd_configs
. So we can configure one scrape job that covers all existing services
But, doing this we might run into another problem. With the old configuration, with one scrape job per service, we were able to name the scrape jobs accordingly and use the job
label to identify/filter the metrics of the different services. Now, with one generic scrape job, we have to find another solution for that.
Fortunately, Micrometer, the library that we use in our Spring Boot application to provide the Prometheus metrics endpoint, can easily be configured to add custom labels to all written metrics. By adding the following line to the configuration file (e.g. application.properties
) of each of our Spring Boot services a label named service
with a static value containing the service name (here sample-service-1
) is added to all metrics written by our service.
Finally, if you use Grafana on top of Prometheus, the values of the instance
label, containing the IP address and port of the service instances (e.g. 10.0.1.3:8080
), will turn out to be problematic. If you want to use them as dashboard variables (e.g. to repeat panels for all instances or filter data for one concrete instance) this will not work because of the dots and colons in the values (those values will break the data requests to the underlying Prometheus because they are not URL encoded by Grafana). We have to convert them into a less problematic format to use them in this way. We can do this by adding a metric_relabel_configs
to the swarm-prometheus
scrape job config
This configuration takes all values of the source_labels
(here instance
), applies the given regex
to each value, replaces the value with the given replacement
expression (using the group variables ${1}
, ${2}
, … defined by the regex
), and writes the replaced value as the target_label
(here also instance
, so overwriting the original value) into the metrics. So, the old value 10.0.1.3:8080
will be converted into 10_0_1_3
which is less problematic for Grafana.
Update: Since Prometheus 2.20 there’s also a Docker Swarm service discovery available that might be used instead of the DNS service discovery described in this post. Thanks to Julien Pivotto for updating me about the new feature.
Header Photo by Jordan Harrison on Unsplash