More than once I’ve been bitten by differing versions of libxml between my local machine and production. A lot of hours are gone forever – spent on finding the differences between the machine of a coworker and mine. That’s why I’m now using Docker and Docker Compose when working on Ruby on Rails apps. I still hope that using Docker reduces those problems. But sometimes it doesn’t.
An internal application I’m currently working on communicates with an LDAP
server to authenticate users. It uses the OmniAuth gem for that. I recently
updated the app to Rails 6 and after some minor tweaks, everything worked fine
in my local Docker container. Jubilating, I merged the changes into master.
Tiny robots swarmed round, building the Docker image and running tests on
GitLab CI. After checking that all tests were green, they pushed the Docker
image they had built
to production. And suddenly, logging in didn’t work anymore. In the logs,
OmniAuth told me that it could not talk to the LDAP server.
We use the same LDAP server both locally and in production. Assuming the bug
was caused by my update to Rails 6, I wasted hours debugging OmniAuth and
Rails 6. But why did this only occur in production? Locally, I could still log
in and out as usual. Finally I thought: Maybe, just maybe, Docker was doing
something different on my machine. I ran docker-compose build but still could
not reproduce the bug locally. Then I ran docker-compose build --pull, which
fetches the current version of the base image instead of reusing the locally
cached one, and finally the bug occurred locally.
So I tried to contact the LDAP server from the Docker container with
openssl s_client -connect $SERVER. OpenSSL could not negotiate a cipher. Why
was that happening? The LDAP server is pretty old and only supports TLS 1.1.
The Docker base image I used is based on Debian, and starting with Debian 10,
its OpenSSL defaults only allow TLS 1.2 and newer. So I had finally found the
bug. But how could I have prevented this from happening?
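To confirm a mismatch like this, it can help to force individual protocol versions and see which handshakes succeed. The following is a minimal sketch, not what I ran at the time: the function name probe_tls is made up for illustration, and it assumes that the installed openssl accepts the -tls1_1 and -tls1_2 options. A failed handshake can be caused by either side, so it is worth running the probe from both the old and the new image.

```shell
# Try a TLS handshake with each protocol version and report the result.
# probe_tls is a hypothetical helper; pass the server as HOST:PORT.
probe_tls() {
  target="$1"
  for flag in tls1_1 tls1_2; do
    # Closing stdin makes s_client exit right after the handshake.
    if openssl s_client -connect "$target" "-$flag" </dev/null >/dev/null 2>&1; then
      echo "$flag: handshake ok"
    else
      echo "$flag: handshake failed"
    fi
  done
}
```

Inside the container this would be called as probe_tls "$SERVER".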
I asked my colleagues for help and learned that a Docker tag (like 2.6.3 in
my case) is more like a Git branch than a Git tag: the maintainers can move it
to a new image at any time. Using 2.6.3 as my tag only means that I get some
build of Ruby 2.6.3 on some version of Debian. Including the version of Debian
in my tag (ruby:2.6.3-stretch) would have prevented the update from Debian 9
(codename “stretch”) to Debian 10 (codename “buster”). So after changing the
base image to ruby:2.6.3-stretch, logging in worked again in production. As
soon as the LDAP server is updated, I can change it to ruby:2.6.3-buster.
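A check along these lines could run in CI to catch base images whose tag does not name the Debian release. This is only a sketch under assumptions: the function name unpinned_from_lines is hypothetical, the codename list covers just the two releases mentioned here, and it expects tags that end in the codename, as ruby:2.6.3-stretch does.

```shell
# Print every FROM line whose image tag does not end in a Debian codename.
# unpinned_from_lines is a hypothetical helper; extend the codename list
# (an assumption) for other releases.
unpinned_from_lines() {
  grep -E '^FROM[[:space:]]' "$1" |
    grep -Ev -- '-(stretch|buster)([[:space:]]|$)'
}
```

Calling unpinned_from_lines Dockerfile prints nothing when every FROM line is pinned, so a CI job could fail whenever the output is non-empty.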
Encoded in my tag I now have the exact version of Ruby and the major version of Debian. This will not prevent all breaking changes, though (for example, the image maintainers could stop installing an APT package they previously included). To improve that, I could take this further:
- I could use Debian as my base image and install all required software (including Ruby) myself.
- I could publish my own base image based on Debian and then use that as my base image.
I decided against both solutions, as installing a specific version of Ruby is
traditionally not exactly a pleasure[1]. For now, I will stick with
ruby:2.6.3-stretch and hope that there will be no breaking changes on that
tag.
This problem reminded me that whenever we work in an unfamiliar ecosystem, we
need to learn the semantics of its versioning first. In the case of a Docker
image, we need to make sure we understand what the tag of our image describes.
Putting the version of each installed dependency into the tag could get
cumbersome pretty quickly. An alternative way of publishing the Ruby image
would be to publish a ruby2.6.3 image and use the tags for semantic
versioning (with an OS update being a breaking change).
Thanks to Joachim Praetorius, Lars Hupel, Martin Kühl, Niko Will, Michael Schürig, Martin Eigenbrodt and Bascht for their help and advice.
[1] At the time of writing, Ruby's stable release is 2.6.5 while Debian Buster is still at 2.5.5 (which is not even the current version of the 2.5 branch). Installing a current version of Ruby therefore means compiling it manually or using one of the Ruby installers. This is of course not a Ruby-specific problem; Node.js is in a similar situation.