Let’s start with some requirements:

If some of these requirements sound familiar because source code needs to be maintained the same way, good news: you will recognize a few of our proposals.

Let’s visualize the situation:

Fig. 1: Authors maintain documents
Fig. 2: Document releases

What kind of documents?

We (Ben and Gernot) are (co-)authors and maintainers of a few documents, for example an extensive glossary of software architecture terminology [1] and a number of technical curricula [2].

We maintain these documents (together with a group of additional authors) in English and German. Our problem is that we write and speak only these two languages, but you will see below that additional languages can be easily integrated.

Collaboration first

As software developers, you will have experienced the numerous advantages of professional version control, namely Git. Combined with services like GitLab or GitHub, you get a rock-solid and proven platform for collaboration, including pull/merge requests (in our case: document reviews and approvals).

Therefore, we maintain our documents on such a Git platform.

Pull and merge requests require that differences between documents can be determined automatically, so the technical format for documents needs to be plain text. A number of such formats are used in practice (see our explanatory box below). Several of these lack the Babylonian features we require to process several languages automatically, which is why we decided to use AsciiDoc. AsciiDoc is open-source and provides several powerful features that will come in handy later on.


AsciiDoc HelloWorld

== Hello Asciidoc(uments)
A paragraph. Lorem ipsum.

=== Heading
Of course, all kinds of text formatting are easy:

* Formats like *bold*, _italics_
* Links are simple, too: https://innoq.com[hello, INNOQ].

.Code-Highlighting
[source,groovy]
10.times { println "Hello, AsciiDoc!" }

Using the AsciiDoc processor (either in your favourite shell or wrapped in a build script), you get the following output from the text above:

hello-asciidoc-screenshot

We compiled the AsciiDoc with Gradle, using the following simple build file:

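plugins {
  // Asciidoctor Gradle plugin, provides the AsciiDoc conversion
  id "org.asciidoctor.jvm.convert" version "3.3.2"
}

repositories {
  // the plugin's dependencies are resolved from Maven Central
  mavenCentral()
}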

Split Documents into Parts

Now that we know how to create a document, let’s prepare for more complicated stuff. First, we should modularize our document and split it into distinct parts. It’s like creating a larger software system from distinct components or modules, but for AsciiDoc documents. Luckily, AsciiDoc comes with a highly practical feature called include, which allows for modularization of documents, as the following diagram shows:

Fig. 3: Document made up from distinct parts

Of course, these include directives may contain path or directory information so that you can organize your files in adequate ways.
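
A master document assembled from parts could look like the following minimal sketch. All file and directory names are hypothetical and only illustrate the layout:

```asciidoc
= Sample Curriculum
:toc:

// Each part lives in its own file; all file names here are hypothetical.
include::chapters/01-introduction.adoc[]

include::chapters/02-learning-goals.adoc[]

// leveloffset shifts the heading levels of the included file down by one,
// so a part written as a standalone document fits in as a subsection.
include::appendix/glossary.adoc[leveloffset=+1]
```

The blank lines between the include directives keep the last block of one file from running into the first block of the next.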

Hey Babylon: Multiple Languages

For multiple languages, you have two different options to organize your content (explained in Fig. 4 for EN and DE, English and German):

Fig. 4: Options to organize different languages

Let’s consider an important text passage in both English and German (we took the liberty of using the introductory paragraph of the Agile Manifesto):

We are uncovering better ways of developing
software by doing it and helping others do it.

Wir erschließen bessere Wege, Software zu entwickeln,
indem wir es selbst tun und anderen dabei helfen.

We have the two language versions next to each other, but we need to create an English-only output, without the German stuff in it.

Excursion: The C Preprocessor

A few old-generation developers might remember the days of the C programming language. Programs sometimes contained nerdy statements like the following:

#ifdef WINDOWS
  #include "stdafx.h"
#else
  #include <stdlib.h>
  #include <string.h>
  #include <stdio.h>
#endif

In C or C++, these conditional includes are quite common. Sometimes, even the behavior of the compiler is controlled via such directives. We tell you this for a reason, just read on.

But We Are Writing Documents, Not C?

If we had a similar directive, a kind of conditional compilation, for our documents, then we could for example write #ifdef ENGLISH #include page-1-EN.adoc, and omit the other languages for a moment.

The AsciiDoc processors have learned their lessons from history and come with a conditional include on steroids: you can include specific parts of a file, for example just the English parts. Such include statements can even be written with variables, and these variables can be set during the build process. Wow!

Fig. 5: An Overview: One build per language

AsciiDoc performs this magic by using tags, explicitly marked parts of a document. Here is a simple example:

// tag::EN[]
=== Delicious Food
* Cheese
* Cucumber

// end::EN[]

// tag::DE[]
=== Leckeres Essen
* Käse
* Gurke

// end::DE[]

We can then tell AsciiDoc to pass the tag for EN when including the file. See the following image.

Fig. 6: Include only certain parts
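
In text form, such a tagged include might look like the following sketch. The file name `food.adoc` is an assumption for the tagged example above, and the attribute name `language` is borrowed from the conditions shown later in this article:

```asciidoc
// Assumption: the tagged example above is stored as food.adoc.
// Fall back to English when the build does not set the attribute.
ifndef::language[:language: EN]

// Includes only the region between tag::EN[] and end::EN[],
// or the DE region when the build sets language to DE.
include::food.adoc[tag={language}]
```

With the Asciidoctor command line, the attribute can be set per run via `-a language=DE`. Build tools pass it through their own attribute configuration.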

Now our build script needs to iterate over all the desired output languages, call the AsciiDoc transformer, and create a distinct output for each one. The common build tools like Gradle, Maven, or make have their specific mechanisms; a detailed explanation would exceed the scope of this article. The structure of such a build script (in Gradle) looks as follows:

// some lines left out for simplicity
task renderEN(type: RenderDocumentTask, constructorArgs: [docFileName, "EN"]) {
    doLast {
        addSuffixToDocument("-en")
    }
}

task renderDE(type: RenderDocumentTask, constructorArgs: [docFileName, "DE"]) {
    doLast {
        addSuffixToDocument("-de")
    }
}

task buildDocs {
    group 'Documentation'
    description 'Build target ("task") for generating output in all languages '
    dependsOn "renderDE", "renderEN"
}

You find a specific task definition per language (here: EN and DE), where the generic RenderDocumentTask gets called with the filename and the language as parameters. The heavy lifting of AsciiDoc conversion is done by the Asciidoctor Gradle plugin.

More Conditions

AsciiDoc offers additional options to include conditions in your documents: you can use ifeval:: or the plain old ifdef::.

ifeval::["{language}" == "DE"]
:curriculum-header-title: iSAQB-Curriculum für Foundation Level
endif::[]

ifdef::debug_adoc[]
This text is only rendered if the variable `debug_adoc` is set
endif::debug_adoc[]

But let’s have a look at a more realistic example.

Configuring the output

When we started with this toolchain, we knew that we had to find a way to create either a PDF file or an HTML representation of our documents. Fortunately, AsciiDoc allows us to do both.

PDF files

Asciidoctor allows you to create a PDF theme that configures the output: a cover image, the position of elements on the pages, background images, and more. You can even use variables in the theme file, which in our case are filled with language-dependent text, like the date in the footer (you can have a look at our PDF theme here). All you need to do is tell the Asciidoctor task where to look for the theme. Let’s have a look at our Gradle task to generate the PDF.

class RenderCurriculumTask extends AsciidoctorTask {
    @Inject
    RenderCurriculumTask(WorkerExecutor we, String curriculumFileName, String versionDate, String language, boolean withRemarks) {
        super(we)

        ...

        outputDir = new File("./build/")
        outputOptions {
            separateOutputDirs = false
            backends 'pdf', 'html5'             // add 'pdf' as backend
        }

        attributes = [
                ...
                'curriculumFileName': curriculumFileName,
                'pdf-stylesdir'     : '../pdf-theme/themes',           // This is required
                'pdf-fontsdir'      : '../pdf-theme/fonts',            // This is required
                'pdf-style'         : 'isaqb',                         // This is required
                'stylesheet'        : '../html-theme/adoc-github.css',
                'stylesheet-dir'    : '../html-theme'
        ]
    }
}

We removed everything from the task that is not relevant for the PDF creation (you can check the full file here). You have to enable pdf as backend (line 11) and then set the name of the theme (pdf-style), the directory where to look for the fonts that are used (pdf-fontsdir), and the directory where to look for the theme (pdf-stylesdir). Why are there two more lines that don’t seem to be related to PDF? Well, glad you asked!

HTML files?

The two additional lines you see in the code snippet above can be used to also style the HTML output. Asciidoctor has a default theme that is used for HTML output. If you want to adjust the result, all you have to do is provide a CSS file that contains all the magic you want for your result. Enable html as backend and tell Asciidoctor where to find the stylesheet (stylesheet) and where to look for images or fonts that might be referenced in the stylesheet (stylesheet-dir). You can check one of our examples below to see the PDF and HTML results.

Ok, that’s fine for a single project, but the Advanced Level has more than ten curricula, so we would have to copy the themes to each project. If we adjusted the PDF theme in one repository, how can we make sure that all other curricula also benefit from the changes?

A Family of Similar Documents

To define both the HTML theme and the PDF theme only once, we moved them to separate repositories. These repositories are then linked into each curriculum repository as Git submodules. This offers several advantages.

We also identified the copyright of each curriculum as a candidate for a separate submodule. It changes every year (the current year is added), and that change would otherwise have to be made in each repository. Extracting the copyright file into a submodule allows us to change just one single file. Everyone who updates their curriculum also updates the submodule to the latest revision, and that’s it.
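
A curriculum could then pull in the shared text with a regular include. This is only a sketch: the submodule directory `copyright/` and the file name `copyright.adoc` are hypothetical, and we assume the shared file marks its language variants with the same EN and DE tags used earlier in this article:

```asciidoc
// Hypothetical path: copyright/ is where the Git submodule is checked out.
// The tag selects the language variant for the current build.
include::copyright/copyright.adoc[tag={language}]
```

Since the include path points into the submodule, updating the submodule revision is all it takes to pick up the new text.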

Real World Examples

The Curriculum for Software Architecture, iSAQB CPSA-F®

Worldwide, courses and classes in software architecture are taught based upon the iSAQB Software Architecture Foundation curriculum, guiding thousands of developers towards their “Professional for Software Architecture” certification, CPSA-F. Therefore, the iSAQB needs to provide versions in different languages, both in HTML and PDF formats. This curriculum consists of approximately 40 learning goals (LGs) in 5 parts, resulting in about 30 pages. Every two years, the iSAQB releases an updated version of the curriculum, based upon new ideas and input from the international software architecture community.

We (Ben and Gernot) belong to the core maintainers' group of this document.

Let’s dissect its structure:

This allows us to change and review each learning goal without conflicting with other learning goals of the document. We keep both the English and the German version of a learning goal in one file, so if one language is changed, the other one is less likely to be forgotten.

For translations into other languages, we added the possibility to upload PDF files to the repository, which are then added to the next release automatically.

The Curricula of the iSAQB Advanced Level CPSA-A®

We use the same template for each Advanced Level module that we described in the previous example. This ensures a clear and overarching design and structure of the documents, so that participants can navigate through the different modules with ease, always knowing where to find what. Updating the formatting is no real effort, since this is done via the submodules. Only changes to the build environment or GitHub Actions require manual adjustments in each repository.

A Large Glossary

We maintain a glossary of software architecture terminology (available for free from the iSAQB), with close to a dozen authors. A few parts of this document change quite frequently (new terms are added, explanations are updated), others are highly stable (e.g. the introduction, copyright notice and authors' biographies).

We maintained this glossary on GitHub before, but we had to manually create a PDF and upload it to Leanpub. The current approach with AsciiDoc and our build pipeline allows us to create a new release by creating a new Git tag and pushing it to GitHub. That’s it.

Summary

You can maintain multilingual documents with a pragmatic, simple, and free (as in open-source) toolchain that is developer-friendly and proven in practice. Business and other non-IT people might miss their favorite word processing tool, but the benefit of multiple languages organized along the principle “one fact, one place” will help you in the long run. Until then: may the power of expressive wording be with you.

Bibliography

  1. iSAQB Glossary of Software Architecture Terminology, available in the following formats: HTML (EN), PDF (EN), Leanpub

  2. iSAQB public documents