I can still remember how XML was everywhere at the beginning of my career in software development. It was used both for configuration files and for text-based exchange formats. Not least because SOAP, and also browsers via AJAX, relied on XML, using it was unavoidable. Even then, XML had a reputation for being verbose and, above all in combination with schemas, quite complicated. Over the years it was slowly displaced by JSON. Nowadays, at least in my work, XML is no longer relevant.
If, however, we want to generate or process JSON in Java, we quickly learn that the JDK itself does not offer a programming interface for this purpose. The main reason is that the JDK team's limited capacity should not be burdened with additional APIs. We therefore have to look elsewhere for a suitable library. As expected, a quick search reveals not just one result but a whole host of libraries to choose from.
This article focuses on the programming models of four different libraries for Java: org.json, Gson, and Jackson as well as JSON-P and JSON-B from the Jakarta EE world. At the end we will also take a quick look at the aspects of performance and security.
org.json
The org.json library has existed since the end of 2010 and was initially implemented by Douglas Crockford, the creator of JSON. One can therefore consider it the reference implementation for JSON in Java.
On balance it is an easy-to-use programming interface that at its core consists of the two classes JSONObject and JSONArray. These map the elements defined in the JSON specification that are not already covered by classes available in Java. These two classes, with their constructors and methods, are sufficient to generate JSON programmatically. Listing 1 shows the construction of a JSON object with various values. Because the put method returns the object it was called on, the calls can be chained into a compact declaration.
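A minimal sketch of such a chained construction, assuming org.json is on the classpath. The class name, field names, and values are made up for illustration:

```java
import org.json.JSONArray;
import org.json.JSONObject;

public class OrgJsonBuildExample {

    // Every put call returns the object it was invoked on,
    // which is what makes the chained style possible.
    static JSONObject build() {
        return new JSONObject()
            .put("name", "example")
            .put("age", 35)
            .put("active", true)
            .put("nothing", JSONObject.NULL) // explicit JSON null
            .put("tags", new JSONArray().put("java").put("json"))
            .put("address", new JSONObject().put("city", "Berlin"));
    }

    public static void main(String[] args) {
        // toString(2) pretty-prints with an indentation of two spaces.
        System.out.println(build().toString(2));
    }
}
```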
Parsing JSON is equally simple. Here we can pass an instance of JSONTokener to the constructor of JSONObject or JSONArray. The tokener in turn can be created from a String, Reader, or InputStream. Listing 2 shows how we can parse the JSON structure from Listing 1 from a string.
There are two ways to write JSON. The first is used when we already have a JSONObject or JSONArray. Here we use, as can be seen in Listing 3, the write method, to which we pass a Writer and optionally also an indentation factor. Alternatively, we can output JSON directly by means of JSONWriter without having to create objects first. In Listing 4 we write the already familiar structure directly to the standard output of our process.
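Both ways of writing can be sketched as follows. To keep the result easy to inspect, this example writes into a StringWriter rather than to standard output; the class and method names are assumptions for illustration:

```java
import java.io.StringWriter;
import org.json.JSONObject;
import org.json.JSONWriter;

public class OrgJsonWriteExample {

    // Variant 1: serialize an existing JSONObject with an indentation factor of 2.
    static String writeObject(JSONObject obj) {
        StringWriter out = new StringWriter();
        obj.write(out, 2, 0);
        return out.toString();
    }

    // Variant 2: emit JSON directly, without building JSONObject instances first.
    static String writeDirectly() {
        StringWriter out = new StringWriter();
        new JSONWriter(out)
            .object()
                .key("name").value("example")
                .key("tags").array().value("java").value("json").endArray()
            .endObject();
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeObject(new JSONObject().put("name", "example")));
        System.out.println(writeDirectly());
    }
}
```

The second variant keeps memory usage low because nothing but the current nesting state has to be held, at the price of having to open and close every object and array by hand.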
To work with a JSONObject or JSONArray in code, a range of methods is available to us. For example, we can use has or isNull to check whether a field exists and is not null. Note, however, that isNull also returns true for fields that do not exist.
To query individual field values, we can choose from an array of getXxx methods that return the value in the required Java data type. If we query a nonexistent field this way, a JSONException is thrown. Alternatively, we can use one of the optXxx methods. These do not throw exceptions but return default values instead. Listing 5 shows a few examples of the use of these methods.
In general the methods can be used as one would expect, but I was surprised in one or two places. For example, a query via getString throws an exception if the value is a number. The same field queried with optString, however, returns the number as a string. Conversely, both getInt and optInt parse the value into a number when the field contains a string. And the empty string chosen as the default value for optString takes some getting used to.
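The access methods and the surprises described above can be tried out with a small sketch like this one. The sample document and the helper method are made up for illustration, and the exact behavior may differ between org.json versions:

```java
import org.json.JSONException;
import org.json.JSONObject;

public class OrgJsonAccessExample {

    static final JSONObject OBJ =
        new JSONObject("{\"count\": 42, \"label\": \"7\", \"nothing\": null}");

    // getString on a numeric field throws; this helper reports that as a boolean.
    static boolean getStringFailsOnNumber() {
        try {
            OBJ.getString("count");
            return false;
        } catch (JSONException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(OBJ.has("count"));          // field exists
        System.out.println(OBJ.isNull("nothing"));     // explicit null
        System.out.println(OBJ.isNull("missing"));     // also true for absent fields
        System.out.println(OBJ.optString("missing"));  // empty string as default
        System.out.println(OBJ.optInt("missing", -1)); // caller-supplied default
        System.out.println(OBJ.optString("count"));    // number rendered as text
        System.out.println(OBJ.getInt("label"));       // string parsed into a number
        System.out.println(getStringFailsOnNumber());
    }
}
```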
Recently, JSON Pointer has also been supported for queries. Similarly to XPath for XML, it lets us extract values from an object or array with a single expression. Two examples of this can be seen in Listing 6.
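A short sketch of such pointer queries, assuming a reasonably recent org.json version that provides query and optQuery. The sample document is invented for this example:

```java
import org.json.JSONObject;

public class OrgJsonPointerExample {

    static final JSONObject OBJ = new JSONObject(
        "{\"address\":{\"city\":\"Berlin\"},\"tags\":[\"java\",\"json\"]}");

    // Evaluates a JSON Pointer expression against the sample document.
    static Object query(String pointer) {
        return OBJ.query(pointer);
    }

    public static void main(String[] args) {
        System.out.println(query("/address/city")); // nested object field
        System.out.println(query("/tags/1"));       // array element by index
        // optQuery returns null instead of failing when nothing can be resolved.
        System.out.println(OBJ.optQuery("/address/zip"));
    }
}
```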
Gson
Gson from Google has existed even longer than org.json, namely since 2008. Like org.json, Gson allows the reading, creation, and writing of “generic” JSON objects. The types from the JSON specification are mapped by the classes JsonObject, JsonArray, JsonPrimitive, and JsonNull, all of which extend JsonElement. In everyday use, however, JsonPrimitive and JsonNull are generally not relevant, as we can use the Java primitives and Gson only converts them into these classes internally.
When creating objects (see Listing 7), it is immediately apparent that we cannot chain the method calls here, as the methods add and addProperty have void as their return type.
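The statement-by-statement style that results from this can be sketched as follows, assuming Gson is on the classpath. The field names and values are illustrative only:

```java
import com.google.gson.JsonArray;
import com.google.gson.JsonNull;
import com.google.gson.JsonObject;

public class GsonBuildExample {

    // add and addProperty return void, so each call is its own statement.
    static JsonObject build() {
        JsonArray tags = new JsonArray();
        tags.add("java");
        tags.add("json");

        JsonObject address = new JsonObject();
        address.addProperty("city", "Berlin");

        JsonObject obj = new JsonObject();
        obj.addProperty("name", "example"); // Java values are wrapped internally
        obj.addProperty("age", 35);
        obj.addProperty("active", true);
        obj.add("nothing", JsonNull.INSTANCE);
        obj.add("tags", tags);
        obj.add("address", address);
        return obj;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```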
To parse existing JSON into a JsonElement, we use the class JsonParser. Gson also offers a streaming-based solution by means of JsonReader. With this approach we have to move from token to token ourselves. This option is especially advantageous for processing very large volumes of data that we don’t want to read into memory completely. Both options can be seen in Listing 8.
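A sketch of both approaches, assuming Gson 2.8.6 or later (for the static JsonParser.parseString method). The helper that collects the top-level field names is a hypothetical example of token-by-token processing:

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class GsonParseExample {

    // Tree model: the whole document is read into memory as a JsonElement.
    static JsonElement parseTree(String json) {
        return JsonParser.parseString(json);
    }

    // Streaming: we advance from token to token and keep only what we need.
    static List<String> topLevelNames(String json) {
        List<String> names = new ArrayList<>();
        try (JsonReader reader = new JsonReader(new StringReader(json))) {
            reader.beginObject();
            while (reader.hasNext()) {
                names.add(reader.nextName());
                reader.skipValue(); // skips scalars as well as nested structures
            }
            reader.endObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"example\",\"tags\":[\"java\",\"json\"]}";
        System.out.println(parseTree(json).getAsJsonObject().get("name").getAsString());
        System.out.println(topLevelNames(json));
    }
}
```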
To write JSON, we can either use JsonWriter to write JSON directly as we generate it or serialize our JsonElement by means of the class Gson.
Unlike org.json, Gson does not support JSON Pointer but offers data binding instead. Using data binding and the class Gson, we can bind JSON to existing Java objects and write them back out as JSON. Listing 9 shows how we first create an instance of our Test class from JSON and then write it out as JSON again.
To do this, Gson relies on the class having a default constructor and then uses reflection to find all fields of the class and its parent classes. At the time of writing, however, Gson does not yet offer support for the records introduced with JDK 16.
In addition, Gson lets us define our own mapping logic for types by means of TypeAdapter. Adapters for some common types are already included; by default, for example, java.net.URL can be used and is mapped to a JSON string. Using annotations, it is also possible to use a name in JSON that differs from the Java field name or to limit the binding to specific fields.
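A minimal data binding sketch with a renamed field. The Person class is a hypothetical model invented for this example and is not the Test class from the listing:

```java
import com.google.gson.Gson;
import com.google.gson.annotations.SerializedName;

public class GsonBindingExample {

    // Hypothetical model class; Gson fills the fields via reflection.
    static class Person {
        @SerializedName("full_name") // JSON name differs from the Java field name
        String name;
        int age;

        Person() {} // default constructor used by Gson
    }

    static final Gson GSON = new Gson();

    static Person read(String json) {
        return GSON.fromJson(json, Person.class);
    }

    static String write(Person person) {
        return GSON.toJson(person);
    }

    public static void main(String[] args) {
        Person person = read("{\"full_name\":\"example\",\"age\":35}");
        System.out.println(person.name + " / " + person.age);
        System.out.println(write(person));
    }
}
```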
Gson can also support us with the evolution of our data formats through built-in support for versioning. We annotate fields or classes with @Since and/or @Until and give them a version. When creating the Gson instance we can then state which version it supports. When reading or writing JSON, Gson then only considers the fields that are supported in the specified version.
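The versioning mechanism might be used roughly like this. The Account class, its fields, and the version numbers are assumptions chosen for illustration:

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.annotations.Since;
import com.google.gson.annotations.Until;

public class GsonVersioningExample {

    // Hypothetical model whose shape changed between format versions.
    static class Account {
        String name = "example";
        @Since(1.1) String email = "user@example.org"; // added in version 1.1
        @Until(1.1) String legacyId = "42";            // dropped as of version 1.1
    }

    // Serializes the same object as seen by a given format version.
    static String writeForVersion(double version) {
        Gson gson = new GsonBuilder().setVersion(version).create();
        return gson.toJson(new Account());
    }

    public static void main(String[] args) {
        System.out.println(writeForVersion(1.0)); // old format: contains legacyId
        System.out.println(writeForVersion(2.0)); // new format: contains email
    }
}
```

A Gson instance created without setVersion ignores the annotations and handles all fields.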
Jackson
Jackson, like Gson, has existed since 2008. As far as I can tell, it is currently the most commonly used library for processing JSON with Java. This is mainly because it is the default in Spring Boot.
Although at its core Jackson has a streaming-based programming interface, including an implementation for JSON, this is generally not used directly. Instead, Jackson stands out for its comprehensive and configurable data binding. Listing 10 shows only a snippet of the possibilities.
As can be seen in the listing, Jackson already supports records and can also deal with inheritance. To influence the mapping, a host of annotations is available to us. Alongside @JsonProperty shown in the listing, there is for example also @JsonFormat, used to specify the format for a date. With the @JsonView annotation we furthermore have the option, when writing, of excluding different fields of the object depending on the use case without having to create new classes for the serialization of each combination.
Alongside the configuration of the mapping, Jackson itself can be configured in many ways. For example, we can specify whether fields with a null value are written or omitted, or we can state that, for lists in the Java model, the JSON may also contain just a single object or value. Moreover, we can add our own data types to Jackson.
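A sketch of record-based data binding together with the two configuration options just mentioned, assuming Jackson 2.12 or later and Java 16 or later. The Person record and the sample documents are made up for this example:

```java
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

public class JacksonBindingExample {

    // Hypothetical model; @JsonProperty renames the field in JSON.
    public record Person(@JsonProperty("full_name") String name,
                         Integer age,
                         List<String> tags) {}

    static final ObjectMapper MAPPER = new ObjectMapper()
        // omit fields whose value is null when writing
        .setSerializationInclusion(JsonInclude.Include.NON_NULL)
        // accept a single value where the Java model expects a list
        .enable(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY);

    static Person read(String json) {
        try {
            return MAPPER.readValue(json, Person.class);
        } catch (JsonProcessingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static String write(Person person) {
        try {
            return MAPPER.writeValueAsString(person);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person person = read("{\"full_name\":\"example\",\"tags\":\"java\"}");
        System.out.println(person);
        System.out.println(write(person));
    }
}
```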
In addition to the core of Jackson there is also a host of additional modules, which broadly speaking fall into two groups. On the one hand there are ready-made modules that provide support for additional data types, such as Eclipse Collections or Joda-Time. On the other hand there are modules that deal above all with different data formats. As Jackson at its core is generic and independent of concrete formats, we can use it for the data binding of formats such as XML, YAML, or Protobuf.
JSON-P and JSON-B
It goes without saying that the world of Jakarta EE also offers support for JSON. Here there is, as usual, a specification that can then be implemented by different libraries.
Similarly to Jackson, the JSON support is separated into two parts. With JSON-P there is a specification that only concerns itself with the reading and writing of JSON. Building on this, JSON-B offers support for data binding. Working with JSON-P (see Listing 11) is reminiscent of org.json and Gson. Even JSON Pointer is supported (see Listing 12), as in org.json.
A particular feature of JSON-P is that it also provides support for JSON Patch. JSON Patch allows us to define operations in JSON that can then be applied to a JSON object in order to change it. Listing 13 shows how we can create such a patch using a builder in Java and apply it to an object.
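Building, querying, and patching with JSON-P can be sketched as follows. This assumes the jakarta.json API (Jakarta EE 9 or later; older versions use the javax.json package) plus an implementation such as Eclipse Parsson on the classpath. The document content is invented for illustration:

```java
import jakarta.json.Json;
import jakarta.json.JsonObject;
import jakarta.json.JsonPatch;
import jakarta.json.JsonString;

public class JsonpPatchExample {

    // Builds the document with the builder API.
    static JsonObject original() {
        return Json.createObjectBuilder()
            .add("name", "example")
            .add("tags", Json.createArrayBuilder().add("java").add("json"))
            .build();
    }

    // Reads a value with JSON Pointer.
    static String firstTag(JsonObject obj) {
        return ((JsonString) Json.createPointer("/tags/0").getValue(obj)).getString();
    }

    // Creates a patch with three operations and applies it.
    static JsonObject patched() {
        JsonPatch patch = Json.createPatchBuilder()
            .replace("/name", "changed")
            .add("/age", 35)
            .remove("/tags")
            .build();
        return patch.apply(original());
    }

    public static void main(String[] args) {
        System.out.println(firstTag(original()));
        System.out.println(patched());
    }
}
```

JSON-P objects are immutable, so applying the patch yields a new object and leaves the original untouched.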
Data binding with JSON-B works much as it does in Jackson. We use our own classes and can use annotations to adjust individual aspects such as field names. We of course also have the option here of adding adapters for our own types.
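A minimal JSON-B round trip might look like this, assuming the jakarta.json.bind API and an implementation such as Yasson on the classpath. The Person class is a hypothetical model:

```java
import jakarta.json.bind.Jsonb;
import jakarta.json.bind.JsonbBuilder;
import jakarta.json.bind.annotation.JsonbProperty;

public class JsonbBindingExample {

    // Hypothetical model; public fields and a public no-arg constructor
    // keep the example within JSON-B's default mapping rules.
    public static class Person {
        @JsonbProperty("full_name") // JSON name differs from the Java field name
        public String name;
        public int age;
    }

    // Reads JSON into a Person and writes it back out again.
    static String roundTrip(String json) {
        try (Jsonb jsonb = JsonbBuilder.create()) {
            Person person = jsonb.fromJson(json, Person.class);
            return jsonb.toJson(person);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("{\"full_name\":\"example\",\"age\":35}"));
    }
}
```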
As with Gson, records are currently not supported by JSON-B by default. But with only a few simple steps we can ensure that they can in fact be used.
Performance
Alongside the programming model and interface of a library, the performance is also often important for the processing of JSON. We must of course gauge for ourselves whether the achieved performance of a library is sufficient for our specific problem or not.
If we don’t want to measure this ourselves, the Java JSON Benchmark can provide us with an initial impression. Already at first glance it can be seen that big differences can occur depending on the choice of test data and whether we are looking at reading or writing.
In this benchmark, Jackson has a slight edge over the other three candidates, even without adding the Jackson Afterburner module to increase performance. The JSON-B reference implementation Yasson, in contrast, performs significantly better at writing than at reading.
Of the two significantly smaller libraries org.json and Gson, Gson has a small advantage. But both lie significantly behind Jackson, at least in this benchmark, which surprised me.
By some margin the fastest library in this benchmark, dsljson, relies, in contrast to our four libraries, on code generation by means of Java annotation processing. Not having to use reflection makes a big difference at runtime.
Security
The topic of security is particularly important for the reading of JSON. We mostly use the JSON libraries for processing incoming data, for example in an HTTP API. As we generally do not have complete control over the clients here, there is a high risk of attack.
One possible attack vector is denial of service. Very large or deeply nested JSON documents can make parsing take so long or need so much memory that the application can no longer respond.
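To make the idea of limiting size and nesting concrete, here is a library-independent sketch of a pre-check that could run before the actual parser. It is not a feature of any of the libraries discussed, the limits are arbitrary, and it does not validate the JSON itself:

```java
public class JsonDepthGuard {

    // Returns the maximum nesting depth of objects and arrays in the text.
    // Brackets inside string literals are ignored; this is NOT a validating parser.
    static int maxDepth(String json) {
        int depth = 0;
        int max = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                max = Math.max(max, ++depth);
            } else if (c == '}' || c == ']') {
                depth--;
            }
        }
        return max;
    }

    // Rejects input that is too long or too deeply nested.
    static boolean isAcceptable(String json, int maxLength, int maxAllowedDepth) {
        return json.length() <= maxLength && maxDepth(json) <= maxAllowedDepth;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("{\"a\":[1,[2]]}", 1_000, 10));
        System.out.println(isAcceptable("[[[[[[]]]]]]", 1_000, 3));
    }
}
```

In practice, limits offered by the chosen library or by the HTTP server in front of it are preferable where they exist; a hand-written check like this is only a fallback.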
The other possibility involves attempting to force a subclass during data binding with inheritance for an object. This subclass is then used to execute malicious code. The article by Brian Vermeer provides a better understanding of this type of attack.
For both of these attack vectors we have to get to grips not only with the libraries themselves but also their selected configurations. If for example for reading we never use objects with inheritance, we can turn off the respective feature and thus effectively preclude the second attack vector.
In addition to secure configuration, which all of the libraries considered in this article should have by default, it is also important to regularly update the libraries to the latest version.