Thursday, January 27, 2022

Java: Validating JSON against an AVRO schema

AVRO schemas are often used to serialize JSON data into a compact binary format, for example to transport it efficiently over Kafka. When you want to validate your JSON against an AVRO schema in Java, you will encounter a challenge: the JSON that the Apache AVRO libraries require for validation against an AVRO schema is not standard JSON. It requires explicit typing of fields, notably those declared as unions (such as nullable fields). Also, when validation fails, you get errors like "Expected start-union. Got VALUE_STRING" or "Expected start-union. Got VALUE_NUMBER_INT" without a specific object, line number or indication of what is expected. Especially during development, this is insufficient.
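To make the "explicit typing" concrete, here is a minimal sketch in plain Java (no AVRO dependency). It assumes a hypothetical record with two nullable fields, name declared as ["null", "string"] and age declared as ["null", "int"]. The class and method names are made up for illustration and are not part of the Apache AVRO API. In AVRO's JSON encoding, a non-null union value is written as an object with a single key naming the type, which is why plain JSON triggers the "Expected start-union" errors.

```java
/**
 * Illustrative helper (hypothetical, not part of Apache AVRO): shows how a
 * union-typed value looks in plain JSON versus AVRO's JSON encoding.
 */
final class AvroJsonUnionExample {

    /**
     * Wraps an already serialized JSON value in AVRO's union form:
     * {"<type>": <value>}. A null branch stays a bare JSON null.
     */
    static String wrapUnion(String avroType, String jsonValue) {
        if ("null".equals(avroType)) {
            return "null";
        }
        return "{\"" + avroType + "\":" + jsonValue + "}";
    }

    /** Plain JSON, as most producers would emit it. */
    static String plainJson() {
        return "{\"name\":\"Alice\",\"age\":42}";
    }

    /** The same record in the form AVRO's JSON decoding expects for union fields. */
    static String avroJson() {
        return "{\"name\":" + wrapUnion("string", "\"Alice\"")
                + ",\"age\":" + wrapUnion("int", "42") + "}";
    }

    public static void main(String[] args) {
        System.out.println("Plain JSON: " + plainJson());
        System.out.println("AVRO JSON:  " + avroJson());
    }
}
```

Feeding the plain form to AVRO's JSON decoding for such a schema is what produces messages like "Expected start-union. Got VALUE_STRING": the decoder expects the wrapping object and finds a bare string instead.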

In this blog post I'll describe a method (inspired by this) for checking your JSON against an AVRO schema and getting usable validation results. First you generate Java classes from your AVRO schema using the Apache AVRO Maven plugin (which is configured differently than documented). Next you serialize a JSON object against these classes using libraries from the Jackson project. During serialization, you will get clear exceptions. See my sample code here.

Monday, January 24, 2022

Generating random JSON data from an AVRO schema in Java

Recently I was designing an AVRO schema and wanted to test what data conforming to this schema would look like. I developed some Java code to generate sample data. This of course also has uses in more elaborate tests which require generation of random events. Because an AVRO schema is not that specific about the values it allows, this is mainly useful to get an idea of the structure of a JSON document which conforms to the definition. Here I'll describe a simple Java-based solution (Java 17, but it will also work on 11) to do this.

Monday, January 10, 2022

Apache NiFi: Forwarding HTTP headers

Apache NiFi can be used to expose various flavors of webservices. Using NiFi in such a way provides benefits like quick development using a GUI and of course data provenance: you know who called you with which data and where the data went. NiFi is very scalable, delivery can be guaranteed, and NiFi can help with features like back-pressure if a backend system cannot handle requests as quickly as they are offered. Exposing webservices using NiFi can have additional benefits such as service virtualization (decoupling). When exposing HTTP(S) webservices, a regular requirement is to pass through HTTP headers. This blog post is about how you can do that using the NiFi processors ListenHTTP, InvokeHTTP, HandleHttpRequest and HandleHttpResponse. I've used the environment which is described here.