Define Avro schema

Let’s start by building the Avro schemas for the domain objects that the application needs. These schema files have the extension .avsc and go directly under src/main/avro in the project structure that we discussed earlier.

In the Wind Turbine example, we will use the following domain objects:

  • SensorData: The data that we receive from the source and ingest through our ingress.

  • Measurements: A substructure of SensorData that abstracts the values of interest, such as wind speed.

  • Metric: A domain object that identifies the reported values we would like to track and measure.

  • InvalidMetric: A domain object that pairs an erroneous metric with a description of the error.

During the build process, the Cloudflow plugin system processes the schemas. For each of these schema files, Cloudflow will generate Scala case classes that can be directly used from within the application.
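
To give a rough idea of what the generated code looks like, the following hand-written sketch approximates the case classes for SensorData and Measurements, using the fields from the schema files defined in the steps below. It is not actual generator output: the real classes also carry Avro serialization support, and the mapping of the uuid and timestamp-millis logical types to java.util.UUID and java.time.Instant is an assumption. The object name SensorDataExample and the sample values are made up for illustration.

```scala
import java.time.Instant
import java.util.UUID

// Hand-written approximations of the generated classes; the real ones
// additionally include Avro serialization support.
final case class Measurements(power: Double, rotorSpeed: Double, windSpeed: Double)

final case class SensorData(
    deviceId: UUID,     // assumed mapping of string + logicalType "uuid"
    timestamp: Instant, // assumed mapping of long + logicalType "timestamp-millis"
    measurements: Measurements
)

object SensorDataExample {
  // One reading with made-up values, built like any other case class.
  val reading: SensorData = SensorData(
    deviceId = UUID.fromString("c75cb448-df0e-4692-8e06-0321b7703992"),
    timestamp = Instant.ofEpochMilli(1495545346279L),
    measurements = Measurements(power = 1.7, rotorSpeed = 3.9, windSpeed = 105.9)
  )
}
```

Because these are ordinary case classes, application code can construct, copy, and pattern match on them without touching Avro directly.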

Full sources for all Cloudflow example applications can be found in the examples folder of the cloudflow project on GitHub. The sources for the example described below can be found in the application called sensor-data-scala.

Creating the schema files

To create the Avro schemas for the domain objects, follow these steps:

  1. Create a SensorData.avsc file and save it in the src/main/avro subdirectory of the example project. Use the following definition:

    {
        "namespace": "sensordata",
        "type": "record",
        "name": "SensorData",
        "fields":[
             {
                "name": "deviceId",
                "type": {
                    "type": "string",
                    "logicalType": "uuid"
                }
             },
             {
                "name": "timestamp",
                "type": {
                    "type": "long",
                    "logicalType": "timestamp-millis"
                }
             },
             {
                "name": "measurements", "type": "sensordata.Measurements"
             }
        ]
    }

  2. Create a Measurements.avsc file and save it in the src/main/avro subdirectory of the example project. Use the following definition:

    {
        "namespace": "sensordata",
        "type": "record",
        "name": "Measurements",
        "fields":[
             {
                "name": "power", "type": "double"
             },
             {
                "name": "rotorSpeed", "type": "double"
             },
             {
                "name": "windSpeed", "type": "double"
             }
        ]
    }

  3. Create a Metric.avsc file and save it in the src/main/avro subdirectory of the example project. Use the following definition:

    {
        "namespace": "sensordata",
        "type": "record",
        "name": "Metric",
        "fields":[
             {
                "name": "deviceId",
                "type": {
                    "type": "string",
                    "logicalType": "uuid"
                }
             },
             {
                "name": "timestamp",
                "type": {
                    "type": "long",
                    "logicalType": "timestamp-millis"
                }
             },
             {
                "name": "name", "type": "string"
             },
             {
                "name": "value", "type": "double"
             }
        ]
    }

  4. Create an InvalidMetric.avsc file and save it in the src/main/avro subdirectory of the example project. Use the following definition:

    {
        "namespace": "sensordata",
        "type": "record",
        "name": "InvalidMetric",
        "fields":[
             {
                "name": "metric", "type": "sensordata.Metric"
             },
             {
                "name": "error", "type": "string"
             }
        ]
    }
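
The relationship between Metric and InvalidMetric can be illustrated with a minimal sketch. The case classes below are hand-written stand-ins for the generated ones (again assuming that the logical types map to java.util.UUID and java.time.Instant), and the object MetricValidationSketch with its non-negative rule is a hypothetical example, not the validation used by the sample application.

```scala
import java.time.Instant
import java.util.UUID

// Hand-written stand-ins for the classes generated from Metric.avsc
// and InvalidMetric.avsc.
final case class Metric(deviceId: UUID, timestamp: Instant, name: String, value: Double)
final case class InvalidMetric(metric: Metric, error: String)

object MetricValidationSketch {
  // Hypothetical rule: a metric value must not be negative.
  // Invalid metrics are wrapped together with a description of the problem.
  def validate(metric: Metric): Either[InvalidMetric, Metric] =
    if (metric.value < 0)
      Left(InvalidMetric(metric, s"${metric.name} must not be negative"))
    else
      Right(metric)
}
```

Wrapping the original Metric inside InvalidMetric keeps the offending record intact, so a downstream consumer can log or inspect it alongside the error message.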