Creating the Blueprint

The blueprint declares which streamlets join together to form our pipeline. Create blueprint.conf in src/main/blueprint to specify the connection as follows:

blueprint {
  streamlets {
    http-ingress = sensordata.SensorDataHttpIngress
    # file-ingress = sensordata.SensorDataFileIngress
    metrics = sensordata.SensorDataToMetrics
    validation = sensordata.MetricsValidation
    valid-logger = sensordata.ValidMetricLogger
    invalid-logger = sensordata.InvalidMetricLogger
  }

  topics {
    sensor-data {                      (1)
      producers = [http-ingress.out]   (2)
      consumers = [metrics.in]         (3)
    }
    metrics {
      producers = [metrics.out]
      consumers = [validation.in]
    }
    invalid-metrics {
      producers = [validation.invalid]
      consumers = [invalid-logger.in]
    }
    valid-metrics {
      producers = [validation.valid]
      consumers = [valid-logger.in]
    }
  }
}

Understanding the Blueprint File.

The blueprint file contains two sections: the streamlets section and the topics section.

The streamlets section

In the streamlets section of the blueprint, we declare the streamlets participating in the application definition. Each streamlet is declared by: name = <fully.qualified.name.of.the.implementing.class> where the streamlet name is a short descriptive name, which must be unique within the pipeline definition.

The topics section

In the topics section of the blueprint, we declare the Kafka topics used in this pipeline and the producers and consumers attached to it. Streamlet outlets are producers. Streamlet inlets are consumers. Consumers and producers of the same topic must use the same or compatible schemas.

Let’s take this definition as an example:

  topics {
    sensor-data {                      (1)
      producers = [http-ingress.out]   (2)
      consumers = [metrics.in]         (3)
    }
1 Declare a topic called sensor-data.
2 The streamlet http-ingress produce data to it through its .out outlet.
3 The streamlet metrics consumes from this topic through its .in inlet.

The Cloudflow build system validates that these connections are schema compatible.

Verify the Blueprint (optional)

Cloudflow allows you to verify the sanity of the blueprint before you use the application. The verification process checks for unconnected end points, schema compatibility between streamlets, invalid/non-existing streamlets, and other issues related to the structure and semantics of the application.

To verify the blueprint:

  1. From your project’s top-level directory, run sbt in interactive mode:

    sbt
  2. Enter the verify command:

    verifyBlueprint

    You should see output similar to the following:

    sbt:sensor-data> verifyBlueprint
    [info] Streamlet 'sensordata.SensorDataToMetrics' found
    [info] Streamlet 'sensordata.MetricsValidation' found
    [info] Streamlet 'sensordata.SensorDataHttpIngress' found
    [info] Streamlet 'sensordata.ValidMetricLogger' found
    [info] Streamlet 'sensordata.InvalidMetricLogger' found
    [success] /path/to/sensor-data/src/main/blueprint/blueprint.conf verified.

What’s next

With the streamlets implemented and the blueprint ready, let’s try running Wind Turbine in a development sandbox.