Cloudflow Core Concepts
Cloudflow allows you to quickly build and deploy distributed stream processing applications by breaking them into smaller stream processing units called Streamlets. Each Streamlet represents an independent stream processing component that implements a self-contained stage of the application logic. Streamlets let you break down your application into logical pieces that communicate with each other in a streaming fashion to accomplish an end-to-end goal. Streamlets can be composed into larger systems using blueprints, which specify how Streamlets can be connected together to form a topology.
In this document, we give you an overview of the main building blocks of a Cloudflow application.
We start with an overview of the Streamlet concept.
Streamlets
Streamlets are the core building blocks of a Cloudflow application.
Each Streamlet represents an independent stream processing component that implements a self-contained stage of the application logic.
The lightweight Streamlet API exposes the raw power of the underlying runtime and its libraries while providing a higher-level abstraction for composing Streamlets and expressing data schemas.
You write your code using the familiar native API of Structured Streaming, Flink, or Akka Streams.
Streamlets declare inlets and outlets to define the data they consume or produce.
Inlets and outlets are schema-driven, ensuring that data flows are always consistent and that connections between Streamlets are compatible.
The data sent between Streamlets is safely persisted in the underlying pub-sub system, allowing for independent lifecycle management of the different components.
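To make the schema-driven compatibility rule concrete, here is a minimal sketch in Python. It is a library-independent toy model and does not use the Cloudflow API: the names `Port` and `can_connect` are hypothetical, and a schema is reduced to a plain string identifier for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """A named inlet or outlet that carries records of one schema (toy model)."""
    name: str
    schema: str  # hypothetical schema identifier, not a real schema definition


def can_connect(outlet: Port, inlet: Port) -> bool:
    """An outlet may feed an inlet only when both declare the same schema."""
    return outlet.schema == inlet.schema


sensor_out = Port("out", "SensorData")
metrics_in = Port("in", "SensorData")
report_in = Port("in", "Report")

print(can_connect(sensor_out, metrics_in))  # True
print(can_connect(sensor_out, report_in))   # False
```

In this sketch, compatibility is plain equality of schema identifiers; a real schema system may apply richer rules.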
Streamlet Shapes
The combination of inlets and outlets gives the Streamlet its shape.
Some examples of commonly used streamlet shapes are the following:
Ingress
An Ingress is a streamlet with zero inlets and one or more outlets. For example, an ingress could be a server that handles HTTP requests.
Processor
A Processor has one inlet and one outlet. Processors represent common data transformations like map and filter, or any combination of them.
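The two shapes described above can be expressed as a simple rule on inlet and outlet counts. The following Python sketch is only an illustration of that rule: `shape_of` is a hypothetical helper, not part of any Cloudflow API, and every combination this section does not describe is reported as "Other".

```python
def shape_of(inlets: int, outlets: int) -> str:
    """Classify a streamlet by its number of inlets and outlets (toy model)."""
    if inlets == 0 and outlets >= 1:
        return "Ingress"    # zero inlets, one or more outlets
    if inlets == 1 and outlets == 1:
        return "Processor"  # exactly one inlet and one outlet
    return "Other"          # shapes not covered in this section


print(shape_of(0, 2))  # Ingress
print(shape_of(1, 1))  # Processor
print(shape_of(2, 1))  # Other
```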
Blueprint
A Blueprint connects streamlets together. This is what turns a set of streamlets into an application. A blueprint is written in a file using a declarative language and is part of the project.
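As a rough illustration of what a blueprint expresses, the Python sketch below models streamlets and their connections as plain data and checks that every connection links an existing outlet to an existing inlet with the same schema. This is an assumption-laden toy: it is not the Cloudflow blueprint file format, and the streamlet names, the `streamlet.port` reference style, the schema names, and the `validate` function are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical streamlet declarations: each port name maps to a schema identifier.
streamlets: Dict[str, Dict[str, Dict[str, str]]] = {
    "ingress": {"inlets": {}, "outlets": {"out": "SensorData"}},
    "processor": {"inlets": {"in": "SensorData"}, "outlets": {"out": "Metric"}},
}

# Hypothetical connections, written as (outlet reference, inlet reference).
connections: List[Tuple[str, str]] = [("ingress.out", "processor.in")]


def validate(streamlets, connections) -> List[str]:
    """Return a list of problems found in the connections; empty means valid."""
    errors = []
    for src, dst in connections:
        src_name, src_port = src.split(".")
        dst_name, dst_port = dst.split(".")
        out_schema = streamlets.get(src_name, {}).get("outlets", {}).get(src_port)
        in_schema = streamlets.get(dst_name, {}).get("inlets", {}).get(dst_port)
        if out_schema is None:
            errors.append(f"unknown outlet: {src}")
        elif in_schema is None:
            errors.append(f"unknown inlet: {dst}")
        elif out_schema != in_schema:
            errors.append(
                f"schema mismatch: {src} ({out_schema}) -> {dst} ({in_schema})"
            )
    return errors


print(validate(streamlets, connections))  # []
```

Checking connections before deployment, as this sketch does, mirrors the idea that a blueprint declares a topology that can be verified as a whole.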
Application
A Cloudflow application is a blueprint that defines a collection of Streamlets that can be deployed as a unit to a Cloudflow-enabled cluster.
A deployed application is the runtime realization of the blueprint.
The application is formed according to the Streamlets included and the connections specified in the blueprint, materialized as data flows between the Streamlets.