Streaming Data Pipelines on Kubernetes

Cloudflow enables you to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes. With Cloudflow, streaming applications are composed of small components wired together with schema-based contracts. Cloudflow can dramatically accelerate streaming application development, reducing the time required to create, package, and deploy an application from weeks to hours.

With its powerful abstractions, Cloudflow lets you define even complex streaming applications and supports you at each stage:

  • Develop: Focus only on business logic and leave the boilerplate to us.

  • Build: We provide a rich development kit for going from business logic to a deployable application.

  • Deploy: Cloudflow comes with Kubernetes extensions for the cluster and the client to deploy your distributed system with a single command.

Cloudflow integrates with popular streaming engines like Akka, Spark, and Flink. It also comes with a powerful CLI tool to manage, scale, and configure streaming applications at runtime. In a nutshell, Cloudflow is an application development toolkit composed of:

  • A definition for Streamlet, the core component model in Cloudflow.

  • An extensible runtime for Streamlets, with out-of-the-box implementations for popular streaming runtimes such as Spark Structured Streaming, Flink, and Akka.

  • A Streamlet composition model driven by a blueprint definition.

  • A sandbox local execution mode that accelerates the development and testing of your applications without requiring a cluster.

  • A set of sbt plugins that package your application into deployable units.

  • The Cloudflow operator, a Kubernetes operator that manages the application lifecycle on Kubernetes.

  • A CLI, in the form of a kubectl plugin, that facilitates manual, scripted, and automated (CI/CD) management of the application.
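
To make the idea of small components wired together with schema-based contracts more concrete, here is a minimal, language-neutral sketch in Python. It is not Cloudflow's API (Cloudflow's own APIs are Scala and Java): the names `Schema`, `Streamlet`, `Blueprint`, `connect`, and `run` are hypothetical and exist only for this illustration. The sketch assumes each component has at most one inlet and one outlet, and that a connection is valid only when the upstream outlet schema equals the downstream inlet schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for a schema-based contract."""
    name: str
    fields: Tuple[str, ...]


@dataclass
class Streamlet:
    """Hypothetical component: business logic plus typed inlet and outlet."""
    name: str
    inlet: Optional[Schema]
    outlet: Optional[Schema]
    logic: Callable[[dict], List[dict]]  # returns zero or more output records


class Blueprint:
    """Hypothetical wiring of components; rejects schema mismatches."""

    def __init__(self) -> None:
        self.streamlets: Dict[str, Streamlet] = {}
        self.connections: Dict[str, List[str]] = {}

    def add(self, streamlet: Streamlet) -> "Blueprint":
        self.streamlets[streamlet.name] = streamlet
        return self

    def connect(self, upstream: str, downstream: str) -> "Blueprint":
        out_schema = self.streamlets[upstream].outlet
        in_schema = self.streamlets[downstream].inlet
        # The contract check: both ends must agree on the schema.
        if out_schema is None or out_schema != in_schema:
            raise TypeError(f"cannot connect {upstream} to {downstream}")
        self.connections.setdefault(upstream, []).append(downstream)
        return self

    def run(self, source: str, records: List[dict]) -> List[dict]:
        """Push records through the graph; collect output of terminal components."""
        results: List[dict] = []

        def push(name: str, record: dict) -> None:
            downstream = self.connections.get(name, [])
            for out in self.streamlets[name].logic(record):
                if not downstream:
                    results.append(out)
                for target in downstream:
                    push(target, out)

        for record in records:
            push(source, record)
        return results


READING = Schema("Reading", ("sensor_id", "celsius"))
ALERT = Schema("Alert", ("sensor_id", "message"))


def detect(reading: dict) -> List[dict]:
    # Business logic only: emit an alert for readings above 30 degrees Celsius.
    if reading["celsius"] > 30:
        return [{"sensor_id": reading["sensor_id"], "message": "too hot"}]
    return []


ingress = Streamlet("ingress", None, READING, lambda r: [r])
detector = Streamlet("detector", READING, ALERT, detect)
egress = Streamlet("egress", ALERT, None, lambda a: [a])

blueprint = (
    Blueprint()
    .add(ingress).add(detector).add(egress)
    .connect("ingress", "detector")
    .connect("detector", "egress")
)

alerts = blueprint.run("ingress", [
    {"sensor_id": "a", "celsius": 21.0},
    {"sensor_id": "b", "celsius": 35.5},
])
```

In this sketch, wiring `ingress` directly to `egress` would raise a `TypeError`, because a `Reading` outlet does not match an `Alert` inlet. Catching such mismatches at wiring time rather than at runtime is the kind of guarantee a schema-based contract is meant to provide.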