Announcing Google Cloud Dataflow on Flink and easy Flink deployment on Google Cloud


Today, we are pleased to announce a deeper engagement between Google, data Artisans, and the broader Apache Flink™ community to bring easy Flink deployment to Google Cloud Platform, and enable Google Cloud Dataflow users to leverage Apache Flink™ as a backend.

Flink deployment on Google Cloud Platform

We recently contributed a patch to bdutil, Google's open source tool for deploying data processing systems on Google Compute Engine. In addition to managing Hadoop on Google Compute Engine, bdutil now lets you deploy Flink as easily as:

bdutil -e extensions/flink/ deploy

See here for detailed instructions. Automatic Flink deployment on Google Compute Engine is a natural next step after our recent experience of using Flink and Google Compute Engine to factorize a 28-billion-element matrix in 5 hours on a 40-node cluster. Check out our recent blog post here and an extended version here.

Google Cloud Dataflow on Flink

Google Cloud Dataflow is a data analytics service running on Google’s infrastructure. It allows users to write sophisticated data analytics pipelines for both batch and streaming programs and run them at scale on Google Cloud Platform. Dataflow offers a unified view of batch and stream processing, as well as highly flexible window semantics that support complex event stream analysis patterns. Cloud Dataflow is a descendant of Google’s FlumeJava and MillWheel projects.

Google recently released an SDK for Dataflow as open source. The SDK decouples the programming model from the execution engine via pluggable "runners". Google provides runners to run Dataflow programs on Google Cloud Platform or on a local machine (for development).

Today, we are pleased to announce a Flink runner for Cloud Dataflow. Dataflow users can now run their programs using Apache Flink™ as the execution backend. The current Flink runner supports all the batch functionality of Dataflow, and we are working on bringing the Dataflow streaming functionality into the Flink runner. Fortunately, Flink already supports flexible window semantics, as does Cloud Dataflow.

Flink and Cloud Dataflow are very well aligned, as they share the vision of natively unifying stream and batch processing at the engine level. Flink has always executed both batch and streaming programs using the same streaming (pipelined) engine. The addition of Flink to the family of Dataflow SDK runners (which now includes Google’s cloud platform, a local runner, and a Cloudera-contributed Apache Spark runner) is great for users who want to run the same hybrid analytical pipelines in the cloud and on premises.

Click here to get started on Google Dataflow. To install the Flink Dataflow runner, follow the instructions here. As always, we would love to know what you think, so please give us feedback by submitting an issue. For more information, see the announcement on the Google Cloud Platform Blog.
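For readers new to the idea of pluggable runners, the following is a minimal, self-contained sketch of how a pipeline definition can be kept separate from the engine that executes it. It is not the Dataflow SDK API or the Flink runner: the names `Pipeline`, `Runner`, `EagerRunner`, and `PipelinedRunner` are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable, List


class Pipeline:
    """Engine-independent description of a computation: an ordered list of steps."""

    def __init__(self) -> None:
        self.steps: List[Callable[[Iterable], Iterable]] = []

    def apply(self, step: Callable[[Iterable], Iterable]) -> "Pipeline":
        # Record the step only; nothing is executed here.
        self.steps.append(step)
        return self


class Runner(ABC):
    """Hypothetical runner interface: each backend decides how to execute a Pipeline."""

    @abstractmethod
    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        ...


class EagerRunner(Runner):
    """Materializes the full output of each step before starting the next one."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        data = list(source)
        for step in pipeline.steps:
            data = list(step(data))
        return data


class PipelinedRunner(Runner):
    """Chains the steps lazily, so each element flows through all steps in turn."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        stream = iter(source)
        for step in pipeline.steps:
            stream = step(stream)
        return list(stream)


# The same pipeline definition runs unchanged on either runner.
pipeline = (
    Pipeline()
    .apply(lambda xs: (x * 2 for x in xs))       # double every element
    .apply(lambda xs: (x for x in xs if x > 2))  # keep values greater than 2
)

eager_result = EagerRunner().run(pipeline, [1, 2, 3])
pipelined_result = PipelinedRunner().run(pipeline, [1, 2, 3])
print(eager_result, pipelined_result)
```

The point of the design is that user code only ever talks to `Pipeline`; swapping the execution backend means choosing a different runner, not rewriting the program.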


Article by: Maximilian Michels

