The Apache Flink Story at Pinterest - Flink Forward Global 2021


On October 27, at the annual Apache Flink user conference, Flink Forward Global 2021, Pinterest Tech Lead Chen Qin will deliver a keynote talk on “Sharing what we love: The Apache Flink Story at Pinterest”. Chen has been using Apache Flink since late 2015, and he’s been attending Flink Forward conferences since 2017.

Pinterest is a visual discovery platform that helps over 478 million Pinners find and share ideas that spark inspiration. In Chen’s presentation, he will share the story of Pinterest’s journey with Apache Flink and how the use of Flink has helped transform Pinterest with new real-time experiences for Pinners.


We interviewed Chen about his talk, his experiences with Apache Flink, and with Flink Forward...

What will people learn from the story of Pinterest’s journey with Apache Flink?

“The application of stream processing technology is evolving rapidly thanks to product innovation and competition in the online space. Making good use of mature, open source stream processing frameworks like Apache Flink can accelerate production iterations and reduce cloud infrastructure costs. Our story of adoption was a mixture of education, close working relationships with key stakeholders, and abstraction, as well as mastering dev-to-prod and operation life cycle automation at scale. Once you can demonstrate production stability to internal teams, you can support mission-critical use cases reliably at scale.”

What’s your real-time stack? What software do you run with Apache Flink at Pinterest?

“Our stack features the following:

  • Apache Kafka and S3 are the most popular ways to access datasets
  • We built a XenonUnifiedSource/UnifiedSink to provide seamless backfill and reprocessing
  • We built an NRTG compiler to expedite user development and stay consistent with the batch implementation
  • Apache ZooKeeper to achieve job high availability in a single availability zone
  • A CI system on top of Spinnaker to run frequent application regression tests at scale
  • Dr. Squirrel to provide actionable insights and debugging support to application developers at scale
  • We stored Flink SQL logical tables in Hive Metastore.”

What are your favorite Flink Forward memories?

“My first time attending Flink Forward was early 2017. I was able to see many insightful talks and use case discussions. And I tasted beverages brought by the organizing committee from Deutschland.”

Club Matte

What do you look forward to most at Flink Forward Global 2021?

“Every year, my colleagues and I have to split the work and attend different tracks at Flink Forward. There are many practical talks and interesting new industrial use cases. Ease-of-use topics like abstraction (e.g. Flink SQL, Stateful Functions and our compiler-based approach) and unification (to reduce the burden of maintaining batch and streaming implementations) are quite intriguing.”

What other Flink Forward Global 2021 sessions interest you?

Do you have any advice for first-time Flink Forward attendees?

“There are many tracks and great presentations, and it's easy to get distracted. Figure out your attendance strategy (with your colleagues) to ensure coverage of the talks you want to attend.”

Any final thoughts?

“The industry is undergoing a rapid evolution from a batch-only data stack to a hybrid of real-time and batch data stacks. There are many challenges to building a unified, simplified offering that empowers product teams to iterate fast without overspending the infrastructure budget. Flink Forward is a good place to learn how to meet these challenges.”

We hope you will join Chen and the rest of the Apache Flink community at Flink Forward online on October 26-27. Secure your spot here!

Article by:

Chen Qin