The Apache Flink Story at Pinterest - Flink Forward Global 2021

September 21, 2021 | by Chen Qin

On October 27, at Flink Forward Global 2021, the annual Apache Flink user conference, Pinterest Tech Lead Chen Qin will deliver a keynote talk, “Sharing what we love: The Apache Flink Story at Pinterest”. Chen has been using Apache Flink since late 2015 and has been attending Flink Forward conferences since 2017.

Pinterest is a visual discovery platform that helps over 478 million Pinners find and share ideas that spark inspiration. In Chen’s presentation, he will share the story of Pinterest’s journey with Apache Flink and how the use of Flink has helped transform Pinterest with new real-time experiences for Pinners.

We interviewed Chen about his talk, his experiences with Apache Flink, and with Flink Forward...

What will people learn from the story of Pinterest’s journey with Apache Flink?

“The application of stream processing technology is evolving rapidly thanks to product innovation and competition in the online space. Making good use of mature, open source stream processing frameworks like Apache Flink can accelerate production iterations and reduce cloud infrastructure costs. Our story of adoption was a mixture of education, close working relationships with key stakeholders, and abstraction, as well as mastering dev-to-prod and operation life cycle automation at scale. Once you can demonstrate production stability to internal teams, you can support mission-critical use cases reliably at scale.”


What’s your real-time stack? What software do you run with Apache Flink at Pinterest?

“Our stack features the following:

  • Apache Kafka and S3 are the most popular ways to access datasets
  • We built a XenonUnifiedSource/UnifiedSink to provide seamless backfill and reprocessing
  • We built an NRTG compiler to expedite user development and stay consistent with the batch implementation
  • Apache ZooKeeper to achieve job high availability in a single availability zone
  • A CI system on top of Spinnaker to run frequent application regression tests at scale
  • Dr. Squirrel to provide actionable insights and debugging support to application developers at scale
  • We stored Flink SQL logical tables in Hive Metastore.”

 

What are your favorite Flink Forward memories?

“My first time attending Flink Forward was in early 2017. I was able to see many insightful talks and use case discussions. And I tasted beverages brought by the organizing committee from Deutschland.”

What do you look forward to most at Flink Forward Global 2021?

“Every year, my colleagues and I have to split the work and attend different tracks at Flink Forward. There are many practical talks and interesting new industrial use cases. Ease-of-use topics like abstraction (e.g. Flink SQL, Stateful Functions and our compiler-based approach) and unification (to reduce the burden of maintaining batch and streaming implementations) are quite intriguing.”

 

What other Flink Forward Global 2021 sessions interest you?

 

Do you have any advice for first-time Flink Forward attendees?

“There are many tracks and great presentations, and it's easy to get distracted. Figure out your attendance strategy (with your colleagues) to ensure coverage of the talks you want to attend.”

 

Any final thoughts?

“The industry is undergoing a rapid evolution from the batch-only data stack to a hybrid of the real-time and batch data stacks. There are many challenges in building a unified and simplified offering that empowers product teams to iterate fast without overspending the infrastructure budget. Flink Forward is a good place to learn how to meet these challenges.”

 

We hope you will join Chen and the rest of the Apache Flink community at Flink Forward online on October 26-27. Secure your spot here!
