Building Apache Flink Streaming at Splunk - Flink Forward 2021

October 06, 2021 | by Brent Davis

Brent Davis, Principal Performance Engineer at Splunk, will deliver a technical session, "Sources, Sinks, and Operators: A Performance Deep Dive," on October 27 at the upcoming Flink Forward Global 2021 conference.

The Splunk platform is designed to help IT, DevOps, and security teams transform their organizations with data from any source and on any timescale. In his presentation, Brent will share the story of how Splunk built and scaled its Flink streaming infrastructure and the lessons learned along the way.


Brent has 22 years of experience creating and leading end-to-end performance, scalability, reliability, planning, and infrastructure efforts at massive cloud scale. We interviewed Brent about his talk, his journey with Apache Flink, and his plans for Flink Forward 2021...

What will people learn from your session at Flink Forward Global 2021?

At Splunk, we have built a Flink streaming infrastructure and scaled some very high-throughput Flink jobs that stream petabytes of data per day. We’ve designed a streaming product that strives to be both multi-purpose (it supports a large variety of sources, sinks, and functions) and a key component of our overall data and observability story as a company.

I’ve spent a lot of time profiling our sources, sinks, and operators to improve performance. I’ll share our real-life experiences in scaling for this throughput and the discoveries we’ve made along the way, both in writing our own functions and in decisions about the Flink infrastructure itself.

 

How long have you been using Apache Flink?

I became interested in Flink more than two years ago, as we were building the first version of Splunk’s Data Stream Processor. That was on Flink version 1.5, and it’s been very exciting to see the platform grow and mature these last few years!

 

What’s your real-time stack? What software do you run with Apache Flink at Splunk?

We support a wide variety of sources and destinations on our streaming product: Kinesis, S3, Kafka, Pulsar, Google Pub/Sub, GCS, Azure Event Hubs, and, of course, Splunk Indexes, Forwarders, HTTP Clients, and the Splunk Observability Cloud.

As for the underpinnings of Splunk streaming, the architecture is built on Apache Pulsar as the messaging backbone and Flink as the streaming engine, with a custom-built control plane and streaming REST layer for creating and interacting with the platform. All of this is deployed in a Kubernetes cluster.

 

What do you look forward to most at Flink Forward Global 2021?

I love the depth of technical presentations at Flink Forward; it’s a firehose of interesting, relevant, and substantial information. This year I’m really looking forward to learning more about the new Stateful Functions API, and how folks are using that feature in the real world.

 

What other Flink Forward Global 2021 sessions interest you?

 

Any final thoughts?

Flink Forward is a great place to learn about the rapid advances happening in real-time data engineering. There is a lot going on in this space as we evolve to higher levels of scale, performance, and ease of use. Data is our lifeblood at Splunk, and I look forward to sharing with the community what we’ve learned on our streaming journey with Flink.

 

We hope you will join Brent and the rest of the Apache Flink community at Flink Forward online on October 26-27.

Secure your spot here!

