Skip to content

Building Apache Flink Streaming at Splunk - Flink Forward 2021


by

Brent Davis, Principal Performance Engineer at Splunk, will deliver a technical session on  Sources, Sinks, and Operators: A Performance Deep Dive on October 27 at the upcoming Flink Forward Global 2021 conference.

The Splunk platform is designed to help IT, DevOps, and security teams transform their organizations with data from any source and on any timescale. In his presentation, Brent will share the story of how Splunk built and scaled their Flink streaming infrastructure and the lessons learned along the way. 


Brent has 22 years of experience creating and leading end-to-end performance, scalability, reliability, planning, and infrastructure efforts at massive cloud scale. We interviewed Brent about his talk, his journey with Apache Flink, and his plans for Flink Forward 2021...

FlinkForward_Banner_brent_Artboard 3 Kopie 3

What will people learn from your session at Flink Forward Global 2021?

At Splunk, we have built a Flink streaming infrastructure and scaled some very high throughput Flink jobs, streaming petabytes of data per day.  We’ve designed a streaming product that strives to be both multi-purpose (in that it enables a large variety of sources, sinks, and functions) as well as a key component of streaming in our overall data & observability story as a company.

I’ve spent lots of time profiling our sources, sinks, and operators to improve performance.  I’ll share our real life experiences in scaling for this throughput and discoveries we’ve made along the way: both in the writing of our own functions, as well as decisions in the Flink infrastructure itself.

 

How long have you been using Apache Flink?

I became interested in Flink as we were building the first version of Splunk’s Data Stream Processor - more than two years ago.  That was on Flink version 1.5, it’s been very exciting to see the platform grow and mature these last few years!

 

What’s your real-time stack? What software do you run with Apache Flink at Splunk?

We support a wide variety of sources and destinations on our streaming product:  Kinesis, S3, Kafka, Pulsar, Google Pub/Sub, GCS, Azure event hubs, and, of course, Splunk Indexes, Forwarders, HTTP Clients, and the Splunk Observability Cloud.

As for the Splunk streaming underpinnings, it consists of an architecture built on top of Apache Pulsar as a messaging backbone, Flink as the streaming engine, and a custom-built control plane and streaming REST layer for creating and interacting with the platform.  All of this is deployed in a Kubernetes cluster.

 

What do you look forward to most at Flink Forward Global 2021?

I love the depth of technical presentations at Flink Forward; it’s a firehose of interesting, relevant, and substantial information. This year I’m really looking forward to learning more about the new Stateful Functions API, and how folks are using that feature in the real world.

 

What other Flink Forward Global 2021 sessions interest you?

 

Any final thoughts?

Flink Forward is a great place to learn about the rapid advances happening in real time data engineering. There is a lot going on in this space as we evolve to higher levels of scale, performance, and ease of use. Data is our lifeblood at Splunk, and I look forward to sharing with the community what we’ve learned on our streaming journey with Flink.

 

We hope you will join Brent and the rest of the Apache Flink community at Flink Forward online on October 26-27.

Secure your spot here!

Topics:
Brent Davis
Article by:

Brent Davis

Comments

Our Latest Blogs

The Release of Flink CDC v2.3 featured image
by Hang Ruan & Qingsheng Ren November 30, 2022

The Release of Flink CDC v2.3

Flink CDC is a change data capture (CDC) technology based on database changelogs. It is a data integration framework that supports reading database snapshots and smoothly switching to reading binlogs...
Read More
Flink SQL Recipe: Window Top-N and Continuous Top-N featured image
by Ververica November 25, 2022

Flink SQL Recipe: Window Top-N and Continuous Top-N

Flink SQL has emerged as the standard for low-code streaming analytics and managed to unify batch and stream processing while simultaneously staying true to the SQL standard. In addition, it provides...
Read More
Apache Flink SQL: Past, Present, and Future featured image
by Becket Qin November 22, 2022

Apache Flink SQL: Past, Present, and Future

Recently the Apache Flink community announced the release of Flink 1.16, which continues to push the vision of stream and batch unification in Flink SQL to a new level. At this point, Flink SQL is...
Read More