Building Apache Flink Streaming at Splunk - Flink Forward 2021


Brent Davis, Principal Performance Engineer at Splunk, will deliver a technical session on Sources, Sinks, and Operators: A Performance Deep Dive on October 27 at the upcoming Flink Forward Global 2021 conference.

The Splunk platform is designed to help IT, DevOps, and security teams transform their organizations with data from any source and on any timescale. In his presentation, Brent will share the story of how Splunk built and scaled their Flink streaming infrastructure and the lessons learned along the way.

Brent has 22 years of experience creating and leading end-to-end performance, scalability, reliability, planning, and infrastructure efforts at massive cloud scale. We interviewed Brent about his talk, his journey with Apache Flink, and his plans for Flink Forward 2021...

Sources, Sinks, and Operators: A Performance Deep Dive

What will people learn from your session at Flink Forward Global 2021?

At Splunk, we have built a Flink streaming infrastructure and scaled some very high-throughput Flink jobs, streaming petabytes of data per day. We’ve designed a streaming product that strives to be both multi-purpose (in that it enables a large variety of sources, sinks, and functions) and a key component of streaming in our overall data & observability story as a company.

I’ve spent a lot of time profiling our sources, sinks, and operators to improve performance. I’ll share our real-life experiences in scaling for this throughput and the discoveries we’ve made along the way, both in writing our own functions and in decisions about the Flink infrastructure itself.

How long have you been using Apache Flink?

I became interested in Flink as we were building the first version of Splunk’s Data Stream Processor, more than two years ago. That was on Flink version 1.5, and it’s been very exciting to see the platform grow and mature these last few years!

What’s your real-time stack? What software do you run with Apache Flink at Splunk?

We support a wide variety of sources and destinations in our streaming product: Kinesis, S3, Kafka, Pulsar, Google Pub/Sub, GCS, Azure Event Hubs, and, of course, Splunk Indexes, Forwarders, HTTP Clients, and the Splunk Observability Cloud.

The Splunk streaming underpinnings consist of an architecture built on top of Apache Pulsar as the messaging backbone, Flink as the streaming engine, and a custom-built control plane and streaming REST layer for creating and interacting with the platform. All of this is deployed in a Kubernetes cluster.

What do you look forward to most at Flink Forward Global 2021?

I love the depth of technical presentations at Flink Forward; it’s a firehose of interesting, relevant, and substantial information. This year I’m really looking forward to learning more about the new Stateful Functions API, and how folks are using that feature in the real world.

What other Flink Forward Global 2021 sessions interest you?

Any final thoughts?

Flink Forward is a great place to learn about the rapid advances happening in real-time data engineering. There is a lot going on in this space as we evolve to higher levels of scale, performance, and ease of use. Data is our lifeblood at Splunk, and I look forward to sharing with the community what we’ve learned on our streaming journey with Flink.

We hope you will join Brent and the rest of the Apache Flink community at Flink Forward online on October 26-27.

Secure your spot here!

Article by:

Brent Davis
