Skip to content

Ververica Cloud, a fully-managed cloud service for stream processing!

Learn more

3 important performance factors for stateful functions and operators in Flink


by

This post focuses on the 3 factors developers should keep in mind when assessing the performance of a function or operator that uses Flink’s Keyed State in a stateful streaming application.

Keyed state is one of the two basic types of state in Apache Flink, the other being Operator state. As its name suggests, keyed state is bound to keys and is only available to functions and operators that process data from a KeyedStream. The difference between operator and keyed state is that operator state is scoped per parallel instance of an operator (sub-task), while keyed state is partitioned or sharded based on exactly one state-partition per key.

For more information about State in the Apache Flink, the documentation section “Working with State” describes how to use Flink’s state abstractions when developing an application. 

Friday-Flink-Tip-3-important-performance-factors-for-stateful-functions-and-operators-in-Flink

3 factors that impact the performance of Keyed State in Apache Flink

With this in mind, let’s discuss below 3 factors that will impact the performance of Flink’s keyed state that you should keep in mind while developing stateful streaming applications:

The chosen state backend

The selected state backend has the most dominant impact on the performance of your stateful function or operation in a Flink application. The most distinctive factor here is how each state backend handles serialization of your state for persistence differently.

For example, when using the FsStateBackend or MemoryStateBackend, local state is maintained as on-heap objects during runtime and therefore has low overhead when accessing or updating them. Serialization overheads only occur when snapshots of the state are taken to create Flink checkpoints or savepoints. Downside to using this state backend is that state size is limited by heap size of the JVM, and can potentially run into OutOfMemory errors or long pauses for garbage collection.

On the contrary, out-of-core state backends such as the RocksDBStateBackend allows much larger state sizes by maintaining local state on disk. The tradeoff is that every state read and write would require serialization / deserialization.

To wrap this up, if your state size is small and expected to not exceed heap size, then using on-heap backends would be the obvious choice as it avoids serialization overheads. Otherwise, currently, RocksDBStateBackend would be the go-to choice for applications with large state sizes.

Per-key state primitives

ValueState / ListState / MapState

Another important factor is knowing to choose the correct state primitives. Flink currently supports 3 main state primitives for keyed state: ValueState, ListState, and MapState.

One common mistake new developers to Flink might make is having as state, for example, a  ValueState<Map<String, Integer>> while the map entries are intended to only be randomly accessed. In this case, it is definitely better to use MapState<String, Integer>, especially when taking into account that out-of-core state backends, such as RocksDBStateBackend, serializes/deserializes ValueState states completely on access, while for MapState, serialization occurs per-entry.

Access Pattern

Following up the previous section about state primitives, it was already quite thoroughly hinted that assessing how your application logic accesses state will help determine what state structure you should be using. As developers should always expect when designing any kind of application, using unsuitable data structures for your application’s specific data access pattern can have a severe impact on overall performance.

Conclusion

Developers should consider all three factors above as they can affect the performance of stateful functions and operators in Flink to a great extent. We encourage you to check our Advanced Training Schedule below for more best practices and application design patterns for stateful streaming applications.

New call-to-action

New call-to-action

Ververica Academy

Topics:
Tzu-Li (Gordon) Tai
Article by:

Tzu-Li (Gordon) Tai

Find me on:

Comments

Our Latest Blogs

Ververica celebrates as Apache Paimon Graduates to Top-Level Project featured image
by Kaye Lincoln and Karin Landers 18 April 2024

Ververica celebrates as Apache Paimon Graduates to Top-Level Project

Congratulations to the Apache Software Foundation and each individual contributor on the graduation of Apache Paimon from incubation to a Top-Level Project! Apache Paimon is a data lake format that...
Read More
Q&A with Erik de Nooij: Insights into Apache Flink and the Future of Streaming Data featured image
by Kaye Lincoln 09 April 2024

Q&A with Erik de Nooij: Insights into Apache Flink and the Future of Streaming Data

Ververica is proud to host the Flink Forward conferences, uniting Apache Flink® and streaming data communities. Each year we nominate a Program Chair to select a broad range of Program Committee...
Read More
Ververica donates Flink CDC - Empowering Real-Time Data Integration for the Community featured image
by Ververica 03 April 2024

Ververica donates Flink CDC - Empowering Real-Time Data Integration for the Community

Ververica has officially donated Flink Change Data Capture (CDC) to the Apache Software Foundation. In this blog, we’ll explore the significance of this milestone, and how it positions Flink CDC as a...
Read More