3 important performance factors for stateful functions and operators in Flink

February 15, 2019 | by Tzu-Li (Gordon) Tai

This post focuses on the 3 factors developers should keep in mind when assessing the performance of a function or operator that uses Flink’s Keyed State in a stateful streaming application.

Keyed state is one of the two basic types of state in Apache Flink, the other being Operator state. As its name suggests, keyed state is bound to keys and is only available to functions and operators that process data from a KeyedStream. The difference between operator and keyed state is that operator state is scoped per parallel instance of an operator (sub-task), while keyed state is partitioned or sharded based on exactly one state-partition per key.

For more information about State in the Apache Flink, the documentation section “Working with State” describes how to use Flink’s state abstractions when developing an application. 



3 factors that impact the performance of
Keyed State in Apache Flink

With this in mind, let’s discuss below 3 factors that will impact the performance of Flink’s keyed state that you should keep in mind while developing stateful streaming applications:

The chosen state backend

The selected state backend has the most dominant impact on the performance of your stateful function or operation in a Flink application. The most distinctive factor here is how each state backend handles serialization of your state for persistence differently.

For example, when using the FsStateBackend or MemoryStateBackend, local state is maintained as on-heap objects during runtime and therefore has low overhead when accessing or updating them. Serialization overheads only occur when snapshots of the state are taken to create Flink checkpoints or savepoints. Downside to using this state backend is that state size is limited by heap size of the JVM, and can potentially run into OutOfMemory errors or long pauses for garbage collection.

On the contrary, out-of-core state backends such as the RocksDBStateBackend allows much larger state sizes by maintaining local state on disk. The tradeoff is that every state read and write would require serialization / deserialization.

To wrap this up, if your state size is small and expected to not exceed heap size, then using on-heap backends would be the obvious choice as it avoids serialization overheads. Otherwise, currently, RocksDBStateBackend would be the go-to choice for applications with large state sizes.

Per-key state primitives

ValueState / ListState / MapState

Another important factor is knowing to choose the correct state primitives. Flink currently supports 3 main state primitives for keyed state: ValueState, ListState, and MapState.

One common mistake new developers to Flink might make is having as state, for example, a  ValueState<Map<String, Integer>> while the map entries are intended to only be randomly accessed. In this case, it is definitely better to use MapState<String, Integer>, especially when taking into account that out-of-core state backends, such as RocksDBStateBackend, serializes/deserializes ValueState states completely on access, while for MapState, serialization occurs per-entry.

Access Pattern

Following up the previous section about state primitives, it was already quite thoroughly hinted that assessing how your application logic accesses state will help determine what state structure you should be using. As developers should always expect when designing any kind of application, using unsuitable data structures for your application’s specific data access pattern can have a severe impact on overall performance.


Developers should consider all three factors above as they can affect the performance of stateful functions and operators in Flink to a great extent. We encourage you to check our Advanced Training Schedule below for more best practices and application design patterns for stateful streaming applications.

Public Training Flink, Flink training, Ververica training, Apache Flink training

Ververica Contact





Topics: Apache Flink

Tzu-Li (Gordon) Tai
Article by:

Tzu-Li (Gordon) Tai

Find me on:

Related articles


Sign up for Monthly Blog Notifications

Please send me updates about products and services of Ververica via my e-mail address. Ververica will process my personal data in accordance with the Ververica Privacy Policy.

Our Latest Blogs

by Frédérique Mittelstaedt October 19, 2021

Keeping Redditors safe with Stateful Functions - Flink Forward 2021

London-based Frédérique Mittelstaedt leads the real-time safety applications team at Reddit. His team keeps Redditors safe by automating the detection and actioning of harmful user behaviour and...

Read More
by Brent Davis October 06, 2021

Building Apache Flink Streaming at Splunk - Flink Forward 2021

Brent Davis, Principal Performance Engineer at Splunk, will deliver a technical session on  Sources, Sinks, and Operators: A Performance Deep Dive on October 27 at the upcoming Flink Forward...

Read More
by Chen Qin September 21, 2021

The Apache Flink Story at Pinterest - Flink Forward Global 2021

On October 27, at the annual Apache Flink user conference, Flink Forward Global 2021, Pinterest Tech Lead, Chen Qin will deliver a keynote talk on “Sharing what we love: The Apache Flink Story at...

Read More