How to choose a state backend for a Flink job

Question

Which state backend should I choose for my Flink job?

Answer

Note: This applies to Flink 1.8 and later.

Flink 1.12 or earlier

Out of the box, Flink bundles three state backends:

  • MemoryStateBackend (Default)
  • FsStateBackend
  • RocksDBStateBackend

The following table shows an overview of these state backends focusing on where in-flight data lives and where the state is persisted during snapshotting (checkpoints and savepoints). It also summarizes the use cases they are good for, which should help you choose the proper state backend for your job.

Name In-Flight data Snapshots

MemoryStateBackend

JVM Heap JobManager JVM Heap
  • Good for testing and experimentation with small states
  • Not for production use!

FsStateBackend

JVM Heap Distributed File System
  • Fast, requires a large heap
  • Subject to GC

RocksDBStateBackend

Local disk (tmp dir) with
Off-Heap/Native Buffers
Distributed File System
  • Supports state larger than available memory
  • Supports incremental snapshotting
  • Rule of thumb: 10x slower than heap-based backends

To set the state backend in flink-conf.yaml, use the key state.backend and set its value to jobmanager, filesystem, or rocksdb.

Flink 1.13 or later

To separate the in-flight state storage and the checkpoint storage explicitly, Flink 1.13 and later bundle two state backends:

  • HashMapStateBackend (Default)
  • EmbeddedRocksDBStateBackend

which stores the in-flight state in the JVM heap or RocksDB, respectively. You can use these state backends with different checkpoint storage independently, e.g., JobManagerCheckpointStorage or FileSystemCheckpointStorage. To set in flink-conf.yaml, use

state.backend: hashmap (or rocksdb)
state.checkpoint-storage: filesystem (or jobmanager)

# if specified, implies 'filesystem' checkpoint-storage
state.checkpoints.dir: file:///checkpoint-dir/ 

Starting Flink 1.13, one can switch from one state backend to another one by first making a job savepoint and then restoring it with a different state backend. Therefore, a user can try first to use the "hashmap" backend in production and then maybe later switch to "rocksdb" to benefit from its incremental checkpointing, which is not yet supported in the "hashmap" backend as of May 2023.

Related Information