JobManager and TaskManagers fail with org.rocksdb.RocksDBException Not supported

Issue

Note: This applies to Flink 1.8 and later.

The JobManager and TaskManagers fail with org.rocksdb.RocksDBException: ... Not supported, and the logs contain an entry like the following:

2019-10-15 12:33:53,699 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Could not complete snapshot 1 for operator testOperator (1/2).
org.rocksdb.RocksDBException: while link file to /mnt/checkpoints/job/<path>/chk-1.tmp/000022.sst: /mnt/checkpoints/job/<path>/db/000022.sst: Not supported
 at org.rocksdb.Checkpoint.createCheckpoint(Native Method)
 at org.rocksdb.Checkpoint.createCheckpoint(Checkpoint.java:51)
 at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.takeDBNativeCheckpoint(RocksIncrementalSnapshotStrategy.java:243)
 at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.doSnapshot(RocksIncrementalSnapshotStrategy.java:154)
 at org.apache.flink.contrib.streaming.state.snapshot.RocksDBSnapshotStrategyBase.snapshot(RocksDBSnapshotStrategyBase.java:128)
 at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.snapshot(RocksDBKeyedStateBackend.java:484)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:407)
 at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1113)
 at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1055)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:729)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:641)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:586)
 at org.apache.flink.streaming.runtime.io.BarrierTracker.notifyCheckpoint(BarrierTracker.java:270)
 at org.apache.flink.streaming.runtime.io.BarrierTracker.processBarrier(BarrierTracker.java:186)
 at org.apache.flink.streaming.runtime.io.BarrierTracker.getNextNonBlocked(BarrierTracker.java:105)
 at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:273)
 at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
 at java.lang.Thread.run(Thread.java:748)

Environment

  • Flink version: 1.8 and later.
  • Cloud: Azure
  • Azure File Share mounted at /mnt/checkpoints
  • RocksDB is used as the state backend with the following settings:
      state.backend: rocksdb
      state.backend.incremental: 'true'
      state.checkpoints.dir: 'file:///mnt/checkpoints'
      state.backend.rocksdb.localdir: /mnt/checkpoints/job

Resolution

Keep state.checkpoints.dir on a distributed file system (Azure File Share in this case), but point state.backend.rocksdb.localdir to a directory on a local file system, such as /tmp.
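
As an illustration, the relevant part of the Flink configuration might look as follows after the change. This is a minimal sketch based on the settings listed under Environment; the directory /tmp/rocksdb is a hypothetical example of a local path, and any directory on a locally attached disk should serve the same purpose.

```yaml
# Checkpoints stay on the distributed file system (the Azure File Share mount).
state.backend: rocksdb
state.backend.incremental: 'true'
state.checkpoints.dir: 'file:///mnt/checkpoints'

# RocksDB working directory moved to a local file system.
# /tmp/rocksdb is an assumed example path, not a required value.
state.backend.rocksdb.localdir: /tmp/rocksdb
```

Only state.backend.rocksdb.localdir changes; the other three settings are the same as before.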

Cause

When state.checkpoints.dir and state.backend.rocksdb.localdir are on the same file system, RocksDB uses hard links when creating a checkpoint, which is the "link file" operation that fails in the stack trace above. Azure File Share does not support hard links, so the operation fails with "Not supported".

Important: RocksDB is a local embedded database used by Flink on each TaskManager. It is not used for state persistence or fault tolerance. As such, its working directory (state.backend.rocksdb.localdir) should always be on a local file system, preferably a locally attached SSD.

Related Information