Apache Flink is a distributed system for stateful stream processing that excels at accurate, low-latency, fault-tolerant processing of large volumes of streaming data. Deploying Flink applications on-premises, in the cloud, or in hybrid environments brings its own set of challenges. In the chapter “Setting up Flink for Streaming Applications”, Vasia Kalavri and Fabian Hueske discuss the building blocks of getting Flink up and running for application development in today's most common environments.
The chapter covers:
Configuring Flink for High Availability (HA) with Apache ZooKeeper
Integrating Flink with components from the Hadoop ecosystem
Interacting with common filesystems: S3, HDFS, NFS, Swift FS
Tuning Flink for optimal performance and behaviour: JVM parameters and classloading, memory, disk space, checkpointing, state backends, security