Free Chapter: Setting Up Flink for Streaming Applications

Apache Flink is a distributed system for stateful stream processing that provides accurate, low-latency, and fault-tolerant processing of large volumes of streaming data. Deploying Flink applications on-premises, in the cloud, or in hybrid environments comes with its own set of challenges. In the chapter “Setting Up Flink for Streaming Applications”, Vasia Kalavri and Fabian Hueske discuss the building blocks for getting Flink up and running for application development in today's most common environments.

 

The chapter covers:

  • Setting up and deploying Flink in different environments: standalone, Docker, YARN, Kubernetes
  • Configuring Flink for high availability (HA) with Apache ZooKeeper
  • Integrating Flink with components from the Hadoop ecosystem
  • Interacting with common filesystems: S3, HDFS, NFS, Swift FS
  • Tuning Flink for optimal performance and behaviour: JVM parameters and classloading, memory, disk space, checkpointing, state backends, security

     