Why and how to leverage the simplicity and power of SQL on Flink
SQL is the lingua franca of data processing, and everybody working with data knows SQL. Apache Flink provides SQL support for querying and processing batch and streaming data. Flink’s SQL support powers large-scale production systems at Alibaba, Huawei, and Uber. Based on Flink SQL, these companies have built systems for their internal users as well as publicly offered services for paying customers. In our talk, we will discuss why you should and how you can (without being Alibaba or Uber) leverage the simplicity and power of SQL on Flink. We will start by exploring the use cases that Flink SQL was designed for and present real-world problems that it can solve. In particular, you will learn why unified batch and stream processing is important and what it means to run SQL queries on streams of data. After exploring why you should use Flink SQL, we will show how you can leverage its full potential. The Flink community has recently been working on a service that integrates a query interface, (external) table catalogs, and result-serving functionality for static, appending, and updating result sets. We will discuss the design and feature set of this query service and how it can be used for exploratory batch and streaming queries, ETL pipelines, and live-updating query results that serve applications such as real-time dashboards. The talk concludes with a brief demo of a client running queries against the service.
Fabian Hueske is a committer and PMC member of the Apache Flink project and has been contributing to Flink since its earliest days. Fabian works as a software engineer at
Timo Walther is a committer and PMC member of the Apache Flink project. He studied Computer Science at TU Berlin. Alongside his studies, he participated in the Database Systems and Information Management Group there and worked at IBM Germany. Timo works as a software engineer at Ververica. In Flink, he works mainly on the Table & SQL API.