A deep dive on Change Data Capture with Flink SQL during Flink Forward

September 10, 2020 | by Jark Wu & Qingsheng Ren

Can you believe that Flink Forward Global Virtual Conference 2020 is only a few weeks away? 

We are very excited to present some of the latest and most significant Flink developments to the wider community, including the internals of performing Change Data Capture (CDC) and real-time data processing with Flink SQL. We look forward to welcoming you to our session on October 22. Make sure to register and secure your spot today!

This second virtual Flink Forward is packed with exciting deep-dive talks, technical sessions and Apache Flink use cases showcasing not only how far Flink has come as a unified data processing framework, but also where the technology is heading in the coming releases. We invite you to (virtually) join the Flink community and immerse yourself in the exciting world of stream processing, real-time data analytics and event-driven applications.

If you haven’t done so already, go ahead and check the full conference program to discover some exciting sessions from companies like Intel, Spotify, Alibaba, Uber and more.

Change Data Capture and Processing with Flink SQL

Change Data Capture (CDC) has become the standard method for capturing committed changes from a database and propagating them to downstream consumers, for example to keep multiple datastores in sync and to avoid common pitfalls such as dual writes. CDC is one of the foundational building blocks of a data warehouse, because much business-related data is stored in databases. Consuming such changelogs with Apache Flink used to be rather adventurous, but with the introduction of CDC support in the latest Flink 1.11 release, developers can now implement Change Data Capture from the comfort of their SQL couch 😀😀.

 

What we are going to cover

In our session, Change Data Capture (CDC) and real-time data processing with Flink SQL, we will introduce the new table source interface (FLIP-95) and discuss how it works and how it makes CDC possible. We will illustrate the advantages of using Flink SQL for CDC and the use cases that are now unlocked, such as data transfer, automatically updating caches, keeping full-text indexes in sync, and materializing real-time aggregate views on databases. We will show how to use Flink SQL to easily process database changelog data generated with Debezium. Furthermore, we will introduce a more lightweight architecture that captures changelogs with flink-cdc-connectors and eliminates the dependency on the Debezium and Kafka services.
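To make the changelog idea a little more concrete, here is a minimal Python sketch (not Flink code) of how a Debezium-style change event can be read as changelog rows and applied to an aggregate view. It assumes the usual Debezium envelope fields `op`, `before` and `after`. The `customer` and `amount` columns, the helper names `to_changelog` and `apply_to_view`, and the sample events are hypothetical and only there for illustration. Flink SQL takes care of all of this for you; the sketch only shows the semantics, not how Flink implements them.

```python
def to_changelog(event):
    """Turn one Debezium-style event into (kind, row) changelog entries."""
    op, before, after = event["op"], event.get("before"), event.get("after")
    if op in ("c", "r"):   # create, or a row read during the initial snapshot
        return [("+I", after)]
    if op == "u":          # update: retract the old row, then add the new one
        return [("-U", before), ("+U", after)]
    if op == "d":          # delete: retract the old row
        return [("-D", before)]
    raise ValueError(f"unknown op: {op!r}")


def apply_to_view(view, kind, row):
    """Incrementally maintain COUNT(*) and SUM(amount) per customer."""
    sign = 1 if kind in ("+I", "+U") else -1
    count, total = view.get(row["customer"], (0, 0))
    count, total = count + sign, total + sign * row["amount"]
    if count == 0:
        view.pop(row["customer"], None)   # no rows left for this customer
    else:
        view[row["customer"]] = (count, total)


# Hypothetical change events for an "orders" table.
events = [
    {"op": "c", "before": None,
     "after": {"id": 1, "customer": "alice", "amount": 10}},
    {"op": "c", "before": None,
     "after": {"id": 2, "customer": "alice", "amount": 5}},
    {"op": "u",
     "before": {"id": 2, "customer": "alice", "amount": 5},
     "after": {"id": 2, "customer": "bob", "amount": 7}},
    {"op": "d",
     "before": {"id": 1, "customer": "alice", "amount": 10},
     "after": None},
]

view = {}
for event in events:
    for kind, row in to_changelog(event):
        apply_to_view(view, kind, row)

print(view)  # {'bob': (1, 7)}
```

The point of the update case is that an update is treated as a retraction of the old row followed by an insertion of the new one, which is what lets an aggregate stay correct when a row moves from one group to another.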

With a live demo, we will show how to use Flink SQL to capture change data from upstream MySQL and PostgreSQL databases, join the change data together and stream the result to Elasticsearch for indexing. The entire demo will be based purely on SQL, without a single line of Java/Scala code.
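As a rough mental model of what such a pipeline computes, the following Python sketch (again, not Flink code and not the demo itself) joins two change streams and keeps a search index up to date. Everything in it is an assumption made for illustration: the `orders` and `shipments` tables, their columns, the one-shipment-per-order simplification, and a plain dictionary standing in for an Elasticsearch index.

```python
def apply_change(table, kind, row, key):
    """Apply one changelog entry to an in-memory copy of a source table."""
    if kind in ("+I", "+U"):
        table[row[key]] = row
    else:  # "-U" or "-D": the old version of the row is retracted
        table.pop(row[key], None)


def refresh_index(index, orders, shipments, order_id):
    """Recompute the inner join for one key, then upsert or delete the document."""
    order, shipment = orders.get(order_id), shipments.get(order_id)
    if order is not None and shipment is not None:
        index[order_id] = {**order, "status": shipment["status"]}
    else:
        index.pop(order_id, None)


orders, shipments, index = {}, {}, {}

# Hypothetical interleaved changes from the two upstream databases.
changes = [
    ("orders", "+I", {"order_id": 1, "item": "book"}),
    ("shipments", "+I", {"order_id": 1, "status": "packed"}),
    ("shipments", "-U", {"order_id": 1, "status": "packed"}),
    ("shipments", "+U", {"order_id": 1, "status": "shipped"}),
    ("orders", "+I", {"order_id": 2, "item": "pen"}),
]

for source, kind, row in changes:
    table = orders if source == "orders" else shipments
    apply_change(table, kind, row, "order_id")
    refresh_index(index, orders, shipments, row["order_id"])

print(index)  # {1: {'order_id': 1, 'item': 'book', 'status': 'shipped'}}
```

Order 2 never reaches the index because it has no matching shipment yet, and order 1 ends up with the latest shipment status. In Flink SQL, the same result would be expressed declaratively as a join between two CDC tables written into a sink table.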

Lastly, we will close the session with an outlook on upcoming features around Flink SQL and CDC, as well as additional ecosystem connectors.

 

What you will learn

Through our session, you will get a clear understanding of the latest developments around Change Data Capture (CDC) with Apache Flink, and specifically Flink SQL. With our demo, participants will experience first-hand how easy it is to capture data changes from a database with Flink SQL.

Finally, you’ll learn some best practices around CDC and how to use Flink SQL as a powerful way to Extract, Transform and Load (ETL) data at the same time.


Make sure to secure your spot before October 1 by registering on the Flink Forward website. In addition to the conference, this Flink Forward event features six instructor-led training sessions covering both beginner and advanced Flink topics such as:

  • Flink Development (2 days)

  • SQL Development (2 days)

  • Runtime & Operations (1 day)

  • Stateful Functions (1 day)

  • Tuning & Troubleshooting (introduction and advanced, 1 day each)


Grab your pass and virtually meet the Apache Flink community this October!


Topics: Flink Forward, Flink SQL
