A deep dive on Change Data Capture with Flink SQL during Flink Forward

September 10, 2020 | by Jark Wu & Qingsheng Ren

Can you believe that Flink Forward Global Virtual Conference 2020 is only a few weeks away? 

We are very excited to present some of the latest and most significant Flink developments to the wider community, including the internals of performing Change Data Capture (CDC) and real-time data processing with Flink SQL. We look forward to welcoming you to our session on October 22. Make sure to register and secure your spot today!

This second virtual Flink Forward is packed with deep dive talks, technical sessions and Apache Flink use cases that showcase not only how far Flink has come as a unified data processing framework, but also where the technology is heading in the coming releases. We invite you to (virtually) join the Flink community and immerse yourself in the world of stream processing, real-time data analytics and event-driven applications.

If you haven’t done so already, go ahead and check the full conference program to discover some exciting sessions from companies like Intel, Spotify, Alibaba, Uber and more.

Change Data Capture and Processing with Flink SQL

Change Data Capture (CDC) has become the standard method for capturing and propagating committed changes from a database to downstream consumers, for example to keep multiple datastores in sync and to avoid common pitfalls such as dual writes. CDC is one of the foundational building blocks of a data warehouse, because much business-related information is stored in databases. Consuming such changelogs with Apache Flink used to be rather adventurous, but with the introduction of CDC support in the latest Flink 1.11 release, developers can now implement Change Data Capture from the comfort of their SQL couch 😀😀.

 

What we are going to cover

In our session Change Data Capture (CDC) and real time data processing with Flink SQL, we will introduce the new table source interface (FLIP-95) and discuss how it works and how it makes CDC possible. We will illustrate the advantages of using Flink SQL for CDC and the use cases it unlocks, such as data transfer, keeping caches and full-text indexes automatically in sync, and materializing real-time aggregate views on databases. We will show how to use Flink SQL to easily process database changelog data generated with Debezium. Furthermore, we will introduce a more lightweight architecture that captures changelogs with flink-cdc-connectors and removes the dependency on separate Debezium and Kafka services.
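
To make the idea of a materialized aggregate view over a changelog more concrete, here is a minimal sketch in plain Python. It is not Flink SQL and not how Flink implements this internally. It only assumes the Debezium event envelope with its `op`, `before` and `after` fields; the table columns (`id`, `customer`, `amount`), the sample events and both function names are hypothetical and chosen for illustration. The sketch treats an update as a retraction of the old row followed by an addition of the new row, which is roughly what keeps a `SUM ... GROUP BY` result correct as the source table changes.

```python
from collections import defaultdict


def to_changelog(event):
    """Translate one Debezium-style event into (kind, row) changelog records."""
    op = event["op"]
    if op in ("c", "r"):  # create, or read during the initial snapshot
        return [("+I", event["after"])]
    if op == "u":  # update: retract the old row, then add the new one
        return [("-U", event["before"]), ("+U", event["after"])]
    if op == "d":  # delete: retract the old row
        return [("-D", event["before"])]
    raise ValueError(f"unknown op: {op!r}")


def materialize_totals(events):
    """Maintain SUM(amount) GROUP BY customer over a stream of change events."""
    totals = defaultdict(int)
    counts = defaultdict(int)  # live rows per customer, to drop empty groups
    for event in events:
        for kind, row in to_changelog(event):
            sign = 1 if kind in ("+I", "+U") else -1
            totals[row["customer"]] += sign * row["amount"]
            counts[row["customer"]] += sign
    return {c: totals[c] for c in totals if counts[c] > 0}


# Hypothetical sample events, shaped like the Debezium envelope.
events = [
    {"op": "c", "before": None,
     "after": {"id": 1, "customer": "alice", "amount": 10}},
    {"op": "c", "before": None,
     "after": {"id": 2, "customer": "bob", "amount": 5}},
    {"op": "u",
     "before": {"id": 1, "customer": "alice", "amount": 10},
     "after": {"id": 1, "customer": "alice", "amount": 25}},
    {"op": "d",
     "before": {"id": 2, "customer": "bob", "amount": 5}, "after": None},
]

print(materialize_totals(events))  # {'alice': 25}
```

After the update and the delete, only one group remains: the total for `alice` reflects the updated amount, and `bob` disappears because his only row was deleted.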

With a live demo, we will show how to use Flink SQL to capture change data from upstream MySQL and PostgreSQL databases, join the change data together and stream the result out to Elasticsearch for indexing. The entire demo will be based on pure SQL, without a single line of Java/Scala code.
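
The data flow of such a pipeline can be sketched in a few lines of plain Python. This is only a conceptual illustration, not the demo itself, which uses Flink SQL. The table names (`orders`, `shipments`), the columns, the join key `order_id` and the sample events are all hypothetical, and a dictionary stands in for the search index. The sketch also assumes that the join key of a row never changes. Each change event updates one side of the join, and the joined document for the affected key is then upserted into or deleted from the index.

```python
def apply_change(table, event, key="order_id"):
    """Apply one Debezium-style event to an in-memory table; return its key."""
    if event["op"] == "d":
        row = event["before"]
        table.pop(row[key], None)
    else:  # "c", "r" and "u" all leave the new row image in the table
        row = event["after"]
        table[row[key]] = row
    return row[key]


def refresh_index(index, orders, shipments, order_id):
    """Recompute the inner-join result for one key, then upsert or delete it."""
    if order_id in orders and order_id in shipments:
        index[order_id] = {**orders[order_id], **shipments[order_id]}
    else:
        index.pop(order_id, None)


orders, shipments, index = {}, {}, {}

# Hypothetical interleaved change events from the two source databases.
stream = [
    ("orders", {"op": "c", "before": None,
                "after": {"order_id": 1, "product": "scooter"}}),
    ("shipments", {"op": "c", "before": None,
                   "after": {"order_id": 1, "status": "packed"}}),
    ("shipments", {"op": "u",
                   "before": {"order_id": 1, "status": "packed"},
                   "after": {"order_id": 1, "status": "shipped"}}),
    ("orders", {"op": "c", "before": None,
                "after": {"order_id": 2, "product": "bike"}}),
]

for source, event in stream:
    table = orders if source == "orders" else shipments
    refresh_index(index, orders, shipments, apply_change(table, event))

print(index)
```

Order 1 ends up in the index with its latest shipment status, while order 2 is absent because it has no matching shipment yet. Keying the index by the join key means repeated updates overwrite the same document instead of creating duplicates.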

Lastly, we will close the session with an outlook on upcoming features around Flink SQL and Change Data Capture (CDC), as well as additional ecosystem connectors.

 

What you will learn

Through our session, you will get a clear understanding of the latest developments around Change Data Capture (CDC) with Apache Flink, and specifically Flink SQL. With our demo, participants will experience first-hand how easy it is to capture data changes from a database with Flink SQL.

Finally, you’ll learn some best practices around CDC and how to use Flink SQL as a powerful way to Extract, Transform and Load (ETL) data at the same time.


Make sure to secure your spot before October 1 by registering on the Flink Forward website. In addition to the conference, this Flink Forward event features six instructor-led training sessions covering both beginner and advanced Flink topics, such as:

  • Flink Development (2 days)

  • SQL Development (2 days)

  • Runtime & Operations (1 day)

  • Stateful Functions (1 day)

  • Tuning & Troubleshooting (introduction and advanced, 1 day each)


Grab your pass and virtually meet the Apache Flink community this October!
