Webinar
Data Lineage in Data Streaming

Data lineage is no longer a “nice to have” for modern engineering teams: it’s essential. With streaming technologies like Kafka and Flink powering critical real-time workloads, teams face growing complexity in understanding how data moves, transforms, and interrelates across systems. Debugging failures, ensuring compliance, and building trust in data can all be slow and painful without clear data lineage.
In the first half of the webinar, we'll focus on the blueprint. We will demonstrate how to build a unified, end-to-end lineage graph with OpenLineage, showing you how to stitch together complex Kafka, Flink, and Spark workloads. This provides the foundational visibility every modern data platform needs.
That technical foundation is crucial, because in the second half, we'll see the payoff. We will go deep into the business implications for the Customer Data Platform and show how this is operationalized at scale on the Factor House and Ververica Platforms.
What You’ll Discover:
- Why regulatory and compliance requirements in industries like banking and financial services make proven data lineage not just valuable, but mandatory
- The fundamentals of data lineage: what it is, why it matters, and how it supports compliance, governance, and operational trust
- How lineage differs in streaming environments compared to batch, and the unique challenges this creates
- Practical demonstrations of lineage in action across Kafka connectors, Flink jobs, and Spark pipelines
- Cross-technology best practices for operationalizing end-to-end lineage
Ben Gamble
Speaker
Ben Gamble, Director of Product Marketing: With a background in engineering leadership and entrepreneurship, Ben has led teams across logistics, gaming, and mobile apps, focusing on solutions involving GPS, AR, and multi-user collaboration.
Jaehyeon Kim

Speaker
Platform: Zoom
Time: 12:30 PM CEST
Date: Thursday, 2 October 2025