Webinar

Data Lineage in Data Streaming

Data lineage is no longer a “nice to have” for modern engineering teams: it’s essential. With streaming technologies like Kafka and Flink powering critical real-time workloads, teams face growing complexity in understanding how data moves, transforms, and interrelates across systems. Debugging failures, ensuring compliance, and building trust in data can all be slow and painful without clear data lineage.


In the first half of the webinar, we'll focus on the blueprint. We will demonstrate how to build a unified, end-to-end lineage graph with OpenLineage, showing you how to stitch together complex Kafka, Flink, and Spark workloads. This provides the foundational visibility every modern data platform needs.


That technical foundation is crucial, because in the second half, we'll see the payoff. We will go deep into the business implications for the Customer Data Platform and show how this is operationalized at scale on the Factor House and Ververica Platforms.



What You’ll Discover:

  • Why regulatory and compliance requirements in industries like banking and financial services make proven data lineage not just valuable, but mandatory
  • The fundamentals of data lineage: what it is, why it matters, and how it supports compliance, governance, and operational trust
  • How lineage differs in streaming environments compared to batch, and the unique challenges this creates
  • Practical demonstrations of lineage in action across Kafka via connectors, Flink jobs, and Spark pipelines
  • Cross-technology best practices for operationalizing end-to-end lineage

Ben Gamble

Speaker

Ben Gamble, Director of Product Marketing: With a background in engineering leadership and entrepreneurship, Ben has led teams across logistics, gaming, and mobile apps, focusing on solutions involving GPS, AR, and multi-user collaboration.

Jaehyeon Kim

Speaker

Jaehyeon Kim, Developer Experience (DevEX) Engineer: Jaehyeon leads Developer Experience at Factor House, where he builds tools and platforms that help engineering teams move quickly without sacrificing stability or governance. With expertise in real-time systems and modern data platforms, he has worked extensively with Kafka, Flink, Spark, and the integrated architectures showcased in Factor House Local. As a passionate engineer and writer, he shares practical insights on real-time analytics, data lineage, and building systems that remain resilient, observable, and maintainable at scale.



Register now

Platform: Zoom
Time: 12:30 PM CEST
Date: Thursday, 2 October 2025