Ververica
One query. Stream and table. Doesn't matter.

Streamhouse™. Dual pipelines are done.

The streaming lakehouse from the original creators of Apache Flink®. One architecture for streams and tables, real-time and historical, ingestion and analytics. One SQL engine. One governed source of truth.

What Is Streamhouse?

Streamhouse™ treats every stream as a queryable table and every table as a subscribable stream. One architecture runs ingestion, real-time processing, historical analytics, and AI inference on one governed set of data. Built on Apache Flink® for compute, Apache Fluss™ for low-latency streaming storage, and open table formats for the lake. Union Read crosses both tiers in a single query. Sub-second freshness when the business needs it. Lake economics when it does not.
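The idea that a stream can be read as a table, and a table as a stream, can be pictured with a toy model. The sketch below is a minimal in-memory illustration in Python, not Streamhouse™, Apache Flink®, or Apache Fluss™ code. The names `ChangeEvent`, `materialize`, and `changelog` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One entry in a changelog stream: an upsert, or a delete when value is None."""
    key: str
    value: Optional[int]


def materialize(stream: Iterable[ChangeEvent]) -> Dict[str, int]:
    """Stream -> table: fold the changelog into the latest value per key."""
    table: Dict[str, int] = {}
    for event in stream:
        if event.value is None:
            table.pop(event.key, None)  # delete; ignore keys that never existed
        else:
            table[event.key] = event.value
    return table


def changelog(before: Dict[str, int], after: Dict[str, int]) -> Iterator[ChangeEvent]:
    """Table -> stream: emit the changes that turn `before` into `after`."""
    for key in sorted(before.keys() - after.keys()):
        yield ChangeEvent(key, None)  # key disappeared: emit a delete
    for key in sorted(after):
        if before.get(key) != after[key]:
            yield ChangeEvent(key, after[key])  # new or changed key: emit an upsert


# Hypothetical sample data for illustration only.
events = [
    ChangeEvent("acct-1", 100),
    ChangeEvent("acct-2", 50),
    ChangeEvent("acct-1", 70),
    ChangeEvent("acct-2", None),
]
table = materialize(events)
replayed = materialize(changelog({}, table))
print(table)  # {'acct-1': 70}
```

Replaying the changelog of a table rebuilds the same table, which is the property that lets one dataset serve both subscribers and queries in this simplified picture.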

STREAMHOUSE™ STACK

| Layer | Technology |
| --- | --- |
| Compute | VERA / Apache Flink® |
| Streaming storage | Apache Fluss™ |
| Lakehouse storage | Any open table format (e.g. Iceberg) |
| Ingestion | Flink CDC |
| Query interface | Flink SQL |
| Governance | Built in |

One house. Four layers. Zero compromise.

Streamhouse Architecture

MATERIALIZED TABLES ON VERA

APPLICATION + COMPUTE: Declarative SQL. Freshness-driven refresh. VERA runs stream and batch on one engine, 2x faster than open-source Apache Flink®.

APACHE FLUSS™

STREAMING STORAGE: Low-latency columnar streaming storage. Real-time KV and log, columnar tables, lakehouse-native architecture.

OPEN TABLE FORMATS

LAKEHOUSE STORAGE: Your object store. Your format. Tiering and Union Read cross streaming and lake storage in one query.
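One way to picture a read that crosses both tiers: older rows have already been tiered into lake storage up to some log offset, and the streaming tier still holds the newest rows. The Python sketch below is a toy model under that assumption, not the actual Union Read implementation. `TieredTable`, `union_read`, and the offset bookkeeping are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[int, str]  # (log offset, payload)


@dataclass
class TieredTable:
    lake_rows: List[Row]   # rows already written to lake storage
    tiered_offset: int     # highest offset present in the lake (-1 if none)
    log_rows: List[Row]    # rows still held in the streaming tier


def union_read(table: TieredTable) -> List[Row]:
    """Return every row exactly once: lake rows plus log rows newer than the lake."""
    fresh_tail = [row for row in table.log_rows if row[0] > table.tiered_offset]
    return sorted(table.lake_rows + fresh_tail)


# Hypothetical sample data: offset 2 exists in both tiers because the
# streaming tier has not yet expired it after tiering.
orders = TieredTable(
    lake_rows=[(0, "order-a"), (1, "order-b"), (2, "order-c")],
    tiered_offset=2,
    log_rows=[(2, "order-c"), (3, "order-d"), (4, "order-e")],
)
result = union_read(orders)
print([offset for offset, _ in result])  # [0, 1, 2, 3, 4]
```

Filtering the log on the tiered offset is what keeps the overlap row from appearing twice, so the caller sees one continuous history without knowing which tier served each row.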

AUTOPILOT + WORKFLOW SCHEDULER

OPERATIONS: Autopilot scales parallelism in real time. The Workflow Scheduler triggers batch on a freshness threshold or cron. Deterministic capacity, no surprises.
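Freshness-driven refresh can be read as a small decision rule: a tight freshness target is served by a continuously running job, while a looser one is served by batch runs triggered when the data gets too old. The Python sketch below illustrates that rule only. It is not VERA or Workflow Scheduler code, and `CONTINUOUS_THRESHOLD`, `RefreshPlan`, `plan_refresh`, and `is_stale` are hypothetical names with an assumed 30-minute cutoff.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical cutoff: tighter freshness targets run continuously,
# looser ones are refreshed by scheduled batch runs.
CONTINUOUS_THRESHOLD = timedelta(minutes=30)


@dataclass(frozen=True)
class RefreshPlan:
    mode: str            # "continuous" or "scheduled"
    interval: timedelta  # zero for continuous, else time between batch runs


def plan_refresh(freshness: timedelta) -> RefreshPlan:
    """Pick a refresh mode from the declared freshness target."""
    if freshness <= timedelta(0):
        raise ValueError("freshness must be positive")
    if freshness < CONTINUOUS_THRESHOLD:
        return RefreshPlan("continuous", timedelta(0))
    return RefreshPlan("scheduled", freshness)


def is_stale(age: timedelta, freshness: timedelta) -> bool:
    """True when the data is older than its freshness target, so a batch run is due."""
    return age > freshness


fraud_plan = plan_refresh(timedelta(seconds=5))
report_plan = plan_refresh(timedelta(hours=1))
print(fraud_plan.mode, report_plan.mode)  # continuous scheduled
```

Declaring the target rather than the job type keeps the choice between streaming and batch an operational detail that can change without rewriting the query.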

What holds Streamhouse™ up.

Two stacks. Twice the cost. Half the answers.

Dual pipelines

Dual pipelines multiply everything that can break. Streaming for real-time. Lakehouse for analytics. Two pipelines. Two storage layers. Two query engines. Two governance models. Two on-call rotations. Data drifts. Costs compound. Engineers leave.

Slow lakehouses

Lakehouses are too slow for what comes next. Iceberg and Delta tables are minutes-to-hours fresh. Acceptable for BI. Fatal for fraud detection, real-time pricing, and agentic AI. Batch was never going to power the next decade.

Expensive streaming

Pure streaming is too expensive for history. Keeping 90 days of data hot in Kafka is a budget line item nobody wants. And Kafka is not queryable. You cannot SELECT against a topic. So you build a separate lake to compensate. Now you have two systems again.

Numbers. Not narratives.

Why you should have it.

WRITE PERFORMANCE: Faster writes than a traditional batch lakehouse.
QUERY PERFORMANCE: Faster analytical queries with streaming preprocessing.
COMPUTE THROUGHPUT: VERA against open-source Apache Flink®.
CHEAPER THAN KAFKA: Apache Fluss™ stateless compute and tiered storage.

Pick a stack. Pick a compromise.

| Feature | STREAMING ONLY | CLASSIC LAKEHOUSE | STREAMHOUSE™ |
| --- | --- | --- | --- |
| Freshness | Sub-second | Minutes to hours | Sub-second to seconds |
| Historical queries | Expensive replay | Yes | Yes, in the same query |
| Storage cost | High, hot retention | Low | Low, automatic tiering |
| Query interface | Custom consumers | SQL, batch only | Flink SQL, stream and batch |
| Governance | Separate from lake | Lake only | One control plane |
| Lock-in | Vendor runtime | Open formats | Open all the way down |

While the world buffers, we act

Real-time AI for a world in motion. Built on Streamhouse™. Built by the original creators of Apache Flink®.
