One query. Stream and table. Doesn't matter.

Streamhouse™. Dual pipelines are done.

The streaming lakehouse from the original creators of Apache Flink®. One architecture for streams and tables, real-time and historical, ingestion and analytics. One SQL engine. One governed source of truth.


What Is Streamhouse?

Streamhouse™ treats every stream as a queryable table and every table as a subscribable stream. One architecture runs ingestion, real-time processing, historical analytics, and AI inference on one governed set of data. Built on Apache Flink® for compute, Apache Fluss™ for low-latency streaming storage, and open table formats for the lake. Union Read crosses both tiers in a single query. Sub-second freshness when the business needs it. Lake economics when it does not.
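To make the stream-as-table idea and Union Read concrete, here is a minimal toy sketch in Python. It is not Streamhouse™, Apache Fluss™, or Flink code: `Change`, `apply_changes`, `union_read`, and the `tiered_up_to` offset are hypothetical names chosen for illustration, and the in-memory dict stands in for a lake table. It assumes a keyed changelog where older entries have already been tiered to the lake and newer ones still sit in streaming storage.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One changelog entry (hypothetical shape, for illustration only)."""
    offset: int            # position in the streaming log
    key: str
    value: Optional[int]   # None marks a delete


def apply_changes(table: dict[str, int], changes: list[Change]) -> dict[str, int]:
    """Fold a changelog into a keyed table: the stream viewed as a table."""
    result = dict(table)
    for change in sorted(changes, key=lambda c: c.offset):
        if change.value is None:
            result.pop(change.key, None)
        else:
            result[change.key] = change.value
    return result


def union_read(lake_snapshot: dict[str, int], tiered_up_to: int,
               log: list[Change]) -> dict[str, int]:
    """Combine the lake snapshot with log entries that were not tiered yet."""
    tail = [c for c in log if c.offset > tiered_up_to]
    return apply_changes(lake_snapshot, tail)


log = [
    Change(1, "a", 10),
    Change(2, "b", 20),
    Change(3, "a", 11),
    Change(4, "b", None),
    Change(5, "c", 30),
]

# Assumption: offsets 1..3 have already been tiered into the lake.
lake_snapshot = apply_changes({}, [c for c in log if c.offset <= 3])

# One read spans both tiers and matches a full replay of the log.
fresh_view = union_read(lake_snapshot, 3, log)
print(fresh_view)
```

In this toy model the combined read returns the same rows as replaying the whole log, while only the untiered tail has to stay in hot storage.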


STREAMHOUSE™ STACK

Layer	Technology
Compute	VERA / Apache Flink®
Streaming storage	Apache Fluss™
Lakehouse storage	Any open table format (e.g. Iceberg)
Ingestion	Flink CDC
Query interface	Flink SQL
Governance	Built in

One house. Four layers. Zero compromise.

[Figure: Streamhouse™ architecture]

MATERIALIZED TABLES ON VERA

APPLICATION + COMPUTE: Declarative SQL. Freshness-driven refresh. VERA runs stream and batch on one engine, 2x faster than open-source Apache Flink®.

APACHE FLUSS™

STREAMING STORAGE: Low-latency columnar streaming storage. Real-time KV and log, columnar tables, lakehouse-native architecture.

OPEN TABLE FORMATS

LAKEHOUSE STORAGE: Your object store. Your format. Tiering and Union Read cross streaming and lake storage in one query.

AUTOPILOT + WORKFLOW SCHEDULER

OPERATIONS: Autopilot scales parallelism in real time. The Workflow Scheduler triggers batch on a freshness threshold or cron. Deterministic capacity, no surprises.
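As a rough illustration of what triggering on a freshness threshold can mean, the sketch below marks a table as due once its last refresh is older than its freshness target. It is a simplified model, not the Workflow Scheduler's actual logic: `is_stale`, `tables_to_refresh`, and the table names and timestamps are all hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def is_stale(last_refresh: datetime, now: datetime, freshness: timedelta) -> bool:
    """True once the data is at least as old as the freshness target."""
    return now - last_refresh >= freshness


def tables_to_refresh(tables: dict[str, tuple[datetime, timedelta]],
                      now: datetime) -> list[str]:
    """Return the names of tables whose freshness target has been missed."""
    return [
        name
        for name, (last_refresh, freshness) in tables.items()
        if is_stale(last_refresh, now, freshness)
    ]


now = datetime(2024, 1, 1, 12, 0, 0)

# Hypothetical tables: (time of last refresh, freshness target).
tables = {
    "orders_hourly": (datetime(2024, 1, 1, 10, 30, 0), timedelta(hours=1)),
    "fraud_scores": (datetime(2024, 1, 1, 11, 59, 30), timedelta(minutes=1)),
}

due = tables_to_refresh(tables, now)
print(due)
```

Here `orders_hourly` is 90 minutes old against a one-hour target, so it is due; `fraud_scores` is 30 seconds old against a one-minute target, so it is not. A cron trigger would replace the staleness check with a fixed schedule.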

What holds Streamhouse™ up.

Two stacks. Twice the cost. Half the answers.

Dual pipelines

Dual pipelines multiply everything that can break. Streaming for real-time. Lakehouse for analytics. Two pipelines. Two storage layers. Two query engines. Two governance models. Two on-call rotations. Data drifts. Costs compound. Engineers leave.

Slow lakehouses

Lakehouses are too slow for what comes next. Data in Iceberg and Delta tables is typically minutes to hours old. Acceptable for BI. Fatal for fraud detection, real-time pricing, and agentic AI. Batch was never going to power the next decade.

Expensive streaming

Pure streaming is too expensive for history. Keeping 90 days of data hot in Kafka is a budget line item nobody wants. And Kafka on its own is not queryable. You cannot run a SELECT against a topic. So you build a separate lake to compensate. Now you have two systems again.

Numbers. Not narratives.

Why you should have it.

WRITE PERFORMANCE: Faster writes than a traditional batch lakehouse.
QUERY PERFORMANCE: Faster analytical queries with streaming preprocessing.
COMPUTE THROUGHPUT: 2x faster, VERA against open-source Apache Flink®.
CHEAPER THAN KAFKA: Apache Fluss™ stateless compute and tiered storage.

Pick a stack. Pick a compromise.

Feature	STREAMING ONLY	CLASSIC LAKEHOUSE	STREAMHOUSE™
Freshness	Sub-second	Minutes to hours	Sub-second to seconds
Historical queries	Expensive replay	Yes	Yes, in the same query
Storage cost	High, hot retention	Low	Low, automatic tiering
Query interface	Custom consumers	SQL, batch only	Flink SQL, stream and batch
Governance	Separate from lake	Lake only	One control plane
Lock-in	Vendor runtime	Open formats	Open all the way down

While the world buffers, we act

Real-time AI for a world in motion. Built on Streamhouse™. Built by the original creators of Apache Flink®.
