Real-time AI in SQL and Expert-Level Control In Ververica Cloud and BYOC

Jaime López

Director of Product Excellence

6 min read

Ververica created Apache Flink®. We wrote the standard. And for over a decade, we have been solving what others couldn't: the edge cases that break other data platforms. Now, we’re raising the bar again with new features for our Fully-Managed Cloud and BYOC deployments of Ververica’s Unified Streaming Data Platform: AI-Native SQL. A more robust lakehouse ingestion path. Operator-level control that puts power in the hands of production teams.

This release sets the new baseline for data streaming platforms: governed, absolute truth, delivered at the speed of now for cloud deployments.

SQL Becomes AI-Native

Calling AI from a streaming pipeline used to mean writing UDFs, managing HTTP clients, and handling retries by hand. The updated VERA engine introduces seven new AI functions that run natively in Flink SQL. Configure your model provider once. The pipeline does the rest.

Seven functions, zero custom plumbing:

  • AI_CLASSIFY routes text into categories of your choosing with a confidence score.
  • AI_SENTIMENT scores polarity and labels input as positive, negative, or neutral.
  • AI_EXTRACT pulls structured fields from unstructured text against a JSON Schema you define.
  • AI_SUMMARIZE compresses long text to a length you specify.
  • AI_EMBED turns text into vectors for similarity and semantic retrieval.
  • AI_TRANSLATE translates between more than 10 languages with automatic source-language detection.
  • AI_MASK identifies and masks sensitive fields automatically. In stream.

The last one is particularly important for regulated workloads serving industries like finance. With AI_MASK, sensitive data gets redacted in motion, not after a batch runs hours later. That’s the difference between a fully compliant pipeline and a violation waiting to be audited.

For years, batch and streaming engines have made you choose: approximate counts that are fast and wrong, or exact counts that are slow and expensive. Ververica refuses this trade. A native Bitmap type with cardinality functions delivers exact deduplication in stream. Unique visitors, distinct events, audience overlap. All counted, not estimated, at speed.
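To make the idea of exact counting with a bitmap concrete, here is a minimal Python sketch. It is an illustration of the general technique only, not Ververica's Bitmap type or its cardinality functions: the `Bitmap` class, its method names, and the sample visitor IDs are all assumptions made for this example, and it presumes each visitor has already been mapped to a non-negative integer ID.

```python
class Bitmap:
    """Exact set of non-negative integer IDs, stored as the bits of one int.

    Hypothetical illustration; not the VERA engine's Bitmap type.
    """

    def __init__(self, bits: int = 0) -> None:
        self.bits = bits

    def add(self, item_id: int) -> None:
        if item_id < 0:
            raise ValueError("IDs must be non-negative")
        # Setting the same bit twice changes nothing, so duplicates are free.
        self.bits |= 1 << item_id

    def cardinality(self) -> int:
        # Number of set bits equals the number of distinct IDs seen.
        return bin(self.bits).count("1")

    def intersect(self, other: "Bitmap") -> "Bitmap":
        return Bitmap(self.bits & other.bits)

    def union(self, other: "Bitmap") -> "Bitmap":
        return Bitmap(self.bits | other.bits)


# Made-up visitor IDs for two days, including repeats.
monday = Bitmap()
for visitor_id in [3, 7, 7, 42, 3]:
    monday.add(visitor_id)

tuesday = Bitmap()
for visitor_id in [7, 42, 99]:
    tuesday.add(visitor_id)

unique_monday = monday.cardinality()               # distinct visitors on Monday
overlap = monday.intersect(tuesday).cardinality()  # visitors seen on both days
total = monday.union(tuesday).cardinality()        # distinct visitors overall
```

Because every ID owns exactly one bit, the counts are exact rather than estimated, and overlap and union reduce to bitwise operations. Production bitmap implementations typically compress sparse ranges; this sketch does not.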

  • Postgres and MongoDB sources get full YAML support: Apache Paimon™ is now a first-class source inside Apache Flink Change Data Capture (CDC) YAML, closing the loop on the Streamhouse™ pattern.
  • Catalog-aware references: YAML jobs can reference source and sink tables already registered in your catalog. No more duplicating definitions across pipeline configs and catalog entries.
  • Regex-based routing: Standard regular expressions now drive Route table logic. Table merging and database sharding stop being a custom code problem.
  • Dirty data handling: JSON parsing now isolates malformed records instead of killing the job. Empty schema fault tolerance lets Apache Kafka® CDC YAML create tables with empty schemas when no source data is present. Jobs start. They wait. They do not fail before they begin.

These are not vanity updates. They are the changes that decide whether your pipeline survives its first production weekend.
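Two of the behaviors listed above, regex-based routing and dirty data handling, can be sketched in a few lines of Python. This is a conceptual illustration, not Flink CDC YAML syntax or its internals: the route patterns, table names, and the `route` and `parse_records` helpers are all hypothetical.

```python
import json
import re

# Hypothetical route rules: (pattern for the source table, sink table).
# Sharded source tables collapse into one sink table per pattern.
ROUTES = [
    (re.compile(r"shop_\d+\.orders"), "lake.orders"),
    (re.compile(r"shop_\d+\.customers_\d{4}"), "lake.customers"),
]


def route(source_table: str):
    """Return the sink table for a source table, or None if no rule matches."""
    for pattern, sink in ROUTES:
        if pattern.fullmatch(source_table):
            return sink
    return None


def parse_records(raw_lines):
    """Parse JSON lines, setting malformed ones aside instead of raising."""
    good, dirty = [], []
    for line in raw_lines:
        try:
            good.append(json.loads(line))
        except json.JSONDecodeError:
            dirty.append(line)  # isolated for later inspection
    return good, dirty


# The second record is truncated on purpose.
good, dirty = parse_records(['{"id": 1}', '{"id": 2', '{"id": 3}'])
```

The design point is the same in both helpers: an unexpected input (an unmatched table, a malformed record) produces a value the caller can act on rather than an exception that stops the job.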

Connector Enhancements

This release includes new connectors for CosmosDB and MongoDB Catalog. The Kafka Canal-JSON format now parses source-database index events and exposes raw key and value as metadata, and Flink removes redundant consumer groups. The Paimon Sink supports cross-partition upsert. MongoDB sinks support partial update with better ObjectId handling. Redis gets stronger cluster mode and batched writes. Elasticsearch picks up explicit doc_as_upsert and a connection timeout parameter.

Operational Depth For Those Who Know What Difficult Looks Like

Teams that run streaming jobs at scale need per-operator control to tune their deployments. The VERA engine now provides it.

  • Expert Mode for resource allocation: Dynamic resource allocation now supports per-Slot-Sharing-Group configuration. For complex deployments, this is the difference between linear scaling and a wall at 60% utilization.
  • Operator-level state TTL: A single uniform TTL punishes both ends of the spectrum. Fast operators bloat state. Slow operators lose history. Now you configure TTL per operator. Logistics data keeps its memory. High-velocity operators stay lean.
  • State compatibility for SQL deployments: Resume from the latest state and VERA detects schema and topology changes automatically. The results? Less downtime. Less re-bootstrapping. Less manual surgery.
  • JDK 17 support arrives in VERA, making it available across all deployment modes. Pick the new vera-4.5-jdk17-flink-1.20 engine version when your operators, libraries, and downstream tooling are ready.
Figure One: State Compatibility for SQL Deployments
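The case for operator-level state TTL is easier to see with a toy model. The Python sketch below is an assumption-laden illustration, not VERA configuration: the `TtlState` class, the operator names, and the TTL values are invented for the example, and time is passed in explicitly so the behavior is deterministic.

```python
from typing import Any, Dict, Optional, Tuple


class TtlState:
    """Keyed state whose entries expire ttl_seconds after they were written."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def put(self, key: str, value: Any, now: float) -> None:
        self._entries[key] = (value, now)

    def get(self, key: str, now: float) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, written_at = entry
        if now - written_at >= self.ttl_seconds:
            del self._entries[key]  # expired: drop it to keep state small
            return None
        return value


# Hypothetical operators with different retention needs.
state_by_operator = {
    "clickstream_dedup": TtlState(ttl_seconds=60),            # high velocity
    "shipment_join": TtlState(ttl_seconds=7 * 24 * 3600),     # slow, long-lived
}

for state in state_by_operator.values():
    state.put("k1", "v", now=0.0)

one_hour = 3600.0
click_value = state_by_operator["clickstream_dedup"].get("k1", now=one_hour)
shipment_value = state_by_operator["shipment_join"].get("k1", now=one_hour)
```

After one hour the high-velocity operator has already discarded its entry while the slow one still holds it. A single shared TTL would force one of the two to behave badly, either by holding state far longer than it needs or by dropping history it still depends on.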

SQL Console

Until now, ad-hoc SQL queries in Ververica’s Managed Cloud and BYOC deployments meant creating a streaming job, waiting for it to start, and reading the result from the deployment logs. The new SQL Console gives ad-hoc work a dedicated environment. Run CALL, DDL, DQL, DML, and EXPLAIN in the same place. Create and manage catalogs and tables. Operate on Paimon tables directly. Inspect execution plans without leaving the script.

Figure Two: Ververica Platform SQL Scripts editor showing a CreateCatalog script with Azure Data Lake Storage configuration

It also closes the gap with Apache Spark™ and Delta Lake on data lake operations. For teams using SQL as their primary control surface, the operator running an incident at midnight no longer needs a deploy to ask a question.

This is the New Baseline

Real-time AI belongs in SQL. The lakehouse belongs in the runtime. Control belongs with the operator.

For regulated industries, the math is simple. You need all of your data governed, masked, retained, and queryable. This release provides it all.
