Intel’s distributed Model Inference platform presented at Flink Forward

September 21, 2020 | by Dongjie Shi & Jiaming Song

Flink Forward Global Virtual Conference 2020 is kicking off next month, and the Flink community is getting ready to discuss the future of stream processing and Apache Flink. This time, the conference is free to attend (October 21-22, 2020), and the organizers have put together six hands-on, instructor-led training sessions on October 19 & 20 that cover multiple aspects of Apache Flink, such as: 

  • Flink Development (2 days)

  • SQL Development (2 days)

  • Runtime & Operations (1 day)

  • Stateful Functions (1 day)

  • Tuning & Troubleshooting (introduction and advanced, 1 day each)

We feel very lucky to be presenting how the team at Intel utilizes Apache Flink in our Cluster Serving service inside Intel’s Analytics Zoo. Read on for a sneak preview of our Flink Forward session, Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo, on October 21, 2020. If you haven’t done so already, make sure to register and secure your spot before October 1, and get ready to hear some interesting talks around the latest technology developments and Flink use cases from companies across different industries, sizes and locations!

Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo

As deep learning projects evolve from experimentation to production, there is increasing demand to deploy deep learning models for large-scale, real-time distributed inference. While many tools are available for individual tasks, such as model optimization, model serving, cluster scheduling and cluster management, deep learning engineers and scientists still face the challenging process of deploying and managing distributed inference workflows that can scale out to large clusters in an intuitive and transparent way.

What we are going to cover

Our session is going to introduce the two major areas behind the successful delivery of model serving: Big Data coupled with Machine Learning. Once a model is trained, serving it in an intuitive way becomes an important task in building Machine Learning pipelines. In a model serving scenario, the two major benchmarks to look at are latency and throughput. 
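As a toy illustration of these two benchmarks (not part of Cluster Serving itself), the snippet below measures average per-request latency and overall throughput for an arbitrary serving function:

```python
import time

def measure(serve_fn, requests):
    """Toy benchmark of the two serving metrics named above:
    latency (average time per request) and throughput (requests per second)."""
    start = time.perf_counter()
    latencies = []
    for r in requests:
        t0 = time.perf_counter()
        serve_fn(r)  # stand-in for a model inference call
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return sum(latencies) / len(latencies), len(requests) / elapsed

# A trivial "model" that doubles its input, just to exercise the harness.
avg_latency, throughput = measure(lambda r: r * 2, list(range(100)))
print(f"avg latency: {avg_latency:.6f}s, throughput: {throughput:.0f} req/s")
```

In a real deployment these numbers are usually in tension: batching requests improves throughput but adds queueing delay, which is exactly the tradeoff Cluster Serving balances.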

To address the demand for extremely low-latency model serving in Machine Learning pipelines, we developed Cluster Serving: an automatic, distributed serving solution in Intel’s Analytics Zoo. Cluster Serving takes advantage of Flink’s streaming runtime and its low-latency, continuous processing engine. Similarly, to address the demand for high throughput, Cluster Serving implements batch processing on Flink’s unified data processing engine. In addition, Cluster Serving supports a wide range of deep learning models, including TensorFlow, PyTorch, Caffe, BigDL, and OpenVINO. Our model serving solution exposes a simple publish-subscribe (pub/sub) API that allows users to easily send their inference requests to the input queue using a simple Python or HTTP API.
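To illustrate the pub/sub pattern described above, here is a minimal, self-contained sketch in Python. The in-memory queue and the JSON payload format are stand-ins we invented for this example; the real service uses a Redis-backed queue and the Analytics Zoo client API:

```python
import base64
import json
import queue

# In-memory stand-in for the input queue that Cluster Serving's
# pub/sub API publishes inference requests to (illustrative only).
input_queue = queue.Queue()

def enqueue(request_id, tensor_bytes):
    """Client side: publish an inference request as a serialized payload."""
    payload = json.dumps({
        "id": request_id,
        "data": base64.b64encode(bytes(tensor_bytes)).decode(),
    })
    input_queue.put(payload)

def dequeue():
    """Serving side: subscribe and pull the next request off the queue."""
    payload = json.loads(input_queue.get())
    return payload["id"], list(base64.b64decode(payload["data"]))

enqueue("img-001", [1, 2, 3])
req_id, data = dequeue()
print(req_id, data)  # img-001 [1, 2, 3]
```

The key property of this design is decoupling: clients only need to know how to publish to the queue, while the Flink job consumes from it at its own pace and can batch requests for throughput.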

In our session, we are going to introduce Cluster Serving and its architecture to the Flink Forward audience and discuss the underlying design patterns and tradeoffs of deploying and managing deep learning models on distributed, Big Data and unified data processing engines in production. We will additionally showcase some real-world use cases, experiences and examples from users who have adopted Cluster Serving to develop and deploy their distributed inference workflows. 


What you will learn

In our session you will get a deep understanding of Cluster Serving’s implementation design around Apache Flink and its core features. Additionally, we are going to cover some integrations, such as our Redis data pipeline, as well as different API designs, before we open the floor for a discussion and opinion sharing with the audience.

By attending our session, you will get a hands-on introduction to Cluster Serving in Intel’s Analytics Zoo and how it addresses distributed, real-time model serving at large scale. You will also see real-world use cases of how companies utilize our platform, and you will see first-hand how Apache Flink is utilized under the hood to support distributed model inference and serving. Some key learnings from our session will be how to: 

  1. Parallelize expensive operations

  2. Use data pipelines like message queues for parallel sources

  3. Minimize data transfers in a Machine Learning pipeline with Apache Flink
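The first two learnings above can be sketched in plain Python (a conceptual stand-in: in Cluster Serving the batching and parallelism happen inside Flink operator instances, not Python threads):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def expensive_inference(batch):
    """Stand-in for a model forward pass over a batch of requests."""
    time.sleep(0.01)  # simulate per-call overhead that batching amortizes
    return [x * 2 for x in batch]

requests = list(range(32))

# 1. Batch requests to amortize per-call overhead (throughput).
batches = [requests[i:i + 8] for i in range(0, len(requests), 8)]

# 2. Run batches in parallel, mirroring parallel sources/operators
#    consuming from a message queue.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = [y for batch_out in pool.map(expensive_inference, batches)
               for y in batch_out]

print(results[:4])  # [0, 2, 4, 6]
```

Note that `pool.map` preserves input order, so results line up with requests; in a streaming engine, keeping results correlated with request IDs (as in the pub/sub payload) serves the same purpose.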




Registration to the Flink Forward Global Virtual Conference 2020 is free and you can join from anywhere in the world! We look forward to virtually meeting the Apache Flink community at the end of October!
