Intel’s distributed Model Inference platform presented at Flink Forward

September 21, 2020 | by Dongjie Shi & Jiaming Song

Flink Forward Global Virtual Conference 2020 kicks off next month, and the Flink community is getting ready to discuss the future of stream processing and Apache Flink. This time, the conference is free to attend (October 21-22, 2020), and the organizers have put together six hands-on, instructor-led training sessions on October 19 & 20 that cover multiple aspects of Apache Flink:

  • Flink Development (2 days)

  • SQL Development (2 days)

  • Runtime & Operations (1 day)

  • Stateful Functions (1 day)

  • Tuning & Troubleshooting (introduction and advanced, 1 day each)

We feel very lucky to be presenting how the team at Intel utilizes Apache Flink in the Cluster Serving service inside Intel’s Analytics Zoo. Read on for a sneak preview of our Flink Forward session, Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo, on October 21, 2020. If you haven’t done so already, make sure to register and secure your spot before October 1, and get ready to hear some interesting talks about the latest technology developments and Flink use cases from companies across different industries, company sizes and locations!

Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo

As deep learning projects evolve from experimentation to production, there is increased demand to deploy deep learning models for large scale, real time distributed inference. While there are many available tools for different tasks, such as model optimization, model serving, cluster scheduling, cluster management and more, deep learning engineers and scientists are still faced with the challenging process of deploying and managing distributed inference workflows that can scale out to large clusters in an intuitive and transparent way.

What we are going to cover

Our session is going to introduce the two pillars of successful model serving: Big Data coupled with Machine Learning. Once a model is trained, serving it becomes an important task in building Machine Learning pipelines. In a model serving scenario, the two major benchmarks to look at are latency and throughput.

To address the demand for extremely low latency model serving in Machine Learning pipelines, we developed Cluster Serving: an automated, distributed serving solution in Intel’s Analytics Zoo. Cluster Serving takes advantage of Flink’s low-latency, continuous stream processing runtime. Likewise, to address the demand for high throughput, Cluster Serving implements batch processing on Flink’s unified data processing engine. In addition, Cluster Serving supports models from a wide range of deep learning frameworks, such as TensorFlow, PyTorch, Caffe, BigDL, and OpenVINO. Our model serving solution exposes a simple publish-subscribe (pub/sub) API that allows users to easily send their inference requests to the input queue using a simple Python or HTTP API.
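To make the pub/sub flow concrete, here is a minimal in-process sketch of the pattern: clients publish inference requests to an input queue and poll for results. Note this is purely illustrative and is not the actual Cluster Serving API; the real system places a distributed queue between clients and the Flink job, and the `InferenceClient` and `serve` names below are hypothetical.

```python
import queue
import threading

input_queue = queue.Queue()   # clients publish inference requests here
output_store = {}             # results keyed by request id

def serve(model):
    """Consume requests from the input queue and publish results."""
    while True:
        request_id, features = input_queue.get()
        if request_id is None:          # sentinel value: shut down
            break
        output_store[request_id] = model(features)

class InferenceClient:
    """Thin client that publishes requests and polls for results."""
    def __init__(self, q, store):
        self.q, self.store = q, store
        self._next_id = 0

    def enqueue(self, features):
        self._next_id += 1
        self.q.put((self._next_id, features))
        return self._next_id

    def get_result(self, request_id):
        while request_id not in self.store:
            pass                        # busy-wait; fine for a sketch
        return self.store[request_id]

# Usage: a stand-in "model" that sums its inputs, served on a background thread.
worker = threading.Thread(target=serve, args=(sum,))
worker.start()
client = InferenceClient(input_queue, output_store)
rid = client.enqueue([1.0, 2.0, 3.0])
result = client.get_result(rid)
input_queue.put((None, None))           # stop the server
worker.join()
```

The key property of the pattern is that the queue decouples clients from the serving backend, so the backend can be scaled out (here, more threads; in Cluster Serving, a Flink cluster) without any client-side changes.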

In our session we are going to introduce Cluster Serving and its architecture design to the Flink Forward audience and discuss the underlying design patterns and tradeoffs to deploy and manage deep learning models to distributed, Big Data and unified data processing engines in production. We will additionally showcase some real-world use cases, experiences and examples from our users that have adopted Cluster Serving to develop and deploy their distributed inference workflows. 


What you will learn

In our session you will get a deep understanding of Cluster Serving’s implementation design around Apache Flink and its core features. Additionally, we are going to cover some integrations, such as our Redis data pipeline, as well as different API designs, before we open the floor for discussion and opinion sharing with the audience.

By attending our session, you will get a hands-on introduction to Cluster Serving in Intel’s Analytics Zoo and how it addresses distributed, real-time model serving at large scale. You will also see real-world use cases of how companies utilize our platform, and you will see first-hand how Apache Flink is utilized under the hood to support distributed model inference and serving. Some key learnings from our session will be how to:

  1. Parallelize expensive operations

  2. Use data pipelines like message queues for parallel sources

  3. Minimize data transfers in a Machine Learning pipeline with Apache Flink
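The first two ideas can be sketched in a few lines: treat model inference as an expensive per-batch operation, drain requests from a queue-like source, group them into batches for throughput, and fan the batches out across parallel workers. This is a conceptual illustration only, not Flink or Cluster Serving code; `predict` is a hypothetical stand-in for a real model.

```python
from concurrent.futures import ThreadPoolExecutor

def predict(batch):
    """Pretend model: score each input (here, just double it)."""
    return [2 * x for x in batch]

def batched(items, batch_size):
    """Group a stream of requests into fixed-size batches for throughput."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Requests as drained from an input queue.
requests = list(range(10))

# Fan batches out across parallel workers; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = [r
               for batch_result in pool.map(predict, batched(requests, 4))
               for r in batch_result]
```

Batching trades a little latency for much higher throughput, since each call to the model amortizes its fixed overhead over many records; Flink's runtime applies the same idea at cluster scale.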




Registration to the Flink Forward Global Virtual Conference 2020 is free and you can join from anywhere in the world! We look forward to virtually meeting the Apache Flink community at the end of October!

