Intel’s distributed Model Inference platform presented at Flink Forward

September 21, 2020 | by Dongjie Shi & Jiaming Song

Flink Forward Global Virtual Conference 2020 kicks off next month, and the Flink community is getting ready to discuss the future of stream processing and Apache Flink. This time, the conference (Oct. 21-22, 2020) is free to attend, and the organizers have put together six hands-on, instructor-led training sessions on Oct. 19 & 20 that cover multiple aspects of Apache Flink:

  • Flink Development (2 days)

  • SQL Development (2 days)

  • Runtime & Operations (1 day)

  • Stateful Functions (1 day)

  • Tuning & Troubleshooting (introduction and advanced, 1 day each)

We feel very lucky to be presenting how the team at Intel uses Apache Flink in the Cluster Serving service inside Intel’s Analytics Zoo. Read on for a sneak preview of our Flink Forward session, Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo, on October 21, 2020. If you haven’t done so already, make sure to register and secure your spot before October 1, and get ready to hear some interesting talks about the latest technology developments and Flink use cases from companies of different industries, sizes and locations!

Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo

As deep learning projects evolve from experimentation to production, there is increased demand to deploy deep learning models for large-scale, real-time distributed inference. While there are many tools available for individual tasks, such as model optimization, model serving, cluster scheduling and cluster management, deep learning engineers and scientists still face the challenging process of deploying and managing distributed inference workflows that scale out to large clusters in an intuitive and transparent way.

What we are going to cover

Our session will introduce the two major areas behind the successful delivery of model serving: Big Data and Machine Learning. Once a model is trained, serving it in an intuitive way becomes an important task in building Machine Learning pipelines. In a model serving scenario, the two major metrics to look at are latency and throughput.

To address the demand for extremely low-latency model serving in Machine Learning pipelines, we developed Cluster Serving: an automatic and distributed serving solution in Intel’s Analytics Zoo. Cluster Serving takes advantage of Flink's streaming runtime, with its low-latency, continuous processing engine. Similarly, to address the demand for high throughput, Cluster Serving implements batch processing on Flink’s unified data processing engine. In addition, Cluster Serving supports a wide range of deep learning models, including TensorFlow, PyTorch, Caffe, BigDL, and OpenVINO models. Our model serving solution exposes a simple publish-subscribe (pub/sub) API that lets users send their inference requests to the input queue through a simple Python or HTTP API.
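The interplay between an input queue, latency and throughput can be illustrated with a minimal sketch. This is not the Cluster Serving API: `fake_model`, `serve` and `run` are hypothetical names, and an in-process Python queue stands in for the real input queue. A worker drains the queue in batches of up to `batch_size`; a batch size of 1 favors latency, while a larger batch size favors throughput.

```python
import queue
import threading


def fake_model(batch):
    """Hypothetical stand-in for a deep learning model: doubles each input."""
    return [2 * x for x in batch]


def serve(input_queue, results, batch_size):
    """Consume (request_id, value) pairs and store predictions by request id.

    A `None` item is used as a shutdown signal (an assumption of this sketch).
    """
    while True:
        first = input_queue.get()  # block until at least one request arrives
        if first is None:
            return
        batch = [first]
        stop = False
        # Add requests that are already waiting, up to batch_size in total.
        while len(batch) < batch_size:
            try:
                item = input_queue.get_nowait()
            except queue.Empty:
                break
            if item is None:
                stop = True
                break
            batch.append(item)
        ids = [request_id for request_id, _ in batch]
        outputs = fake_model([value for _, value in batch])
        results.update(zip(ids, outputs))
        if stop:
            return


def run(requests, batch_size):
    """Publish requests to the queue and return the collected predictions."""
    input_queue = queue.Queue()
    results = {}
    worker = threading.Thread(target=serve, args=(input_queue, results, batch_size))
    worker.start()
    for request_id, value in requests:
        input_queue.put((request_id, value))
    input_queue.put(None)  # signal shutdown after the last request
    worker.join()
    return results
```

Calling `run([("a", 1), ("b", 2)], batch_size=2)` returns one prediction per request id. How the requests are grouped into batches depends on timing, but the predictions themselves do not.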

In our session we will introduce Cluster Serving and its architecture to the Flink Forward audience and discuss the underlying design patterns and tradeoffs involved in deploying and managing deep learning models on distributed, unified Big Data processing engines in production. We will also showcase real-world use cases, experiences and examples from users who have adopted Cluster Serving to develop and deploy their distributed inference workflows.


What you will learn

In our session you will get a deep understanding of how Cluster Serving is designed and implemented around Apache Flink and its core features. We will also cover some integrations, such as our Redis data pipeline, as well as different API designs, before we open the floor for discussion and opinion sharing with the audience.

By attending our session, you will get a hands-on introduction to Cluster Serving in Intel’s Analytics Zoo and how it addresses distributed, real-time model serving at large scale. You will also see real-world use cases of how companies use our platform, and you will see first-hand how Apache Flink is used under the hood to support distributed model inference and serving. Some key learnings from our session will be how to:

  1. Parallelize expensive operations

  2. Use data pipelines like message queues for parallel sources

  3. Minimize data transfers in a Machine Learning pipeline with Apache Flink
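As a rough illustration of the first two points, the following sketch splits incoming records into partitions, in the way a partitioned message queue feeds parallel sources, and runs a stand-in inference function on each partition in parallel. All names here (`partition`, `expensive_inference`, `parallel_inference`) are hypothetical and not part of Cluster Serving or Flink. Python threads are used only to show the structure; they do not speed up CPU-bound pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, parallelism):
    """Split records round-robin into `parallelism` partitions."""
    return [records[i::parallelism] for i in range(parallelism)]


def expensive_inference(partition_records):
    """Hypothetical stand-in for model inference on one partition.

    Each record is (record_id, features); the "prediction" is the feature sum.
    Only the id and the small result leave the worker, not the input features.
    """
    return [(record_id, sum(features)) for record_id, features in partition_records]


def parallel_inference(records, parallelism=4):
    """Run the stand-in inference on every partition in parallel."""
    parts = partition(records, parallelism)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        outputs = list(pool.map(expensive_inference, parts))
    # Merge the per-partition results into one mapping of id to prediction.
    return {record_id: value for part in outputs for record_id, value in part}
```

For example, `parallel_inference([("r0", [1, 2]), ("r1", [3, 4])], parallelism=2)` handles each record in its own partition and merges the results by record id.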



Registration for the Flink Forward Global Virtual Conference 2020 is free and you can join from anywhere in the world! We look forward to virtually meeting the Apache Flink community at the end of October!
