Advanced Inference Design Patterns

Last modified: April 18, 2024

1 Introduction

The Integrating Models with Pre-processors and Post-processors section of Integrate Machine Learning Models outlines considerations when importing a machine learning model with advanced processing needs. What are the standards for these models, and what do they look like?

This document explores four common advanced inference design patterns for machine learning models. These include the following:

1.1 Ensembles

Ensemble models are useful when a dataset has high variance, or when it has many features but relatively few data points. An ensemble is a machine learning approach that combines multiple other models, called base estimators, in the prediction process, which helps overcome the technical challenges of building a single estimator. In this approach, the same data points are sent to a group of models, and all of their predictions are then collected to find the best prediction.

You can create ensemble models in Mendix by building a separate microflow for each model and then combining the predictions in another microflow. The example shows an ensemble of two models.

Figure: An example of a domain model for an ensemble model.

Figure: A sample microflow for an ensemble model.
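To make the combining step concrete, here is a minimal Python sketch of an ensemble of two models. It is not Mendix code: `model_a` and `model_b` are hypothetical stand-ins for the two model microflows, and averaging (or majority voting for classifiers) is only one assumed way to pick the final prediction.

```python
from collections import Counter
from statistics import mean
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


def model_a(features: Sequence[float]) -> float:
    # Hypothetical base estimator; stands in for the first model microflow.
    return 0.5 * features[0] + 0.5 * features[1]


def model_b(features: Sequence[float]) -> float:
    # Hypothetical base estimator; stands in for the second model microflow.
    return 0.25 * features[0] + 0.75 * features[1]


def ensemble_average(models: Sequence[Model], features: Sequence[float]) -> float:
    """Send the same data point to every model and average the predictions."""
    return mean(model(features) for model in models)


def ensemble_vote(labels: Sequence[str]) -> str:
    """Majority vote for classifiers; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


prediction = ensemble_average([model_a, model_b], [2.0, 4.0])
```

Averaging suits regression outputs, while voting suits class labels. Which combination rule works best depends on the base estimators and is a design choice, not something this pattern prescribes.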

1.2 Cascaded Inference

The Cascaded Inference pattern feeds the output of one model into another, forming a cascade.

It is used to compensate for model bias or incomplete data: a second predictor makes up for the shortcomings of the first. A potential implementation closely resembles a graphical representation of the pattern itself:

Figure: An example of a cascaded inference microflow.

A model pre-processor makes some data available for the first model, and the output is injected into the second model as an input. Ultimately, that output is used for the final prediction.
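The flow described above can be sketched in a few lines of Python. This is an illustrative sketch only: the feature names, `first_model`, and `second_model` are hypothetical, and the formulas are placeholders for real model calls.

```python
from typing import Dict


def preprocess(raw: Dict[str, str]) -> Dict[str, float]:
    """Pre-processor: turn raw input into numeric features for the first model."""
    return {"area": float(raw["area"]), "rooms": float(raw["rooms"])}


def first_model(features: Dict[str, float]) -> float:
    # Hypothetical first model: derives a value the raw data does not contain.
    return features["area"] / features["rooms"]


def second_model(features: Dict[str, float], first_output: float) -> float:
    # Hypothetical second model: takes the first model's output as an input.
    return 1000.0 * features["area"] + 500.0 * first_output


def cascaded_inference(raw: Dict[str, str]) -> float:
    """Run pre-processor, first model, then second model, in that order."""
    features = preprocess(raw)
    intermediate = first_model(features)
    return second_model(features, intermediate)


final_prediction = cascaded_inference({"area": "80", "rooms": "4"})
```

The key property is the ordering: the second model cannot run until the first has produced its output, so the two calls are sequential rather than parallel as in an ensemble.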

1.3 Machine Learning MaaS (Model as a Service)

A common pattern in machine learning deployment is to expose the model through a microservice or other service. While Studio Pro supports monolithic applications, with their security and speed advantages, you can also create a microservice: the server app publishes a REST service, and client apps call that service.

In this way, the AI-powered smart app can be split into two Mendix apps: one to host the ML model, and one to process and use the predictions. This is a good approach for use cases where the ML model is complex and requires heavy computing power, or when the ML model is owned and maintained by another team. Another advantage is that you can update the ML model without redeploying the Mendix client app.

Below is an example of such a deployment. Instead of storing the variable after predicting the elements in an image, the microflow encodes it as JSON and then publishes it.

Figure: Sample microflow for a MaaS deployment, as explained in the paragraph above.
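The server and client roles can be illustrated with a small Python sketch built on the standard library. It is an analogy for the two Mendix apps, not Mendix code: `predict`, the `/predict` path, and the JSON field names `image` and `predictions` are all assumptions made for this example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List


def predict(image_name: str) -> List[Dict[str, Any]]:
    """Hypothetical placeholder for the hosted model's inference step."""
    return [{"label": "cat", "score": 0.9}]


def encode_predictions(predictions: List[Dict[str, Any]]) -> bytes:
    """Server side: encode the predictions as JSON instead of storing them."""
    return json.dumps({"predictions": predictions}).encode("utf-8")


def decode_predictions(body: bytes) -> List[Dict[str, Any]]:
    """Client side: read the predictions back out of the JSON response."""
    return json.loads(body.decode("utf-8"))["predictions"]


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        body = encode_predictions(predict(request["image"]))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: Any) -> None:
        pass  # Keep the sketch quiet; a real service would log requests.


def call_service(url: str, image_name: str) -> List[Dict[str, Any]]:
    """Client app: call the published service and decode its answer."""
    payload = json.dumps({"image": image_name}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}, method="POST"
    )
    # Bypass any configured proxy so the local demo call stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return decode_predictions(response.read())


def demo() -> List[Dict[str, Any]]:
    """Start the service on a free local port, call it once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), PredictionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/predict"
        return call_service(url, "photo.jpg")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` runs both roles in one process for illustration. In the split deployment, only the JSON contract is shared between the two apps, which is what allows the model to be updated without touching the client.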

1.4 Batch Inference

A common pattern for machine learning applications is batch inference: running multiple inferences with a single request to the model. This is a special case of Dynamic Shapes, in which the first dimension is dynamic:

Figure: Mapping of a ResNet50 model with the first dimension dynamic.

You can enter 1 as the first element so that the model works with a batch size of 1, or enter any other value to process that many elements at a time:

Figure: ResNet50 with a batch size of 10.

Adjust your pre-processor and post-processor to send and receive the correct batch size.
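As a rough illustration of that adjustment, the Python sketch below resolves a dynamic first dimension to a concrete batch size, pads a short batch in the pre-processor, and trims the padding in the post-processor. The shape `(None, 3, 224, 224)` is an assumption for this example (take the real one from your model's mapping), and `resolve_shape`, `pad_batch`, and `trim_outputs` are hypothetical helper names.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed for illustration: (batch, channels, height, width). None marks the
# dynamic first dimension; the real shape comes from your model's mapping.
INPUT_SHAPE: Tuple[Optional[int], ...] = (None, 3, 224, 224)


def resolve_shape(shape: Sequence[Optional[int]], batch_size: int) -> Tuple[int, ...]:
    """Replace the dynamic first dimension with a concrete batch size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if shape[0] is not None:
        raise ValueError("the first dimension is not dynamic")
    return (batch_size, *shape[1:])


def pad_batch(
    items: Sequence[object], batch_size: int, filler: object
) -> Tuple[List[object], int]:
    """Pre-processor step: pad a short batch up to the mapped batch size.

    Returns the padded batch and the number of real items it contains.
    """
    if len(items) > batch_size:
        raise ValueError("too many items for one batch")
    padded = list(items) + [filler] * (batch_size - len(items))
    return padded, len(items)


def trim_outputs(outputs: Sequence[object], real_count: int) -> List[object]:
    """Post-processor step: drop the outputs that belong to padding."""
    return list(outputs[:real_count])


batch_shape = resolve_shape(INPUT_SHAPE, 10)
```

Padding is only needed when the number of items does not fill the configured batch size. If every request always carries exactly that many items, the pre-processor can skip it.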