Reducing Deployment Downtime
Introduction
Kubernetes allows you to update an app without downtime by performing a rolling update. Instead of stopping an app and then starting it with an updated configuration, Kubernetes can replace pods (replicas) one by one.
The Mendix on Kubernetes Operator uses a recreate strategy by default. That is, the current version (configuration) of an app stops, and then the new version starts.
Starting from version 2.24.0, the Operator will automatically perform a Rolling update for any environment that meets the prerequisites:
- The environment has two or more replicas.
- The configuration update does not modify the app source code (MDA or container image).
In addition, Operator version 2.24.0 automatically assigns a PodDisruptionBudget to any environment with two or more replicas. This PodDisruptionBudget ensures that Kubernetes stops no more than one replica at a time when scaling down a cluster node or preparing an OS upgrade.
Previous versions of the Operator did not manage PodDisruptionBudgets. Instead, any manually created PodDisruptionBudget would apply to a Mendix app.
If you have manually created a PodDisruptionBudget for an app, delete it and specify the PodDisruptionBudget parameters in the MendixApp CR instead.
Prerequisites
Prerequisites for Operator version 2.24.0 and Higher
The Operator automatically performs a Rolling update for any environment that meets the following conditions:
- The environment has two or more replicas.
- The configuration update does not modify the app source code (MDA or container image).
How the Operator Chooses a Deployment Strategy
If any of the following conditions is true, the Operator always uses a Recreate strategy, performing a full stop of all of the app's replicas:
- There are app pods that are running a different (older) version of the app image, that is, the app MDA or base OS image has changed.
- The app environment has one replica.
Otherwise, the Operator performs a Rolling update automatically.
As a Rolling strategy can run multiple versions of the app at the same time, requests from the browser must be routed to a matching app version (that is, an app that has the same microflow or nanoflow parameters). The Operator uses Kubernetes service labels to perform an atomic switch that instantly moves all clients to the updated version. This is done automatically once the number of updated replicas reaches a certain threshold. By default, the threshold is 50% of all replicas. The value is specified in the switchoverThreshold parameter.
Use Cases
Whether a change can be performed without downtime depends on the type of the change. For example, the following changes can be done without downtime:
- Changing app constants, MxAdmin password or debugger settings
- Changing environment variables, Runtime or Java options
- Changing Runtime Metrics settings
- Upgrading the Mendix Operator version
The following changes will cause a full restart and downtime:
- Any changes that cause a modified MDA file
- Rebuilding the same MDA version with a different base image version (e.g. switching to another Java version or installing the latest CVE patches)
Configuring the Deployment Strategy Parameters in Standalone Environments
To reduce deployment downtime, add the deploymentStrategy section to your MendixApp CR, as in the following example:
apiVersion: privatecloud.mendix.com/v1alpha1
kind: MendixApp
metadata:
# ...
# omitted lines for brevity
# ...
spec:
  # ...
  # omitted lines for brevity
  # ...
  # Add or update this section:
  deploymentStrategy:
    switchoverThreshold: 50%
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 50%
For more information on the MendixApp CR, see Editing CR.
You can specify the following options:
- switchoverThreshold – Specifies a threshold of updated, ready replicas after which all clients should switch to the updated version. The threshold can be a percentage or an absolute value. For example, setting this to 50% will switch all clients to the updated app version once 50% of all replicas are running the updated version. If not otherwise specified, 50% is used as the default value. This option is only used if the strategy type is set to PreferRolling.
- rollingUpdate – Specifies parameters for rolling updates if the Operator is able to perform the update without a restart. These parameters are used as Kubernetes rollingUpdate parameters:
  - maxSurge – Specifies an absolute or percentage value for how many additional replicas can be added during the deployment process. The default value of 0 means that no additional replicas are added during the rollout process; instead, existing replicas are stopped to avoid using additional cluster resources.
  - maxUnavailable – Specifies an absolute or percentage value for how many replicas can be stopped to be replaced with updated versions during the rollout process. The default value of 1 means that at most one replica is stopped at a time during the update process. Increasing this value speeds up the rollout process, but can cause performance issues.
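As an illustration of how these options interact, the following sketch uses absolute values instead of percentages. It assumes a hypothetical environment with four replicas; the replica count and the chosen values are assumptions made for the example, not recommended settings.

```yaml
# Fragment of a MendixApp CR; illustrative values for a hypothetical 4-replica environment
spec:
  deploymentStrategy:
    # Switch all clients to the updated version once 2 updated replicas are ready
    switchoverThreshold: 2
    rollingUpdate:
      # Allow at most 1 additional replica to be added during the rollout
      maxSurge: 1
      # Stop at most 1 existing replica at a time
      maxUnavailable: 1
```

With these values, the rollout can temporarily run up to five replicas, and at least three replicas (counting both the old and the updated version) stay running while pods are replaced.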
 
Configuring Pod Disruption Budget Parameters in Standalone Environments
Kubernetes can stop an app's pods when it needs to stop a node (for example, to scale down and consolidate apps onto fewer nodes) or to perform a node update (for example, to install CVE patches on the host OS). Starting from Mendix Operator version 2.24.0, you can specify parameters for a PodDisruptionBudget of an app to ensure that Kubernetes only stops a limited number of an app's pods and, if necessary, waits for replacement pods to become available.
To manually configure parameters for a PodDisruptionBudget, add the podDisruptionBudget section to your MendixApp CR, as in the following example:
apiVersion: privatecloud.mendix.com/v1alpha1
kind: MendixApp
metadata:
# ...
# omitted lines for brevity
# ...
spec:
  # ...
  # omitted lines for brevity
  # ...
  # Add or update this section:
  podDisruptionBudget:
    # Kubernetes doesn't allow specifying both maxUnavailable and minAvailable at the same time:
    # https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
    maxUnavailable: 1 # Example: do not disrupt more than 1 pod at the same time
    # minAvailable: 50% # Example: make sure that at least 50% of pods are available
You can specify the following options:
- maxUnavailable – Specifies an absolute or percentage value for how many replicas can be stopped if Kubernetes needs to evict them from a node. The default 1 value means that at most 1 replica can be stopped, and that Kubernetes needs to wait until a replacement replica becomes available. Increasing this value speeds up the rollout process, but can cause performance issues.
- minAvailable – Specifies an absolute or percentage value for how many replicas need to remain available if Kubernetes needs to evict pods from a node. Increasing this value slows down the rollout process, but ensures that fewer replicas can be disrupted.
Kubernetes does not allow specifying both maxUnavailable and minAvailable; specifying values for both of them will result in an error.
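If guaranteeing a minimum number of available pods matters more than capping the number of disrupted pods, the same section can use minAvailable on its own. The following fragment is a sketch only; the 50% value is taken from the commented-out line in the example above and is not a recommendation.

```yaml
# Fragment of a MendixApp CR; illustrative value only
spec:
  podDisruptionBudget:
    # Keep at least 50% of the app's pods available when Kubernetes evicts pods from a node.
    # maxUnavailable is intentionally omitted, because only one of the two options can be set.
    minAvailable: 50%
```

For a hypothetical environment with four replicas, this setting lets Kubernetes evict at most two pods at the same time.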
Limitations
- This feature is only supported by Mendix Operator version 2.24.0 and later. Mendix Operator versions 2.20.0 to 2.23.1 included an experimental implementation of this feature; upgrading to 2.24.0 or later is highly recommended.
- Deploying a new version of the app will cause downtime if there are any changes in the app MDA or the base OS image.
- To ensure that scheduled events are correctly synchronized at startup, it is recommended to use Mendix 10.20 or later.