Trends in Mendix Cloud v3


1 Introduction

To track the usage growth of your app, and to debug performance problems, the Mendix Cloud includes detailed graphs of both your app and its environment. These graphs show performance trends of your apps in the paid editions of the Mendix Platform. If you experience issues with your app, always check the Alerts and Trends in the Developer Portal under Operate.

To view the Trends, you must have permission to Access the Monitoring.

Assign permission by following these steps:

  1. Open your app in the Developer Portal.
  2. Click Security under the Settings category on the left.
  3. Go to the Node Permissions tab.
  4. Choose the environment for which you want to grant access.
  5. Check Access to Monitoring next to the name of the person to whom you want to grant this permission.

You can find the trends by following these steps:

  1. Open your app in the Developer Portal.
  2. Click Metrics under the Operate category.
  3. Select the environment you want to monitor under the tab Trends.

3.2 Interpreting the Graphs

As with all complex IT systems, there are many interrelated components which can cause performance issues. This document cannot cover all possibilities, but is intended as a general introduction to the information which is displayed and a few ideas about where to look for possible areas of concern.

3.2.1 Scales

The scales are produced automatically by the graphing software. This can lead to unexpected scales.

For example, a scale for transactions per second may show a value of 30 m. This means 30 milli-transactions per second, which is 1800 milli-transactions per minute, or roughly 2 transactions per minute (1.8 to be exact).
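The conversion in this example can be sketched in a few lines of Python. The prefix table and the function name `to_base_units` are illustrative assumptions, not part of the graphing software. The sketch only assumes that the scale labels use standard SI prefixes.

```python
# Hypothetical helper: convert an SI-prefixed scale label into base units.
# The prefix table is an assumption about how the graph labels are written.
SI_PREFIXES = {"u": 1e-6, "m": 1e-3, "": 1.0, "k": 1e3, "M": 1e6, "G": 1e9}


def to_base_units(label: str) -> float:
    """Parse a label such as '30 m' or '2 k' into a plain number."""
    parts = label.split()
    value = float(parts[0])
    prefix = parts[1] if len(parts) > 1 else ""
    return value * SI_PREFIXES[prefix]


per_second = to_base_units("30 m")  # 0.03 transactions per second
per_minute = per_second * 60        # about 1.8 transactions per minute
print(f"{per_second:.3f}/s is {per_minute:.1f}/min")
```

Reading the value per minute (or per hour) is often easier than reading fractions of a transaction per second.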

3.2.2 Disk Utilization

Disk utilization is calculated from the disk space that is in use by the user of the system. Because of operating system overhead and empty space in block size allocation, not all disk space can be allocated. As a result, the total amount of usable space is about 4% lower than the actual disk space.
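As a rough illustration of this overhead, the following Python sketch estimates the usable space for a given disk size. It assumes a flat 4% overhead, which is only the approximation given above. The function name `usable_bytes` is hypothetical.

```python
# Assumption: a flat 4% overhead, as approximated in the text.
# The real overhead depends on the file system and block size.
OVERHEAD_FRACTION = 0.04


def usable_bytes(raw_bytes: int) -> int:
    """Estimate the space that can actually be allocated on a disk."""
    return round(raw_bytes * (1 - OVERHEAD_FRACTION))


ten_gib = 10 * 1024**3
print(usable_bytes(ten_gib))  # estimated usable bytes on a 10 GiB disk
```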

3.2.3 Disk Partitions

If there is more than one disk partition in the system, the /srv partition generally contains project files and uploaded files of the application, while /var generally holds the database storage.

3.2.4 Combining Information

You can often get more information about the performance of your app by combining the information from several graphs.

For example, a combination of a moderate number of IO operations, low disk throughput, visible CPU iowait, full memory disk cache, and reports of long-running database queries in the application log could point to a shortage of system memory for disk cache, which leads to repeated random reads from disk storage.

3.2.5 More Information

If you would like some more information about how these graphs are created and further suggestions for interpretation, see Monitoring a Mendix application using Munin in the Mendix m2ee-tools GitHub repo.

4 Application Statistics

These graphs show various application-specific metrics, such as the number of HTTP requests, user sessions, JVM memory, and other application performance statistics.

The Number of handled external requests graph shows the number of requests that are sent from the client and systems that integrate with your application using web services.

The number of requests per second is split up by request handlers. The key ones are:

  • / should not list any requests, because static content is directly served to the user by the front-facing web server, which is placed between the user and this application process
  • ws/ shows the number of web service calls that were done
  • xas/ lists general queries for data in data grids, sending changes to the server and triggering the execution of microflows
  • file shows the number of file uploads and downloads

Additional information about request handlers is available in the Requests section of Monitoring Mendix Runtime and the Applying Access Restrictions to Unnecessary Request Handlers section of How To Implement Best Practices for App Security.

In the Object cache graph you can monitor the number of Mendix Objects that live in memory.

Non-persistable entities live in the JVM memory and are garbage collected regularly. If you have a memory leak, the number of objects in memory will grow over time, which can eventually cause the app to run short of memory.

The User Accounts and Login Sessions graph shows the number of logged-in named and anonymous user sessions for your application.

These are the user types:

User Type                          | Explanation
named users                        | Total number of user accounts.
concurrent named user sessions     | Total number of sessions for users using a named login.
concurrent anonymous user sessions | Total number of sessions for users who are signing in anonymously.

The JVM Object Heap graph shows the internal distribution of allocated memory inside the application process for Java objects. Java objects are created in Java actions, but also include all objects that are used by microflows running in your app at runtime.

One of the most important things to know, in order to be able to interpret the values in this graph, is that the JVM does not immediately clean up objects that are no longer in use. This graph will show unused memory as still in use until the so-called garbage collector, which analyzes the memory to free up space, is run. So, you cannot see how much of the JVM memory that is in use before a garbage collection will be available after the garbage collection cycle, because the garbage collection process will only find that out when it actually runs.

There are three sorts of space in the JVM heap, which the garbage collector treats separately to enable it to work efficiently:

  • eden space is for newly-created objects
  • survivor space is where objects are moved if the garbage collector cannot clean them out of eden space
  • tenured generation holds objects which are longer-lived

For example, if the tenured generation is shown as 65% of the complete heap size, this may change to 0% if a garbage collection is triggered when the percentage reaches two thirds of the total heap size. However, it could stay at this 65% if all data in this memory part is still referenced by running actions in the application. This behavior means that the JVM heap memory graphs are the most difficult to base conclusions on.

The JVM Process Memory Usage graph is similar to the previous graph, JVM Object Heap. It shows a more complete view of the actual size and composition of the operating system memory that is in use by the JVM process.

This graph is primarily present to provide more insight in situations where the part of the real used memory outside the JVM Object Heap is growing too much, causing problems with memory shortage in the operating system.

More information on this graph is available in a Tech Blog post: What’s in my JVM memory?

The Application node operating system memory graph shows the distribution of operating system memory that is available for this server.

The most important part of the graph is the category apps which shows the amount of memory that is continuously in use by the application process. Performance issues can arise if the apps memory takes up too large a proportion of the operating system memory or if the committed value exceeds the operating system memory.

The Threadpool for handling external requests graph shows the number of concurrent requests that are being handled by the Mendix Runtime. The requests are counted in two circumstances:

  • they are initiated by a remote API – the way the normal web-based client communicates
  • they are initiated by calling web services

Because creating a new thread that can concurrently process a request is an expensive operation, Mendix holds a pool of threads that can quickly start processing new incoming requests. This pool automatically grows and shrinks according to the number of requests that are flowing through the application.

The Total Number of Threads in the JVM Process graph shows the total number of threads that exist inside the running JVM process.

Besides the threadpool that is used for external HTTP requests, described above, this includes the threadpool used for database connections, internal processes inside the Mendix Runtime, and optional extra threads created by the application itself, for example, using a threadpool in a custom module or custom Java code.

The Application node CPU usage graph shows the CPU utilization in percentage, broken down into different types of CPU usage. Each CPU is counted as 100%, so in a multi-CPU system, the scale will be several hundred percent.
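To compare readings between environments with different numbers of CPUs, it can help to normalize the value shown on the graph. The following Python sketch is a minimal illustration. The function names are hypothetical, and the CPU count is something you need to know about your own environment.

```python
def cpu_scale_max(cpu_count: int) -> int:
    """Upper bound of the graph scale: each CPU counts as 100%."""
    return cpu_count * 100


def normalized_percent(graph_percent: float, cpu_count: int) -> float:
    """Express a graph reading as a share of total CPU capacity."""
    return graph_percent / cpu_count


# Example: a reading of 250% on a 4-CPU node.
print(cpu_scale_max(4))            # 400
print(normalized_percent(250, 4))  # 62.5
```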

The most important value here is user, which shows the amount of CPU time used for handling requests in the Mendix Runtime and executing microflows and scheduled events.

The Application node disk throughput graph shows the rate at which data which isn’t stored in the database is being read from and written to disk.

The Application node disk usage (in bytes) graph displays the absolute amount of data that is stored on disk.

The Application node disk usage (percentage) graph shows the relative amounts of data that are stored on disk.

This graph should be interpreted in combination with other graphs. See Combining Information, above.

The Application node disk IO/s graph shows the number of disk read and write operations that are done from and to disk storage. It does not show the amount of data that was transferred.

The Application node load is commonly used as a general indication of overall server load that can be monitored and alerted upon.

The load value is a composite value, calculated from a range of other measurements, as shown in the other graphs on this page. If you are investigating high server load, this graph alone is not sufficient.

This value is used in Alerts to signal that the CPU usage is not OK. A warning is issued for extended load higher than 2.8, and critical is signaled for extended load higher than 6.0.
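The thresholds above can be expressed as a simple classification. The following Python sketch is an illustration only: the function name `classify_load` is hypothetical, and the sketch ignores how long the load has to stay above a threshold before it counts as "extended", because that duration is not stated here.

```python
# Thresholds taken from the text above.
WARNING_LOAD = 2.8
CRITICAL_LOAD = 6.0


def classify_load(load: float) -> str:
    """Map a load value to the alert level it would correspond to."""
    if load > CRITICAL_LOAD:
        return "critical"
    if load > WARNING_LOAD:
        return "warning"
    return "ok"


for sample in (1.2, 3.5, 7.0):
    print(sample, classify_load(sample))
```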

The Application node disk latency graph shows the average waiting times for disk operations to complete.

Interpreting the values in this graph should be done in combination with the other disk stats graphs, and taking the types of requests into consideration. Sequential or random reads and writes can create a different burden for disk storage.

The Application node disk utilization shows the percentage of time that the disk storage is busy processing requests.

This graph should be interpreted in combination with other graphs. See Combining Information, above.

5 Database Statistics

The database statistics show the number of database queries and mutations, the total size of the database, and other performance statistics.

The Number of database queries being executed graph shows the number of database queries that are executed by your Mendix application.

The queries are broken down into queries that actually modify data (insert, update, and delete) and queries that fetch data (select).

The Database table vs. index size graph shows the distribution between disk space used for storing indexes and actual data.

Remember, indexes actually occupy memory space and disk storage, as they are just a copy of your data stored and sorted in another way! Besides the data you are processing, the relevant parts of the indexes also have to be read into system memory to be able to use them.

The Database transactions and mutations graph shows the number of database objects that were actually changed by database queries from the application.

For a single database operation that affects more than one object, this graph shows the number of objects actually changed, as measured from inside the database. However, the Number of database queries being executed graph only shows a single database query for the same operation.
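The difference between the two graphs can be illustrated with a small, self-contained Python sketch. It uses an in-memory SQLite database from the Python standard library purely for convenience, not the PostgreSQL database that Mendix Cloud uses. The table and column names are made up for the example.

```python
import sqlite3

# Stand-in database: in-memory SQLite, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("open",)] * 5)

# One query that changes several rows at once.
cur = conn.execute("UPDATE orders SET status = 'closed' WHERE status = 'open'")
queries_executed = 1           # what a "queries" graph would count
objects_changed = cur.rowcount  # what a "mutations" graph would count

print(queries_executed, objects_changed)
conn.close()
```

A single update statement counts as one query, while the number of changed rows is reported separately. This is why the two graphs can diverge sharply for batch operations.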

The Number of database connections graph shows the number of connections to the PostgreSQL server.

This should go up and down with the usage of the application. The number of connections is limited to 50.

The connections are categorized as follows:

Connection Type     | Description
active              | a microflow or client xpath request is using the database right now
idle in transaction | the connection is in use by a microflow, but it is currently executing a microflow activity that is not using the database
idle                | the connection is open and available to quickly allocate to a microflow or xpath request that needs it
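When reading this graph, it can be useful to think of the three categories as a tally against the connection limit. The following Python sketch shows that bookkeeping on a hypothetical list of connection states. The function name `summarize_connections` and the sample data are assumptions for illustration, and the sketch does not query a real database.

```python
from collections import Counter

MAX_CONNECTIONS = 50  # limit stated in the text above
STATES = ("active", "idle in transaction", "idle")


def summarize_connections(states):
    """Count connections per category and compute the remaining headroom."""
    counts = Counter(states)
    summary = {state: counts.get(state, 0) for state in STATES}
    summary["total"] = sum(summary[state] for state in STATES)
    summary["headroom"] = MAX_CONNECTIONS - summary["total"]
    return summary


# Hypothetical snapshot of connection states.
sample = ["active"] * 3 + ["idle in transaction"] * 2 + ["idle"] * 10
print(summarize_connections(sample))
```

A high share of connections in the idle in transaction category means that microflows are holding connections while doing work that does not use the database.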

The Database node operating system memory graph shows the distribution of operating system memory that is available for this server.

The most important values on this graph are cache and apps.

The cache values show the memory used to hold parts of the database that have been read from disk earlier. It is crucial to the performance of an application that parts of the database data and indexes that are referenced a lot are always available in the working memory of the server, in the cache. A lack of disk cache on a busy application will result in continuous re-reads of data from disk, which takes several orders of magnitude more time, slowing down the entire application. This may indicate that you have a large number of concurrent database connections from your app and that the environment is not large enough to support these.

The apps values show the amount of memory allocated to the database server (postgresql) to perform database queries.

The Database node CPU usage graph shows the amount of CPU usage in percentage, broken down into different types of CPU usage. Each CPU is counted as 100%, so in a multi-CPU system, the scale will be several hundred percent.

The most important values in here are: user, which shows the CPU time used for running database queries, and iowait, showing the length of time a CPU core is idle and waiting for disk operations to finish (for example, waiting for information that has to be read from disk, or waiting for a synchronous write operation to finish).

Clearly visible amounts of iowait, in combination with a high number of disk read operations (Database Node Disk I/Os), and having all free system memory filled as disk cache (Database Node Operating System Memory), are a sign of a lack of available server memory for use as disk cache. This situation will slow down database operations tremendously, because getting data from disk over and over again takes considerably longer than having it present in memory.

The Database node disk throughput graph shows the amount of data that is being read from and written to disk.

If you see large values here which do not immediately drop back again, it may indicate that your app is continually swapping data to disk. This could be caused by inefficient queries, for example ones which require sorting within the app.

The Database node disk usage (in bytes) graph displays the absolute amount of data that is stored on disk.

The Database node disk usage (percentage) graph shows the relative amounts of data that are stored on disk.

This graph should be interpreted in combination with other graphs. See Combining Information, above.

The Database node disk IO/s graph shows the number of disk read and write operations that are done from and to the disk storage. It does not show the amount of data that was transferred.

The Database node load is commonly used as a general indication of overall server load that can be monitored and alerted upon.

The load value is a composite value, calculated from a range of other measurements, as shown in the other graphs on this page. If you are investigating high server load, this graph alone is not sufficient.

The Database node disk latency graph shows the average waiting times for disk operations to complete.

Interpreting the values in this graph should be done in combination with the other disk stats graphs, together with the type of requests that were made. Sequential or random reads and writes can create a different burden for disk storage.

The Database node disk utilization graph shows the percentage of time that the disk storage is busy processing requests.

This graph should be interpreted in combination with other graphs. See Combining Information, above.

6 Read More