Excel Importer

Last modified: November 19, 2024

Introduction

The Excel Importer module enables you to import Excel data sheets into your Mendix application. You can configure a mapping between your sheet and domain model and automatically create or update your objects.

Typical Use Cases

The typical usage scenario is importing Excel files generated by other applications into Mendix.

Features

  • Create mappings between Excel columns and object attributes
  • Create and set object references
  • Import Excel sheets according to a mapping
  • Support for custom parsing of column data (for example, adding or removing prefixes)
  • Support for non-persistable objects
  • Support reading data from macro-enabled Excel files (XLSM files)
  • Object events are executed when importing
  • The template import/export functionality now includes the field’s type and mask

Limitations

  • Currently, we only support importing Microsoft Excel-based Excel sheets and not other variants like Google sheets, Oracle SQL Developer export sheets, etc.
  • Importing Excel dates as Mendix DateTime attributes may throw an exception based on the way Excel stores dates – most date formats should work, but some date notations are not supported
  • Synchronizing objects with a “self-reference” can create duplicate records

Prerequisites

  • Install the Mx Model Reflection module in order to configure the mapping between the Excel columns and the domain model
  • Install the Community Commons Function Library module to ensure BigDecimal values can be imported
  • Remove older versions of replication.jar file (supplied in the .mpk file) before installing this module
    • If you have an old replication.jar file and you were using the Excel Importer, update the Database Replication module as well, because you need version 3.0 or higher when using this module
    • After updating, you do not need to remove the library

Configuration

After importing the module, connect the Templates_Overview or ExcelImporter.ExcelImportOverview page to your application. This is the starting place for defining Excel mappings. Create a new template from scratch or based on an Excel file (which creates all the columns for you). Create your own Excel upload function and your own import microflow, which should call the Java import start action in the module.

To turn on external entity processing, add the /ExcelImporter//Export/Import Configuration//EnableExternalEntities constant to your configuration and set it to true. This feature is turned off by default for security reasons.

To turn on the parsing of empty cells when a parse with microflow is selected, add ExcelImporter.ParseEmptyCells to the configuration constants and set the value to true. This flag is disabled by default for compatibility reasons.

Template Properties

The Excel Importer module has many different properties. The sections below present details on all the properties that can be selected and what their impact is on the actual import.

General Import Properties

These properties control the global behavior of the import.

Autonumber, Name and Description

These fields are intended to make it easier for you to see what the import does. There are no special restrictions or validations for these fields.

Sheet Nr

Specify the sheet number here from your Excel workbook that you want to import. This is a required field and should contain an Integer. The sheet numbers start counting at 1.

Header Row Nr

This is the row number of the row with the column titles. If you do not have any column titles, just enter any row number that contains the values you want to see in the column objects. The row numbers correspond with the row numbers in Excel.

Import from Row Nr

This is the row number of the first row with actual data that should be imported. The row number corresponds with the row numbers in Excel.

Mendix Object

Each row in the Excel sheet should correspond to a Mendix object. Here you can select the type of object to which you would like to map each row.

Reference to Import Objects

Sometimes you want each imported object to have a reference to one existing object. For example, you want each object to reference the FileObject from which it was created. Set this field to the reference you want to fill this way. When you call the import Java action, the selected reference will be set with the object that you can provide as a parameter in the action.

Import Action

You can specify how you want to import your objects in this property.

The following options are available:

  • Synchronize objects – updates objects that already exist with the values in the Excel sheet; objects that cannot be found are created, and references to other objects are treated the same way
  • Synchronize only existing objects – updates objects that already exist, but does not create any new objects
  • Create an object for each row – creates a new object for each row in the sheet; it does not check whether objects already exist.
  • Only create new objects – searches for any objects with the defined key; if those objects are not found, it will create a new object; if an object is found, it will be skipped

Tab Page Columns

You can specify how to map each Excel column to your Mendix domain model. For each column, you can set the following properties.

Data Source

This property specifies what is going to be processed, which is the value of a specific cell of an actual document property. Which column and how are configured in the additional properties.

The following options are available:

  • Cell value – processes the value of a specific cell
  • Document property – row number – uses the row number (this value is set in each instance that is created)
  • Document property – sheet number – uses the sheet number (this value is set in each instance that is created)

Number and Caption

These fields show what Excel column the mapping is for. The number corresponds to an Excel column and starts at 0. You can use the caption to describe what the column contains. This caption does not need to correspond to the header in the sheet.

Type

Use this field to indicate what to map the column to. You can choose nothing, an object attribute, or an object reference. Depending on your choice, different subsequent options will be available.

Key

This field determines whether the attribute can uniquely identify its object.

The following options are available:

  • No – the attribute is imported without any special handling
  • Yes” or “Yes, only for the main object – the field is a reference, but it is a key value for the main object in the import mapping
  • Yes, only for the associated object – the field is also a key for the reference
  • Yes, for both – the field is a key for both the mapped object and the referenced object

Case Sensitive

This determines whether values in this column should be interpreted as case sensitive. Keep in mind this is only possible if the database you are using is case sensitive as well. For example, if you are using MS SQL, you cannot use case sensitivity, as Microsoft does not search case-sensitive by default.

Parse With

You can use a custom microflow to parse data for a column. Such a microflow should have one parameter of the same type as the column in the Excel. This will receive the data field from Excel. The microflow should return the value to put in the field you want to map (meaning, a DateTime if you are mapping to a DateTime attribute). This does not apply to numbers (Integer, Long, Decimal), because Excel reads the raw data as Decimal and hence the Parsing microflow should always return Decimal. The eventual conversion of this returned Decimal value into either Integer, Long or Decimal will be done by the module.

This field also has quicksearch functionality. Type in the box to the left of the reference selector and press Tab to search for the value you typed.

By default, the selected microflow is not applied to empty cells. The parsing of empty cells can be enabled by setting the ExcelImporter.ParseEmptyCells constant to true.

Attribute

This field is the name of the attribute to which you map the column.

This field has quicksearch functionality. Type in the box to the left of the reference selector and press Tab to search for the value you typed.

Reference and Object Type

These fields are used to configure referenced objects. You can specify what reference should be followed and what type of object is used.

Both fields have a quicksearch box.

Tab Page References

For each reference that is used in the mapping, you can configure what should happen if a referenced object is not found.

The following options are available:

  • Create if not found – if the referenced object is not found, creates a new one with the values found in the database
  • Ignore if not found – if the referenced object is not found, the reference will not be set and no new object will be created
  • Create an object for each row – creates a new object for each row in the sheet; does not check whether objects already exist; references between objects are not set
  • Only create new objects – searches for any objects with the defined key; if those objects are not found, it creates a new object; if the object is found, it is skipped

Data Handling

Like in Studio Pro, the value of a reference (set) can be added to the current values or be overwritten. This property specifies which of those actions is used.

  • Overwrite – always uses the contents of the sheet to overwrite the current reference value; in the case of an empty cell, the reference will be set to empty as well
  • Append – add the new found associated objects to the reference (set); in the case of an empty cell, the reference remains unchanged

Keep track of all the object keys in the association that could not be found.

Commit Unchanged Objects

Even if there are not any changes to the object, this property still commits the objects in order to execute the events.

Tab Page Settings

The following settings control the behavior of the mapping as a whole.

These control which kinds of messages are printed on the application console.

Remove Unsynced Objects

This option causes objects that are not found in the imported database to be removed from your Mendix database. In other words, objects that are in your Mendix database that do not correspond to any imported database rows (anymore) are deleted. You can use this option if, for example, you want to mirror any changes to a target database and not just import new data.

This functionality requires your Mendix objects to have an Integer attribute that is used to keep track of which objects to remove. You can specify which attribute to use for this in the selector marked Attribute.

If this option is used, objects created by your application (meaning, not imported) will be removed on the next import.

Ignore Objects with Empty Keys

This means empty values for key fields are not considered valid.

Clearing this property allows objects with empty key values to exist.

Reset Empty Associations

If there is an empty value found for the associations, this property determines if the reference should be reset.

Commit Unchanged Objects

Even if there are not any changes to the object, you can still commit the objects in order to execute the events via this property.

Frequently Asked Questions

A Column in Excel Shows Only Integers, but When I Import These Fields to a String Attribute, Why Do All the Values Have 2 Decimals?

Excel stores all the numeric values as floats. The library reads all the values from the column and reads the values that are stored (with the two Decimals) and not the values that are shown (without Decimals).

You can fix this by creating a microflow with a string input and a create variable activity. The expression should be something like this: substring( $InputVariable, 0, find($InputVariable, '.')).

Why Are None of the Referenced Objects I Want to Import Created?

You should check the second tab page in your template. The most likely reason for this problem is that you forgot to configure the reference handling. Check if the reference is set to Create if not found.

How Can I Map an Attribute That Is Accessible Through Multiple References (for example, Person > Account > Group.name)?

This is not currently possible. If you want to do this, first create a separate mapping for the Excel file that just imports the “deeper” objects (for example, “Account” and “Group”). Then you can refer to existing objects in the database using key values.

How Do I Map Reference Sets?

This depends on how they are represented in you Excel file. If each element of your reference set has its own row in your sheet, the straightforward mapping should do the right thing. Set the mapping type of the reference columns to “reference.” Set the key type of the columns that form the key values to “Yes, only for the associated object." The mapping will now automatically make sure no duplicate objects are created.

This is an example of this situation for (User N – 0 Group):

Person (Key for Person Object) Group (Key for Group Reference)
John Users
John Admin
John Manager
Jack Users
Jack Manager

If the reference set is represented by multiple column groups (object entries) on each Excel row, you can select the above mapping for each of the column groups. So, you should have multiple columns mapped to the same object reference and attributes. In this case, you should make sure that the key attributes are in the first columns for each reference; otherwise, the import will not be correct.

This is an example of this situation for (User N – 0 Group):

Person Group1 (Key for Group) Group2 (Key for Group) Group3 (Key for Group)
John Users Admin Manager
Jack Users Manager

I Have Some Data at the Bottom of My Sheet That Is a Summary, and the Importer Does Not Accept This. Can I Tell It to Stop at a Certain Row?

You cannot currently do this, because the number of valid data rows in a sheet cannot be determined in advance. Please make sure your Excel sheet does not contain such rows in order to keep the import from failing.

How Can I Map Inheritance, for Example, to Create a Different Sub-Type of a Generalization Based on a Value in a Column?

The Excel importer cannot do this automatically. To do this, map your sheet to temporary objects containing the fields you need to determine the sub-type. Then, use a microflow to convert each of these to the proper type.

Why Does Mapping a Date to a MendixDateTime Attribute Sometimes Change the Value or Throw an Exception?

This is because Excel had an historical workaround to maintain compatibility with Excel on Mac computers. Excel stores the dates as floats, counting the days either from 1-1-1900 or (when using compatibility) from 1-1-1904. This is a difference of 1462 days. For now, only the dates stored using the 1900 format are supported. Excel cannot actually store dates before 1900, so it could happen that those dates throw an exception. If you find an Excel sheet with any of these cases, please contact Mendix Support, as we might be able to fix this problem.

My Import Does Not Import My Column Value, or an Exception Is Thrown While Processing the Sheet. How Can I Locate the Problem?

There are several log nodes available in the Excel Importer and Database Replication modules. These nodes support several levels (info, debug, trace). When you want to find out where the problem is, change the log levels to debug or trace. This way, you will see all the actions the Excel Importer performs on the objects.

The Excel Importer module uses the following log nodes:

  • ExcelImporter
  • ValueParser
  • Replication_MetaInfo
  • Replication_MetaInfo_AssociatedObjects
  • Replication_MetaInfo_MainObject

When Importing Larger Excel Files, the JVM Crashes (java.lang.OutOfMemoryError: Java heap space). How Can I Work Around This Problem?

Generally, the OutOfMemory Error indicates that the system was unable to create new objects on the heap. While processing large Excel files, you may encounter this error and eventually a JVM crash. To work around this issue, take either of the following actions:

  • Slice the input data into smaller, more manageable chunks and import those files individually.
  • If slicing the data is not an option, increase your container size (and resources).

Our internal tests indicate that JVM’s memory footprint peaks at 3 GB while importing an Excel file of ~9MB; our test file had ~25,000 rows, and each row combined 40 columns in a single entity.

The example above reflects JVM’s memory (RAM), not your Mendix container’s memory. Furthermore, the memory requirements will increase if associations are also getting processed as part of an Excel import, so the numbers above should be viewed as a basic guideline of the upper limit. Mendix recommends leaving a suitable buffer of extra memory while sizing your container.

After Upgrading to the Latest Version of Excel Importer, userlib Contains Two Files of the same JAR. How Should I Resolve This?

It is possible that you may see, for example, commons-io.commons-io-2.11.0.jar and commons-io-2.11.0.jar in the userlib folder after upgrading. This is because the naming convention of the JARs is aligned to just have the file name. In such cases, you can safely delete the JAR which uses the package prefix (in this example’s case, delete commons-io.commons-io-2.11.0.jar).

Read More