Regular Expressions

Last modified: February 13, 2024

1 Introduction

A regular expression resource document is used in the validation rules of an entity to describe a set of criteria that a string must match.

A regular expression has the properties described below.

2 Common

2.1 Name

The name can be used to refer to the regular expression from a validation rule of an entity.

2.2 Documentation

This is for documentation purposes only; it is not visible in the end-user application that you are modeling.

3 Expression

The expression defines the criteria that a string is checked against, written in a formal, internationally standardized regular expression language.

The following sections give a summary of regular expressions that can be used in Mendix. This description also applies to regular expression strings used in functions such as isMatch().

3.1 Subexpressions

A regular expression consists of a sequence of subexpressions. A string matches a regular expression if all parts of the string match these subexpressions in the same order.
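
The whole-string rule described above can be illustrated with a minimal sketch. It uses Python's re module as a stand-in, which is an assumption: the engine behind Mendix may differ in details. The helper name matches_fully is hypothetical; re.fullmatch is chosen because it succeeds only when the entire string matches.

```python
import re

def matches_fully(pattern: str, text: str) -> bool:
    """Hypothetical helper: True only if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Three subexpressions in sequence: a lowercase letter, a digit, a lowercase letter.
pattern = r"[a-z][0-9][a-z]"

print(matches_fully(pattern, "a1b"))   # True: every part matches, in order
print(matches_fully(pattern, "a1b2"))  # False: the trailing "2" is not covered
print(matches_fully(pattern, "1ab"))   # False: right characters, wrong order
```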

A regular expression can contain the following types of subexpressions:

  • [ ] – a bracket expression matches a single character that is indicated within the brackets, for example:

    • [abc] matches “a”, “b”, or “c”
    • [a-z] specifies a range which matches any lowercase letter from “a” to “z”
  • [^ ] – matches a single character that is NOT contained within the brackets, for example:

    • [^abc] matches any character other than “a”, “b”, or “c”
    • [^a-z] matches any single character that is not a lowercase letter from “a” to “z”
  • {m,n} – matches the preceding element at least m and not more than n times, for example:

    • a{3,5} matches only “aaa”, “aaaa”, and “aaaaa”
  • {n} – matches the preceding element exactly n times, for example:

    • [1-9][0-9]{3} ?[A-Za-z]{2} checks a Dutch post code: one digit from 1 to 9, three more digits, an optional space, and two letters; here [0-9]{3} is a shorter way to write [0-9][0-9][0-9]
  • . – a dot matches any single character; if you want to match a dot, you can escape it by prefixing it with a \ (backslash)

  • A literal character – this is a character that does not have a special meaning in the regular expression language and it matches itself; this is effectively any character except \[](){}^-$?*+|., for example:

    • The space in the Dutch post code example is a literal character that just matches itself
  • \w – a word character: a letter, digit, or underscore; \w is an abbreviation for [A-Za-z0-9_]

  • \d – a digit; \d is an abbreviation for [0-9]
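
The subexpression types listed above can be tried out with a short sketch. Python's re module is again used as a stand-in for illustration only, and the names matches_fully, POST_CODE, and CASES are hypothetical. The re.ASCII flag is an assumption made so that \w and \d behave as described in this section; without it, Python also accepts non-ASCII letters and digits.

```python
import re

def matches_fully(pattern: str, text: str) -> bool:
    """Hypothetical helper: True only if the whole of text matches pattern."""
    # re.ASCII restricts \w to [A-Za-z0-9_] and \d to [0-9].
    return re.fullmatch(pattern, text, flags=re.ASCII) is not None

# Dutch post code pattern taken from the list above.
POST_CODE = r"[1-9][0-9]{3} ?[A-Za-z]{2}"

# (pattern, input, expected result)
CASES = [
    (r"[abc]", "b", True),            # bracket expression
    (r"[abc]", "d", False),
    (r"[^a-z]", "A", True),           # negated bracket expression
    (r"[^a-z]", "q", False),
    (r"a{3,5}", "aaaa", True),        # bounded repetition
    (r"a{3,5}", "aa", False),
    (r"a{3,5}", "aaaaaa", False),
    (POST_CODE, "1234 AB", True),
    (POST_CODE, "1234ab", True),      # the space is optional
    (POST_CODE, "0123 AB", False),    # first digit must be 1-9
    (r"a.c", "a-c", True),            # dot matches any single character
    (r"a\.c", "a.c", True),           # escaped dot matches only a dot
    (r"a\.c", "a-c", False),
    (r"\w\d", "_7", True),            # word character followed by a digit
    (r"\w\d", "-7", False),
]

for pattern, text, expected in CASES:
    result = matches_fully(pattern, text)
    assert result == expected, (pattern, text)
    print(f"{pattern!r:32} {text!r:12} {result}")
```

Each case pairs a pattern with one input and the result that the descriptions above predict, so a failing assertion would point at a mismatch between this stand-in engine and the text.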

3.2 Quantifiers

The number of times that a subexpression may occur in a string is indicated by a quantifier after the subexpression. If no quantifier is present, the subexpression must occur exactly once.

The following quantifiers can be used:

Quantifier    Description
?             The preceding subexpression occurs zero times or once.
*             The preceding subexpression occurs zero or more times.
+             The preceding subexpression occurs one or more times.
(none)        The preceding subexpression occurs exactly once.
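
As a rough illustration of the quantifiers above, the following sketch applies each one to the letter “b” between an “a” and a “c”. It assumes Python's re module as a stand-in engine; matches_fully and EXAMPLES are hypothetical names introduced for this example.

```python
import re

def matches_fully(pattern: str, text: str) -> bool:
    """Hypothetical helper: True only if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# pattern -> (strings that should match, strings that should not match)
EXAMPLES = {
    r"ab?c": (["ac", "abc"], ["abbc"]),           # ? : zero times or once
    r"ab*c": (["ac", "abc", "abbbc"], ["adc"]),   # * : zero or more times
    r"ab+c": (["abc", "abbbc"], ["ac"]),          # + : one or more times
    r"abc":  (["abc"], ["ac", "abbc"]),           # no quantifier: exactly once
}

for pattern, (accepted, rejected) in EXAMPLES.items():
    assert all(matches_fully(pattern, s) for s in accepted)
    assert not any(matches_fully(pattern, s) for s in rejected)
    print(pattern, "accepts", accepted, "rejects", rejected)
```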
