Skip to main content
Skip table of contents

Expressions

Overview

Expressions are equations that analyze data from one or more fields. Expressions can perform calculations, manipulate text, or test data. The resulting value can then be assigned to a field, used in another expression, or used to direct the flow of data through the Data Management project.

Data Management has a rich expression language and a set of useful functions that makes it easy for you to manipulate your data. Expressions are used in the Calculate tool to assign new values to fields, and in the Filter tool to evaluate which records to keep.

The expression editor's autocomplete feature displays a pop-up list of fields, variables, and functions as you type.

If you calculate or assign a field value that does not match the field's defined type, the result will be an error value. An expression producing an invalid result (such as a number too large to fit in the field) will also appear as an error value.

An error value viewed in the Data Viewer tool's output looks like <<Error>>. When an error value is converted to an output file or database it is changed to a Null or blank value. (Most data formats do not have an explicit Error representation.)

Use the function IsValid to check that values you want to calculate with are valid values. You can also replace invalid values with a default value using a conditional expression.

For example: IF IsValid(FIELD) THEN FIELD ELSE 0 ENDIF

Functions like Trim strip off blanks, sometimes making the resulting value shorter than the field to which it is assigned. The length of text expressions varies depends on the expression and the values. However, once you assign an expression result to a fixed-length field, the value is truncated to the field size and padded with blanks.

Fields used in functions must be of the correct type. For example, a text field cannot be directly passed to a function like Abs which expects a numeric value. However, you can use a conversion function to force the correct type: Abs(AsInteger(LONGITUDE))

You can also use the Calculate tool to create a new field of the desired type, assigning its value from the old field, thereby changing the type of the value.

In the absence of explicit type information, Data Management’s expression language attempts to infer a token’s type. When it encounters a numeric value containing a decimal point, it concludes that value is a double suitable for floating-point arithmetic. That might yield rounding errors in the number’s least significant digit. If fixed-point precision is required (such as for currency calculations), first express the value as a string (e.g. 9999999999999.0). You can assign that string directly to a Decimal field or use it in an expression after passing it through the function AsDecimal.

Syntax overview

Expressions calculate values in a specific order known as the syntax. The syntax of the expression describes the process of the calculation. It includes the elements to be calculated—the operands—and the calculation operators. Each operand can be a value that does not change (a constant value) or an identifier (a field reference, label, or name). Operators can perform arithmetic, comparison, textual, and logical operations. You can also use predefined expressions (called functions) to perform a variety of useful calculations and transformation. Additionally, the Data Management expression language supports IF...THEN...ELSE statements, which you can use to create conditional expressions.

You create an expression by combining operators and operands to produce the result you want. For example, the following expression concatenates three fields into a single field, separating the name components with spaces:

First_Name + " " + Middle_Name + " " + Last_Name

Expressions can be used to perform more complex data transformations. The expression below translates ranges of a numeric value into names representing those ranges:

CODE
IF X >= 100 AND X <= 1000 THEN
  "SMALL"
 ELSEIF X >= 1001 AND X <= 5000 THEN
  "MEDIUM"
 ELSEIF X >= 5001 AND X <= 10000 THEN
  "LARGE"
 ELSE
 "UNKNOWN"
ENDIF

Expressions can also be used to select or filter records. The following expression chooses all records with dates falling on the second Tuesday of any month:

Day(DATE) >= 8 AND Day(DATE) <= 14 AND DayOfWeek(DATE) = 3

Syntax basics

Data Management's expression language consists of many different elements, each with a simple set of rules that you must follow to write correct expressions. This set of rules is known as the syntax of the expression language.

Basic elements

The first level of syntax consists of elements that you are probably already familiar with—strings, numbers, and fields:

Element

Description

Examples

String

A textual string surrounded by double-quotes.

Certain special character sequences are interpreted, even when embedded in double-quotes:

\r carriage return
\n line feed
\t tab
\\ single backslash "\"

"Data Management"

Number

A series of digits with an optional sign, decimal point, and exponent

3.14159

–100.00

1.2345e+10

Field

The name of the field. The field name must be surrounded by [ ] if it contains spaces or punctuation, or does not start with a letter. The value in an expression comes from the field in the current record.

ADDRESS

[FIRST NAME]

Functions

Data Management has a set of built-in functions that are used to perform common tasks. They include string manipulation, pattern-matching, mathematics, and date/time processing. To invoke a function, you type the name of the function, followed by a list of arguments enclosed in parentheses. In general you call a function using a syntax likeFunctionName(argument1, argument2, ...) where the number and type of arguments is determined by the function being called. Every function performs its specific computation on its arguments and returns a result value. For example, in: Substring(NAME, 1, 4) the function Substring requires that the first argument is text, and the second and third arguments are numbers. It extracts a substring from the first argument and returns it.

You can think of function calls as a placeholder for the results—after the function completes its work, the function call is "replaced" in the expression by its result, and the expression evaluation continues using that result in the surrounding expression. For example, if the value of the NAME field is "John Doe" and you use the expression Substring(NAME, 1, 4) + "X" the expression evaluator will first process the Substring function and return "John." It will then evaluate "John" + "X" and yield "JohnX."

You can use functions as the arguments to other functions. For example: IsLike( Substring(NAME, 1, 2), "%X") first extracts the first two characters from NAME, and then checks to see if that two-character string ends with an "X."

Operators

Data Management supports the arithmetic, string, and comparison operators that you are familiar with:

Operators

Meaning

Examples

+ – * / %

Arithmetic operations on numeric values

3 + 4

1000 * SALARY

+

Concatenation of text

"John" + NAME

AND, OR

Logical disjunction and conjunction

SALARY > 50000 AND SALARY < 100000

= !=

Equal or not-equal

NAME != "John"

< <= >

Relational comparison

NAME < "M"

See the operators reference for details of usage and the types that can be used.

IF/THEN/ELSE/ENDIF

Data Management supports conditional expressions using IF/THEN/ELSE/ENDIF. In Data Management, you use IF/THEN/ELSE/ENDIF as a "value selector." This contrasts with programming languages such as Visual Basic, which use conditionals to direct the flow of execution of a program.

Think of IF/THEN/ELSE/ENDIF as a means to select one of many sub-expressions as values. For example:

CODE
IF INCOME < 10000 THEN "A"
ELSE "B"
ENDIF

This results in one of two values:

  • If INCOME is <100000, use “A”

  • If INCOME is >=100000, use “B”

Some examples of how the INCOME field relates to the result value:

Value of INCOME

Result value

7500

"A"

10000

"B"

2000

"A"

122500

"B"

You can extend IF/THEN/ELSE/ENDIF to select more than two result values, by using ELSEIF. Add an extra ELSEIF clause between the THEN and ELSE for each additional value you want. For example:

CODE
IF INCOME < 10000 THEN "A"
ELSEIF INCOME < 25000 THEN "B
ELSEIF INCOME < 50000 THEN "C"
ELSE "D"
ENDIF

Some examples of how the INCOME field relates to the result value:

Value of INCOME

Result value

40000

"C"

100000

"D"

0

"A"

9999

"A"

Note that the clauses of IF/THEN/ELSEIF/ELSE/ENDIF are evaluated in order, so the result for the first true condition encountered is used, regardless of the rest of the conditions. The ELSE result value is only used if all the conditions are false. For example:

CODE
IF INCOME < 100000 THEN "A"
ELSEIF INCOME < 10000 THEN "B"
ELSE "C"
ENDIF

This will never result in "B", because the first condition is always tested first.

Operators reference

Operators specify the type of calculation that you want to perform on the elements of an expression. Data Management's expression language includes four different types of operators: arithmetic, comparison, text, and logical operators.

Arithmetic operators

Arithmetic operators perform basic mathematical operations such as addition, subtraction, or multiplication; combine numbers; and produce numeric results.

Arithmetic operator

Meaning

Example

+

Addition

2 + 2

-

Subtraction

2 - 1 - 1

-

Negation

-VALUE

*

Multiplication

2 * 2

/

Division

4 / 2

%

Modulus ( integer remainder)

103 % 10

The modulus operator (%) can only be used with two integers (fixed-point numbers with a scale of 0).

Comparison operators

Comparison operators compare two values or expressions and then produce the logical value TRUE or FALSE.

Comparison operator

Meaning

Example

=

Equal to

GENDER = "M"

>

Greater than

AGE > 21

<

Less than

INCOME < 60000

>=

Greater than or equal to

SALES >= 1000

<=

Less than or equal to

EXPENSES <= INCOME

!=

Not equal to

ADDRESS1 != ADDRESS2

Text operators

The operator "+" concatenates one or more text values to produce a single piece of text.

Text operator

Meaning

Example

+

Concatenates two values to produce one continuous text value

John + Henry (produces JohnHenry)

Logical operators

Logical operators (sometimes known as Boolean operators) operate on Boolean-valued expressions to form a new Boolean-valued expression. The AND and OR operators combine two expressions, whereas the NOT operator works on a single expression.

Logical operator

Meaning

Example

AND

Returns TRUE if left-hand and right-hand expression are both TRUE, otherwise returns FALSE

GENDER = "M" AND AGE >= 40

OR

Returns TRUE if left-hand expression is TRUE or right-hand expression is TRUE. Returns FALSE if both are FALSE.

NAME = ACME OR NAME = IBM

NOT

Returns the opposite of the following expression. Note use of parentheses; NOT is higher precedence than AND.

NOT (GENDER = "F" AND AGE > 1)

Precedence

When more than one operation occurs in an expression, each operation is performed in a predetermined order called operator precedence. Data Management evaluates expression operations in the following order:

  1. Constant values, identifiers, functions, expressions in parentheses, and IF...THEN...ELSE statements

  2. Unary minus (–)

  3. Logical inversion operations (!, NOT)

  4. Multiplication, division, and modulus operations (*, /, %)

  5. Addition and subtraction operations (+,–)

  6. Comparison operations (=, <, >, <=, >=, !=)

  7. Logical operations (AND, OR)

Examples

Expression

Equivalent to

Remarks

3 + 4 * 5

3 + (4 * 5)

* is higher precedence than +

VALUE + 2 = SIZE

(VALUE + 2) = SIZE

+ is higher precedence than =

SEX = "F" AND AGE > 40

(SEX = "F") AND (AGE > 40)

Females over 40. =,> are higher precedence than AND

NOT SEX = "F" AND AGE > 40

(NOT (SEX = "F")) AND (AGE > 40)

Males over 40. NOT is higher precedence than AND

NOT (SEX = "F" AND AGE > 40)

NOT ((SEX = "F") AND (AGE > 40))

Anyone that is not a female over 40. Explicit parentheses are required to get this behavior.

IF VALUE != 0 THEN VALUE / 10 ELSE Random(10) ENDIF

IF (VALUE != 0) THEN (VALUE / 10) ELSE (Random(10)) ENDIF

IF/THEN/ELSE/ENDIF are low-precedence, so embedded expressions need no parentheses.

If you have any doubt about the precedence-grouping of an expression, or think that someone else might not understand it, use explicit parentheses.

Expression types and validity

The type of an expression's result depends on the kind of expression and the types of the expression's inputs:

Conditional expressions

Conditional expressions produce results of the same type as the dependent arguments of their Then and Else clauses. The Then and Else dependent arguments can be either the same type, or a mixture of Integer and Float types. In the latter case, the result type is Float.

Logical expressions

Logical expressions—those using AND, OR, and NOT operators—produce results of type Boolean.

Relational expressions

Relational expressions—those comparing two values or expressions—result in Boolean values. Relational expressions can compare most types, with the restrictions noted below:

Comparison operator

Argument types

=
!=

Arguments may be any type except Binary.

Both arguments must be of the same type, except that Integers and Floats are comparable.

Elapsed time cannot be compare to other time formats.

Text values of unequal lengths can be compared; all else being equal, shorter strings come before longer ones. "JOH" comes before "JOHN" but "JON" is after "JOHN".

<
>
<=
>=

Same as above, except that arguments cannot be of type Boolean.

Additive expressions

The result of an additive expression depends on the types of its arguments. The following table shows the types that can be combined in an additive expression, which operations are valid, and the resulting type:

Left Type

Right Type

Operation

Resulting Type

Text

Text

+ (add)

Text—the concatenation of the arguments. If the arguments are fixed-length text fields, then any leading or trailing blanks contained in the arguments will also be in the result. Use Trim() to avoid that, or first convert the arguments to variable-length text fields. Note that most text fields have a limit of 100MB.

Integer

Integer

+, - (add, subtract)

Integer capable of holding all values that the Left and Right types could hold.

Integer

Float

+, - (add, subtract)

Float

Float

Integer

+, - (add, subtract)

Float

Date

Date

- (subtract)

Integer indicating days difference. Size of Integer is eight and scale is zero.

Date

Integer, Float

+, - (add, subtract)

Date—the Left date value advanced (or decremented) the given number of days. Fractional days are ignored.

Integer, Float

Date

+ (add)

Date—the Right date value advanced the given number of days. Fractional days are ignored.

Time

Time

- (subtract)

Integer—size of nine and scale of three, indicating thousandths of a second difference between the two times. If one date is Elapsed time format, both must be Elapsed time format.

Time

Integer, Float

+, - (add, subtract)

Time—the Left Time value advanced (or decremented) by the Right value in seconds. Fractional seconds are processed down to thousandths of a second precision. Non-elapsed times will "wrap around" at midnight.

Integer, Float

Time

+ (add)

Time—the Right time value advanced by the Left value in seconds. Fractional seconds are processed down to thousandths of a second precision. Non-elapsed times will "wrap around" at midnight.

DateTime

DateTime

- (subtract)

Integer—size of fifteen and scale of three, indicating thousandths of a second difference between the two date-times.

DateTime

Integer, Float

+, - (add, subtract)

DateTime—the Left DateTime value advanced (or decremented) by the Right value in seconds. Fractional seconds are processed down to thousandths of a second precision.

Integer, Float

DateTime

+ (add)

DateTime—the Right time value advanced by the Left value in seconds. Fractional seconds are processed down to thousandths of a second precision.

Multiplicative expression

The result of a multiplicative expression depends on the types of its arguments. The following table shows the types that can be combined in an multiplicative expression, which operations are valid, and the resulting type:

Left Type

Right Type

Operation

Resulting Type

Integer

Integer

* (multiply)

Integer—size and scale are the sum of the sizes and scales of the arguments, so that any possible result can be stored.

Integer

Integer

/ (divide)

Integer—size equal to the left argument's size plus the right argument's scale. Scale is equal to the left argument's scale.

Float

Integer, Float

* (multiply)

Float

Float

Integer, Float

/ (divide)

Float

Comments

Expressions may contain C++ style comments. The expression text:

CODE
"some text" /*embedded comment*/ + " and  "
// single-line comment
+ "more text " // comment to end of line
+ " and " /* multi-line comment
Follows because I have a lot to say  about this */
+ "even more text"

is equivalent to the expression:

CODE
"some text” + " and  " 
+ "more text "
+ " and " 
+ "even more text"
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.