Table Lookup
Overview
The Table Lookup tool performs a find-and-extract function, searching for input keys in a lookup table. When a key is found, the corresponding result value is output. This is similar to finding someone's phone number in a phone book. You start with the person's name (the key), and search the phone book for that entry. When you find the entry for that person's name, you extract the phone number (the result). The Table Lookup tool works in much the same way, except that it finds and outputs a result value for every record read from its data input.
The Table Lookup tool is useful for:
Data cleansing by replacing common variants of a field's value. This is often a problem in files that are the result of unvalidated data entry or consolidation of multiple databases. For example, you could replace "grey" with "gray." (see Enforcing consistent field values).
Finding all records with field values that are found in a reference table (see Finding all records matching a table of values).
The Table Lookup tool requires two inputs: a data input (labeled "D" on the tool icon), and a lookup table input (labeled "T" on the tool icon). The data input contains records with key values for which you want the corresponding result value. The lookup table input contains key/result pairs or combinations. Each line or record in the table contains the key value followed by one or more result values.
When the project is run, the Table Lookup tool reads one record at a time from the data input and attempts to find the record's key field in the lookup table. If the key is found, the corresponding result value is output to the designated result field. If the key is not found, the result of evaluating the default expression is output instead.
The Table Lookup tool only accepts a lookup table where all key and result values are smaller than 65535 bytes.
Table Lookup tool configuration parameters
The Table Lookup tool has one set of configuration parameters in addition to the standard execution options.
Parameter | Description |
---|---|
Match type | Specifies the kind of lookup/replace to perform. Can be one of Entire Field Only, At Front Only, At End Only, or Anywhere. This is optional and defaults to Entire Field Only. |
Whole words only | If selected, matches only strings separated by non-alphanumeric characters. |
Case insensitive match | If selected, ignores capitalization in comparisons. |
Optional lookup condition | If selected, you must also specify a Boolean expression. |
Expression | If this Optional lookup condition evaluates to True, a lookup is performed. |
Data input | The input field from the "D" connection containing the key to look up. |
Lookup table source | The lookup table field from the "T" connection containing keys. |
Allow multiple keys | If selected, allows multiple values for each key and creates multiple Result field columns up to the Number of keys specified. |
Number of keys | If Allow multiple keys is selected, you can specify up to 10 keys. |
Result field | The lookup table field from the "T" connection containing values. |
Store result value in | Output field to which the Result field values will be written. You can specify an existing field or enter a new field name. The field must be of a type and size compatible with the Result field. |
If input key is not found | If an expression is specified, the result of evaluating the expression will be written to the Result field if a data input key is not found in the lookup table. |
Configure the Table Lookup tool
Select the Table Lookup tool, and then go to the Configuration tab on the Properties pane.
Select the MATCH TYPE:
Entire Field Only: input key field and Lookup table key field must match exactly.
At Front Only: matches any substring from the beginning of the field.
At End Only: matches any substring from the end of the field.
Anywhere: matches any substring in the field.
If you select At Front Only, At End Only, or Anywhere, specify whether to Replace only matched data or Replace entire field.
Optionally, select the Whole words only box to match only strings separated by non-alphanumeric characters.
Optionally, select the Case insensitive match box to ignore capitalization in comparisons.
Optionally, select the Optional lookup condition box and specify an expression that evaluates to
True
orFalse
. If the condition evaluates asFalse
for a given record, no lookup is performed.In the DATA INPUT section, select a Key field from the list of available input fields. The Key field must be a text field. This specifies which field values Data Management will attempt to find in the lookup table.
In the LOOKUP TABLE SOURCE section, select a Key field and a Result field from the list of available input fields. The Key field must be a text field. However, the Result field can be of any type. This specifies the field Data Management searches for input key values, and the field containing the corresponding result values.
Key values and result values must be <65535 bytes.
Optionally, select Allow multiple keys and specify the maximum number of keys permitted. This creates multiple Result field columns up to the maximum number specified. See the example below, which assumes Key field
KEY
, a setting allowing up to three keys, and Result fieldNICKNAME
.
Lookup table | Result fields |
---|---|
JOHN,JON | KEY,NICKNAME1,NICKNAME2,NICKNAME3 |
JOHN,JONATHAN | JOHN,JON,JONATHAN,JOHNNY |
JOHN,JOHNNY |
|
In the OUTPUT RESULTS section, specify the name of the field that will hold the results in one of the following ways.
Select an existing field from the Store the result value in field list. The result field of the table input must be convertible to the type of the output field.
Enter a new field name. The new field will be created with the same type and size as the lookup table's Result field.
Optionally, enter an expression in the If input key is not found, use this expression for the result value box. If a data input key is not found in the lookup table, the result of evaluating the specified expression will be written to the output results field.
You can use any expression, even one using functions and input field values. For example, suppose you are using the lookup table to find yearly revenue for each of your customers, based on account number. However, your lookup table is incomplete.
The following expression estimates yearly revenue based on the number of employees (which you know from some other source and which is contained in the
NUMBER_OF_EMPL
field):NUMBER_OF_EMPL * 100000
.
Optionally, go to the Execution tab, and then set Web service options.
Create a lookup table for value replacement
To use the Table Lookup tool, you need a reference table containing the lookup values. This table can originate from any data source. It must contain a minimum of two columns: one for the "before" (or key) values, and one for the "after" (or result) values. If you do not have a pre-existing lookup table, you can create one using the following procedure.
Create a two-column spreadsheet using Microsoft Excel or a similar application. In the first column, list all of the non-conforming values you wish to replace. In the second column, list the value you wish to substitute.
Using the Save As command, save the file in CSV (Comma Separated Value) format. The result might look something like this:
EXTRA LARGE,XL
LARGE,L
LG,L
LRG,L
LRGE,L
MED,M
MEDIUM,M
MEIUM,M
SM,S
SMALL,S
Alternatively, you can use a text editor to create the CSV file directly.
Once your lookup table has been created, add a CSV Input tool to your project to read the lookup table, and connect the CSV Input tool to the "T" input of the Table Lookup tool.