Symbol Creation
Overview
Depending on your input data, you can use the Symbol Creation tool, the Table Lookup tool, or both tools as the second step of textual parsing. Like the Table Lookup tool, the Symbol Creation tool associates a value (or symbol) with each input key (or token). However, the tools differ in the way they perform their matching:
The Table Lookup tool searches for input keys in a lookup table, and for keys that are found, outputs the corresponding result value. It's good for parsing input with a known set of values, such as city and state names or catalog items.
The Symbol Creation tool compares an input field (usually the token field generated with the Token Creation tool) to one or more patterns (regular expressions) and assigns a descriptive symbol to each value that matches. Symbols identify the kind of token (for example, NUMBER or WORD). The Symbol Creation tool is good for parsing input with determinate patterns, such as phone numbers, Social Security numbers, and email addresses.
For example, the following patterns will classify phone numbers and Social Security numbers.
Pattern | Symbol |
---|---|
'(' \d{3} ')' \d{3} '-' \d{4} | PHONE |
\d{3} '-' \d{2} '-' \d{4} | SSN |
The patterns above work only if you don't break tokens on '(', ')', or '-'. If you break tokens on those characters, you will have to recognize these items using higher-level patterns in the Pattern Match tool.
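To make the matching concrete, here is a minimal Python sketch of the same idea. It is not the tool's implementation. It assumes that the quoted literals in the table translate to escaped characters in Python's `re` syntax, that the spaces in the tool's patterns are only separators, and that a pattern must match the whole token. The `classify` helper is hypothetical.

```python
import re

# Python translations of the two table patterns (an assumption: the tool's
# quoted literals such as '(' become escaped characters in Python's syntax).
PATTERNS = [
    (re.compile(r"\(\d{3}\)\d{3}-\d{4}"), "PHONE"),
    (re.compile(r"\d{3}-\d{2}-\d{4}"), "SSN"),
]


def classify(token):
    """Return the symbol of the first pattern matching the whole token, else None."""
    for pattern, symbol in PATTERNS:
        if pattern.fullmatch(token):
            return symbol
    return None


print(classify("(303)555-0142"))  # PHONE
print(classify("123-45-6789"))    # SSN
print(classify("hello"))          # None
```

Each token here is a single unbroken string. If the tokenizer had split on the parentheses or hyphens, neither pattern could match.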
Symbol Creation tool configuration parameters
The Symbol Creation tool has one set of configuration parameters in addition to the standard execution options.
Parameter | Description |
---|---|
Input token | Field containing the tokens output by the Token Creation tool. |
Output symbol | Optional. Text field in the output record that will receive symbols. Defaults to a new field named SYMBOL. |
Default symbol | If specified, symbol assigned to tokens that do not match any pattern. |
Case insensitive match | If selected, capitalization is ignored in pattern comparisons. |
Specify patterns | Patterns (regular expressions) and their associated Symbols. |
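The parameters above map naturally onto a small function. The following Python sketch is illustrative only: `create_symbols`, its argument names, the `TOKEN` field name, and the dictionary-based records are hypothetical stand-ins, not the tool's API. It also assumes whole-token matching and that the first matching pattern wins.

```python
import re


def create_symbols(records, patterns, input_token="TOKEN",
                   output_symbol="SYMBOL", default_symbol=None,
                   case_insensitive=False):
    """Return copies of records with a symbol field added (hypothetical helper)."""
    flags = re.IGNORECASE if case_insensitive else 0
    compiled = [(re.compile(p, flags), s) for p, s in patterns]
    result = []
    for record in records:
        symbol = default_symbol  # used when no pattern matches
        for regex, candidate in compiled:
            if regex.fullmatch(record[input_token]):
                symbol = candidate
                break  # assumption: stop at the first matching pattern
        result.append({**record, output_symbol: symbol})
    return result


rows = [{"TOKEN": "Main"}, {"TOKEN": "42"}, {"TOKEN": "#"}]
patterns = [(r"[a-z]+", "WORD"), (r"\d+", "NUMBER")]
for row in create_symbols(rows, patterns, default_symbol="OTHER",
                          case_insensitive=True):
    print(row)
```

With `case_insensitive=True`, the token `Main` matches `[a-z]+` and receives `WORD`. With the option off, it would fall through to the default symbol instead.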
Configure the Symbol Creation tool
Select the Symbol Creation tool.
Go to the Configuration tab on the Properties pane.
Select Input token and choose the field containing the tokens output by the Token Creation tool.
Optionally, select Output symbol and specify an existing or new field to receive the generated symbol values. By default, a new field named SYMBOL receives symbol values.
Optionally, enter a default descriptive text value or symbol in the Default symbol box. This symbol will be assigned to tokens that do not match any pattern.
Optionally, select the Case insensitive match box to ignore case in pattern comparisons.
Select the Specify patterns grid and enter Patterns (regular expressions) and their associated Symbols. Patterns are evaluated in the order they appear in the grid. Use the grid's controls to move a pattern up or down in the order, or to delete a pattern.
Optionally, go to the Execution tab, and then set Web service options.
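Because patterns are evaluated in grid order, placing a general pattern before a more specific one can hide the specific one. The following Python sketch illustrates this, assuming the first matching pattern wins and that patterns match the whole token. The `first_symbol` helper and the ZIP and NUMBER symbols are hypothetical.

```python
import re


def first_symbol(token, patterns):
    """Return the symbol of the first pattern that matches the whole token."""
    for pattern, symbol in patterns:
        if re.fullmatch(pattern, token):
            return symbol
    return None


general_first = [(r"\d+", "NUMBER"), (r"\d{5}", "ZIP")]
specific_first = [(r"\d{5}", "ZIP"), (r"\d+", "NUMBER")]

print(first_symbol("80301", general_first))   # NUMBER: ZIP is never reached
print(first_symbol("80301", specific_first))  # ZIP
print(first_symbol("42", specific_first))     # NUMBER
```

Under these assumptions, ordering the grid from most specific to most general keeps narrow patterns from being shadowed.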