Data Management's pattern tools are textual parsing tools designed to identify structured data within free-form text fields from sources like:
- Catalogs
- Inventories
- Contact records with combined name/address/phone fields
For simple pattern matching, use the Regex tool.
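As a rough illustration of the kind of single-pattern extraction this covers, the sketch below uses Python's standard `re` module rather than the Regex tool itself. The sample record, the phone pattern, and the `extract_phone` helper are all assumptions made for the example.

```python
import re

# Hypothetical free-form contact field combining name, address, and phone.
record = "Jane Doe, 12 Elm St, Springfield, (555) 123-4567"

# Illustrative pattern for a North American style phone number (an assumption,
# not a pattern shipped with the product).
PHONE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

def extract_phone(text):
    """Return the first phone-like substring, or None if there is none."""
    match = PHONE.search(text)
    return match.group(0) if match else None

phone = extract_phone(record)
```

A single expression like this works when the target has one predictable shape. Fields that mix several structures are usually better served by the multi-step process described next.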
A more structured pattern-matching process might look something like this:
- The Token Creation tool parses text input into components, or tokens.
- The Symbol Creation tool compares the tokens to a set of regular expressions and assigns descriptive symbols to the tokens.
- The Pattern Match tool analyzes groups of symbols and reclassifies each token according to its position within a matched pattern, defined using Data Management's Symbolic Regular Expression language. The Pattern Match tool includes a pattern analyzer and Symbol Sets to help you classify symbol sequences correctly.
- The Pattern Assembler tool reassembles the parsed data into the desired structured form.
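The four steps above can be sketched in plain Python to show how data flows from one stage to the next. This is a conceptual model only, not the tools' actual behavior: the symbol names, the rule list, the single hard-coded pattern, and every function name are hypothetical, and the pattern step uses simple sequence equality in place of the Symbolic Regular Expression language.

```python
import re
from collections import defaultdict

# Hypothetical symbol rules: (symbol, regex) pairs, tried in order.
SYMBOL_RULES = [
    ("NUM", re.compile(r"\d+")),
    ("STREET_TYPE", re.compile(r"(st|ave|rd|blvd)\.?", re.IGNORECASE)),
    ("WORD", re.compile(r"[A-Za-z]+")),
]

# Hypothetical pattern: a symbol sequence and the field name for each position.
PATTERN_SYMBOLS = ["NUM", "WORD", "STREET_TYPE"]
PATTERN_FIELDS = ["house_number", "street_name", "street_type"]

def create_tokens(record_id, text):
    """Step 1: split text on whitespace, keeping the record ID with each token."""
    return [(record_id, tok) for tok in text.split()]

def create_symbols(tokens):
    """Step 2: label each token with the first symbol whose regex matches it fully."""
    labeled = []
    for record_id, tok in tokens:
        symbol = next((s for s, rx in SYMBOL_RULES if rx.fullmatch(tok)), "OTHER")
        labeled.append((record_id, tok, symbol))
    return labeled

def match_pattern(labeled):
    """Step 3: if the symbol sequence equals the pattern, reclassify by position."""
    if [symbol for _, _, symbol in labeled] != PATTERN_SYMBOLS:
        return []
    return [(rid, field, tok) for (rid, tok, _), field in zip(labeled, PATTERN_FIELDS)]

def assemble(classified):
    """Step 4: regroup classified tokens into one structured record per ID."""
    records = defaultdict(dict)
    for rid, field, tok in classified:
        records[rid][field] = tok
    return dict(records)

def parse(record_id, text):
    """Run all four steps on one free-form field."""
    return assemble(match_pattern(create_symbols(create_tokens(record_id, text))))
```

In this sketch, `parse(1, "42 Elm St")` yields one structured record keyed by ID 1, while input that does not fit the pattern yields an empty result. Carrying the record ID through every stage is what lets the final step regroup tokens correctly.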
Because the pattern tools group tokens by record, they assume that each incoming record has a unique ID field, which downstream tools can use to reassemble records. If your data does not have a unique ID field, we recommend that you create one with the Generate Sequence tool before the data reaches the pattern tools.
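Conceptually, adding a sequence ID amounts to numbering the records before any tokenizing happens. The following is a minimal sketch of that idea in Python, not the Generate Sequence tool itself. The `add_sequence_id` helper and the `record_id` field name are assumptions for illustration.

```python
from itertools import count

def add_sequence_id(records, field="record_id", start=1):
    """Return copies of the records with a unique, increasing integer ID added."""
    ids = count(start)
    # The generated ID is written last so it always wins over any existing value.
    return [{**record, field: next(ids)} for record in records]

# Hypothetical input rows that have no ID of their own.
rows = add_sequence_id([{"text": "42 Elm St"}, {"text": "7 Oak Ave"}])
```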
In this section:
- AO Business Keyword Parser
- AO Business Standardize
- AO Classifier
- AO Email Parse
- AO Field Analyzer
- AO Field Validate
- AO Multi Line Classifier
- AO Multi Line Name Parser
- AO Name Genderize
- AO Phone Parse
- AO SSN Validate
- International Phone
- Name Parse
- Token Creation
- Symbol Creation
- Pattern Match
- Pattern Assembler
- Regex
- Regex Match Table