Textual pattern matching
Overview
The projects in the repository folder Samples\Basic\Textual pattern matching demonstrate techniques for parsing addresses, URLs, and catalogs.
Parsing addresses
Data Management has built-in tools for parsing addresses into their constituent parts, but those tools are intended for high-performance matching applications where multiple address candidates are desirable. When you simply want the most accurate address parsing possible, without the limitations of USPS address standardization, Data Management's pattern tools can produce better results. The sample project "parse address with patterns" shows how.
Parsing URLs and catalogs
The sample projects "parse catalog" and "parse URLs" demonstrate the four steps of parsing free-form text fields into structured data:
1. Split text into Tokens.
2. Analyze Tokens for patterns and assign a Symbol to each Token.
3. Pattern-match the string of Symbols and assign a Class to each Token.
4. Reassemble Tokens into output fields.
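The four steps above can be sketched outside the product in a few lines of Python. This is only an illustration of the general technique, not Data Management's pattern tools or the contents of the sample projects: the catalog line format, the symbol letters (A, W, N, D, X), the class names (sku, description, price), and every function name are assumptions made up for this example.

```python
import re

def tokenize(text):
    """Step 1: split the text into Tokens (here, simply on whitespace)."""
    return text.split()

def symbol_for(token):
    """Step 2: assign a one-character Symbol to a Token (hypothetical alphabet)."""
    if re.fullmatch(r"\d+\.\d{2}", token):
        return "D"  # decimal amount such as 19.99
    if token.isdigit():
        return "N"  # whole number
    if token.isalpha():
        return "W"  # plain word
    if token.isalnum():
        return "A"  # mixed letters and digits, such as a part number
    return "X"      # anything else

# Step 3 rule: one mixed token, then one or more words, then a decimal amount.
# Each named group is the Class assigned to the Tokens it covers.
RULE = re.compile(r"(?P<sku>A)(?P<description>W+)(?P<price>D)")

def classify(symbols):
    """Step 3: pattern-match the Symbol string and return one Class per Token."""
    classes = ["unknown"] * len(symbols)
    match = RULE.fullmatch(symbols)
    if match:
        for name in match.groupdict():
            start, end = match.span(name)
            for i in range(start, end):
                classes[i] = name
    return classes

def reassemble(tokens, classes):
    """Step 4: join the Tokens of each Class into one output field."""
    fields = {}
    for token, cls in zip(tokens, classes):
        fields.setdefault(cls, []).append(token)
    return {cls: " ".join(parts) for cls, parts in fields.items()}

def parse(text):
    tokens = tokenize(text)
    symbols = "".join(symbol_for(t) for t in tokens)
    return reassemble(tokens, classify(symbols))

if __name__ == "__main__":
    print(parse("SKU1234 Red Cotton Shirt 19.99"))
```

Because each Token maps to exactly one Symbol character, a position in the Symbol string is also an index into the Token list, which is what lets the match spans in step 3 be turned back into per-Token Classes. Input that does not fit the rule is kept whole in an "unknown" field rather than discarded.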