Pattern Assembler

The Pattern Assembler tool is the final step in textual parsing. Once patterns have been matched in the Pattern Match tool, you may want to reassemble the records in different ways to convert the unstructured data into a structured form. The Pattern Assembler tool lets you combine multiple tokens within a record group into one or more fields. Alternatively, you can distribute tokens across wide record fields (multiple similar fields within a single record).

Pattern Assembler tool parameters

The Pattern Assembler tool has one set of configuration parameters in addition to the standard execution options:

Input group

Field that uniquely identifies each original record, as specified in the upstream Token Creation tool.

Input token

Field containing the tokens output by the Token Creation tool.

Input class

Field containing the classes output by the Pattern Match tool.

Output all available fields

If selected, includes all input fields in output.

Specify token reassembly
Class

Name of class. If no Input class is specified, this must be blank.

Specify token reassembly
Prefix

If defined, text to add to the output field before the first token of this Class.

Specify token reassembly
Always prefix

If selected, always add Prefix. By default, Prefix is only added to non-empty fields.

Specify token reassembly
Separator

Text to insert between tokens of this Class.

Specify token reassembly
Output size

If the output field is new, specifies its size.

Specify token reassembly
Output1...OutputN

Base name of the Output field to receive the assembled tokens. If more than one output field is specified, tokens of the class for this row are distributed sequentially to all the named output fields. This is optional and defaults to Output1 through OutputN.
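
The sequential distribution described for Output1...OutputN can be pictured with a short Python sketch. This is an illustration only, not the tool's implementation: the function name `distribute`, the field names, and the handling of edge cases (unfilled fields left empty, surplus tokens dropped) are all assumptions made for the example.

```python
def distribute(tokens, output_names):
    """Assign tokens of one class to the named output fields, in order."""
    # Assumption: output fields that receive no token are left empty.
    fields = {name: "" for name in output_names}
    # Assumption: tokens beyond the last named output field are dropped.
    for name, token in zip(output_names, tokens):
        fields[name] = token
    return fields


# Three tokens of the same class spread across four hypothetical wide fields.
months = distribute(["120", "95", "143"], ["Month1", "Month2", "Month3", "Month4"])
print(months)
```

Here the first token goes to Month1, the second to Month2, and so on, which mirrors the "one field per month" use of wide record fields.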

Configure the Pattern Assembler tool

  1. Select the Pattern Assembler tool, and then go to the Configuration tab on the Properties pane.

  2. Select Input group and select the unique ID field you specified in the Pattern Match tool.

  3. Select Input token and select the field containing the tokens output by the Token Creation tool.

  4. Select Input class and select the field containing the Pattern Match tool's Output Class.

  5. Optionally, select Output all available fields to send all input fields to output.

  6. Using the Specify token reassembly grid, create a sequence of "assembly instructions." For each row, specify some or all of the following items:

    • Class: The token class to process in this row.

    • Prefix: A string to add to the output field before the first token of this class. This is typically used to separate sections of the output field.

    • Always prefix: Select this column if the Prefix should always be added. By default, the Prefix is only added when the field already contains something.

    • Separator: A string to place between contiguous tokens of the same class.

    • Output size: If you are creating a new output field, this specifies the size. Since the same output field can be specified many times, the first Output size specified is used.

    • Output1: The output field to create.

    • Output2 through N: Additional output fields. If more than one output field is specified, tokens of the class for this row are distributed sequentially to all the named output fields. This is useful when separating the contents of a field into multiple homogeneous fields (for example, one field per month, or one field per household member).

  7. Optionally, go to the Execution tab, and then set Web service options.
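
To make the effect of the assembly instructions more concrete, the following Python sketch mimics how rows of Class, Prefix, Always prefix, and Separator might build up a single output field. It is a simplified model under stated assumptions, not the tool's actual logic: the function `assemble`, the dictionary keys, the class names (NAME, NUMBER, STREET), and the sample tokens are all hypothetical, and the sketch assumes that an instruction writes nothing when its class has no tokens in the record.

```python
from collections import defaultdict


def assemble(rows, instructions):
    """rows: (group, token, token_class) tuples; instructions: list of dicts."""
    # Collect the tokens of each record group, preserving input order.
    by_group = defaultdict(list)
    for group, token, token_class in rows:
        by_group[group].append((token, token_class))

    results = {}
    for group, tokens in by_group.items():
        record = {}
        for inst in instructions:
            matched = [t for t, c in tokens if c == inst["class"]]
            if not matched:
                continue  # assumption: no tokens of this class, nothing written
            current = record.get(inst["output"], "")
            # Add the prefix only if the field already has content,
            # unless always_prefix is set.
            if current or inst.get("always_prefix", False):
                current += inst.get("prefix", "")
            # Join the tokens of this class with the separator.
            record[inst["output"]] = current + inst.get("separator", "").join(matched)
        results[group] = record
    return results


rows = [
    (1, "John", "NAME"), (1, "Smith", "NAME"),
    (1, "42", "NUMBER"), (1, "Main", "STREET"), (1, "St", "STREET"),
    (2, "Ann", "NAME"), (2, "7", "NUMBER"),
]
instructions = [
    {"class": "NAME", "separator": " ", "output": "FullName"},
    {"class": "NUMBER", "output": "Address"},
    {"class": "STREET", "prefix": " ", "separator": " ", "output": "Address"},
]
result = assemble(rows, instructions)
print(result)
```

In this sketch, the STREET row reuses the Address field that the NUMBER row created, so its Prefix (a single space) is inserted only because Address already contains "42". Record 2 has no STREET tokens, so its Address stays "7".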
