
AO Multi Line Classifier

Overview

Advanced Object (AO) Multi Line Classifier parses and identifies diverse data from multiple text fields ("lines"). You can select global formatting and parsing options for all lines or tailor them to the expected content of each line. The macro accepts a single stream as input and produces one output stream containing the identified fields.

AO Multi Line Classifier configuration parameters

AO Multi Line Classifier has three sets of configuration parameters in addition to the standard execution options: Configuration, Name options, and All lines/Lines.

AO Multi Line Classifier Configuration tab

Input fields

Parameter

Description

Line 1...8

One or more fields containing diverse data.

  • Default: none

Global options

Parameter

Description

Output name components

Select to parse first found name into components.

  • Default: no

Output phone components

Select to parse found phone number into components.

  • Default: no

Validate/Parse SSN

Select to validate first found Social Security Number and parse it into components.

  • Default: no

Output unparsed

Select to output unparsed field components.

  • Default: no

Use large table

Select to use Data Management's comprehensive parsing look-up table. If you are resource-limited, you should leave this off.

  • Default: no

Classifier data source

You may specify an optional Classifier data source. This is a table in DLD format containing either two or three columns: TOKEN, SYMBOL, and (optionally) GENDER.

Parsing/gender data source

You may specify an optional Parsing/gender data source. This is a table in DLD format containing either two or three columns: TOKEN, SYMBOL, and (optionally) GENDER.

Limit output

Optionally specify the maximum number of output fields for each component type. You can specify a value from 1 to 16; the default setting of 0 (zero) outputs two values of the selected component per input line.

Parameter

Description

Max names

Manually limit the number of name fields output.

  • Default: none

Max firms

Manually limit the number of firm fields output.

  • Default: none

Max SSNs

Manually limit the number of SSN fields output.

  • Default: none

Max emails

Manually limit the number of email fields output.

  • Default: none

Max phones

Manually limit the number of phone fields output.

  • Default: none

Max addresses

Manually limit the number of address fields output.

  • Default: none

Max unparsed

Manually limit the number of extra fields output.

  • Default: none
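The relationship between a Limit output setting and the number of output fields can be sketched in Python. This is only an illustration, not Data Management code: the function name `effective_limit` is hypothetical, and the sketch assumes that "two values per input line" means twice the number of configured input lines.

```python
def effective_limit(setting: int, input_lines: int) -> int:
    """Return how many output fields of one component type to expect.

    Hypothetical helper: assumes the default (0) yields two values
    for each configured input line.
    """
    if setting == 0:
        # Default behavior described above: two values per input line.
        return 2 * input_lines
    if 1 <= setting <= 16:
        # An explicit limit caps the number of output fields.
        return setting
    raise ValueError("limit must be 0 (default) or a value from 1 to 16")
```

For example, under this assumption three input lines with Max phones left at 0 would yield six phone fields, while setting Max phones to 4 would yield four.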

AO Multi Line Classifier Name options tab

Parsing options

Parameter

Description

Prefer First/Last

For ambiguous two-name cases like "Scott Davis" and "Davis Scott", prefer Last/First interpretation over First/Last interpretation.

  • Default: no

Capitalization

Choose capitalization style of the output.

  • Default: original

Preserve dual LastName

If the name field might contain names with two last names, you can select this option to put both in a single last name field. If you have the name Mary Andrews Smith, selecting this option will write Andrews Smith to the OUT_LNAME1 field. If this option isn't selected, Andrews will be written to the OUT_MIDNAME1 field and Smith will be written to the OUT_LNAME1 field.

  • Default: no
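As a rough illustration of the Preserve dual LastName behavior, the following Python sketch reproduces the Mary Andrews Smith example. It is not the macro's implementation: the function name is hypothetical, and the sketch naively treats the first token as the first name and everything after it as the candidate last names.

```python
def assign_last_name(name: str, preserve_dual: bool) -> dict:
    """Toy model of the Preserve dual LastName option (hypothetical helper)."""
    # Naive split: the first token is treated as the first name and dropped.
    _, *rest = name.split()
    if preserve_dual or len(rest) < 2:
        # Option selected: all remaining tokens go to the last name field.
        return {"OUT_MIDNAME1": "", "OUT_LNAME1": " ".join(rest)}
    # Option cleared: the final token is the last name, the rest is the middle name.
    return {"OUT_MIDNAME1": " ".join(rest[:-1]), "OUT_LNAME1": rest[-1]}


selected = assign_last_name("Mary Andrews Smith", preserve_dual=True)
cleared = assign_last_name("Mary Andrews Smith", preserve_dual=False)
```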

Split hyphenated LastName

If the name field might contain names with hyphenated last names, you can select this option to store hyphenated last names in separate fields. If you have a last name of Watson-Jones, selecting this option will write Watson to the OUT_LNAME1 field and Jones to the OUT_LNAME1_2 field. If this option isn't selected, then Watson-Jones will be written to the OUT_LNAME1 field.

  • Default: no
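The Watson-Jones example can be sketched in Python as follows. This is a minimal illustration under the assumption that the split happens at the first hyphen; the function name is hypothetical and the code is not part of Data Management.

```python
def split_hyphenated(last_name: str, split: bool) -> dict:
    """Toy model of the Split hyphenated LastName option (hypothetical helper)."""
    if split and "-" in last_name:
        # Option selected: split at the first hyphen into two output fields.
        first_part, second_part = last_name.split("-", 1)
        return {"OUT_LNAME1": first_part, "OUT_LNAME1_2": second_part}
    # Option cleared: the hyphenated name stays in a single field.
    return {"OUT_LNAME1": last_name, "OUT_LNAME1_2": ""}
```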

Parse suffix

Select this option to distinguish between generational suffixes such as Jr and III, suffixes such as DR and PhD, and professional titles such as Finance Manager. An input name of the form "James Smith III, MD" will be output with "III" in the OUT_POSTNAME1 field and "MD" in the OUT_SUFFIX1 field. The name "Janice Jones, PhD, VP of Development" will be output with "PhD" in the OUT_SUFFIX1 field and "VP of Development" in the OUT_PROFTITLE1 field. Without this option selected, "PhD" and "VP of Development" would both go to the OUT_SUFFIX1 field.

  • Default: yes
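The routing described for Parse suffix can be sketched in Python. This is only a toy model of the documented examples: the lookup sets, the function name, and the idea that the trailing parts have already been separated from the name are all assumptions. This page does not state where generational suffixes go when the option is cleared, so the sketch simply sends everything to OUT_SUFFIX1 in that case.

```python
# Hypothetical lookup sets; the macro uses its own parsing tables.
GENERATIONAL = {"JR", "SR", "II", "III", "IV"}
CREDENTIALS = {"MD", "PHD", "DR"}


def route_trailing_parts(parts, parse_suffix=True):
    """Assign the parts that follow a name to output fields (toy model)."""
    out = {"OUT_POSTNAME1": [], "OUT_SUFFIX1": [], "OUT_PROFTITLE1": []}
    for part in parts:
        key = part.replace(".", "").upper()
        if not parse_suffix:
            # Option cleared: no distinction is made (assumption for "III").
            out["OUT_SUFFIX1"].append(part)
        elif key in GENERATIONAL:
            out["OUT_POSTNAME1"].append(part)
        elif key in CREDENTIALS:
            out["OUT_SUFFIX1"].append(part)
        else:
            # Anything else is treated as a professional title.
            out["OUT_PROFTITLE1"].append(part)
    return out
```

Calling `route_trailing_parts(["PhD", "VP of Development"])` places "PhD" under OUT_SUFFIX1 and "VP of Development" under OUT_PROFTITLE1, matching the Janice Jones example.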

No punctuation in titles

Select to remove the punctuation from honorary titles. This will strip the periods in titles like M.D.

  • Default: no

Initials at name end as suffix

Select this option to extract name suffixes expressed as initials from names.

  • Default: no

Genderize name before suffix

By default, the Name Parse macro assigns gender by analyzing data in this order: Prefix, Suffix, First Name, Middle Name. Select this option to change the order to Prefix, First Name, Middle Name, Suffix.

  • Default: no
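The effect of changing the analysis order can be sketched in Python. This is an illustration only: the function name and the `hints` dictionary are hypothetical, and the sketch assumes that the first component in the order that carries a gender determines the result.

```python
# Analysis orders as described above.
DEFAULT_ORDER = ("prefix", "suffix", "first", "middle")
NAME_BEFORE_SUFFIX_ORDER = ("prefix", "first", "middle", "suffix")


def assign_gender(hints: dict, name_before_suffix: bool = False) -> str:
    """Return the gender of the first component in the order that has one.

    `hints` maps a component name to a gender code, for example
    {"suffix": "M", "first": "F"}. Hypothetical structure for illustration.
    """
    order = NAME_BEFORE_SUFFIX_ORDER if name_before_suffix else DEFAULT_ORDER
    for component in order:
        gender = hints.get(component)
        if gender:
            return gender
    return ""
```

With hints of `{"suffix": "M", "first": "F"}`, the default order returns "M" because the suffix is consulted before the first name; with the option selected, the first name is consulted first and the result is "F".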

Treat "President" as

You can treat the word "President" as either Title or Prefix. If you select Title, then "President" will be put in the OUT_PROFTITLE1 field.

  • Default: title

Treat "C O" as "C/O"

Select to interpret the string "C O" as "Care Of".

  • Default: yes

AO Multi Line Classifier All lines/Lines tabs

All lines tab and Lines 1...8 tabs

If the Apply to all fields option is selected, Data Management will check every input field for the selected components. If you clear the Apply to all fields option, you can use the Lines 1...8 tabs to define different parsing configurations for each input line, for up to eight lines.

Parameter

Description

Apply to all fields

Select this to apply the configuration defined on the All lines tab to all input fields.

  • Default: yes

Check for

Parameter

Description

Name

Check data for personal names.

  • Default: no

Check for Firm

Check data for company names.

  • Default: no

Check for DBA

Check data for DBA (Doing Business As) names.

  • Default: no

Check for Address

Check data for address lines.

  • Default: no

Check for Email

Check data for email addresses.

  • Default: no

Check for Phone

Check data for phone numbers (North American data only).

  • Default: no

SSN

Check data for Social Security Numbers.

  • Default: no

Default if unclassified

Parameter

Description

Letters only

Specifies how to categorize data that cannot be otherwise classified. Data cannot contain numbers. Options are:

  • Name

  • Firm

  • Address

The default is Blank.

Letters and digits

Specifies how to categorize data that cannot be otherwise classified. Data may contain numbers. Options are:

  • Firm

  • Address

The default is Blank.

Compound (and)

Specifies how to categorize data that cannot be otherwise classified. Data may contain AND, &, OR. Options are:

  • Name

  • Firm

The default is Blank.
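The three Default if unclassified settings can be pictured with a short Python sketch. It is not the macro's logic: the function name and the `settings` dictionary are hypothetical, and the precedence (compound first, then digits, then letters only) is an assumption, since this page does not say which rule wins when more than one could apply.

```python
import re

# Options permitted for each setting, as listed above.
ALLOWED = {
    "letters_only": {"Name", "Firm", "Address"},
    "letters_and_digits": {"Firm", "Address"},
    "compound": {"Name", "Firm"},
}


def fallback_category(text: str, settings: dict) -> str:
    """Pick the fallback category for data that could not be classified."""
    if re.search(r"\b(?:AND|OR)\b|&", text, flags=re.IGNORECASE):
        rule = "compound"            # contains AND, &, or OR
    elif any(ch.isdigit() for ch in text):
        rule = "letters_and_digits"  # contains at least one digit
    else:
        rule = "letters_only"        # no digits
    choice = settings.get(rule, "")  # an empty string stands for Blank
    if choice and choice not in ALLOWED[rule]:
        raise ValueError(f"{choice!r} is not a valid option for {rule}")
    return choice
```

For instance, with `settings = {"compound": "Firm", "letters_and_digits": "Address"}`, the text "Smith & Sons" would fall back to Firm, "Unit 12B" to Address, and "Oakwood" would stay Blank.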

Configure AO Multi Line Classifier

  1. Select AO Multi Line Classifier.

  2. Go to the Configuration tab on the Properties pane.

  3. Select Line 1 and choose the input field. Repeat for any additional input fields.

  4. Review Global options and select desired output options. If you are not resource-limited, you may optionally select Use large table.

  5. Optionally, you may specify one or more additional parsing data sources. See Adding a classifier data source and adding a parsing/gender data source.

  6. Optionally, select items in the Limit output section to set the maximum number of each component type on output.

  7. Select the Name options tab to configure name parsing options.

  8. Select the All lines tab.

    • To apply the same parsing configuration to all input fields, select Apply to all fields, and then specify the desired parsing configuration.

    • To specify different parsing configurations for each input field, clear Apply to all fields, and then select the Lines 1-2 tab. Specify the desired parsing configuration for each field, and repeat on other Lines tabs for additional lines.

  9. Optionally, go to the Execution tab, and then set Web service options.
