AO Classifier

Overview

Advanced Object (AO) Classifier extracts undefined data of one or more types from a single field. You can select the type(s) of data to be found in a given field. AO Classifier accepts a single stream as input and produces one output stream containing the identified fields.

Add a classifier data source

You can customize the operation of AO Classifier and AO Multi Line Classifier by defining an additional parsing data source. Review Data Management's parsing technology, built-in data dictionary, and a sample parsing data source before creating your own supplemental data source.

The supplemental data source must be a Data Management DLD file with three columns:

  • TOKEN: the text of the token extracted from the name field.

  • SYMBOL: the part of the name that the token represents.

  • GENDER: if the token is gender specific, GENDER is M or F, otherwise blank.

SYMBOL is one (or a combination) of the following symbols:

  • FN: first name (Aelena, Evo)

  • FP: first name prefix (Cpt, Sir)

  • LN: last name (Behlin, Looney)

  • LP: last name prefix (Mc, Vander)

  • LS: last name suffix (III, Jr)

  • LT: last name title (CEO, Trust)

  • FI: firm indicator (Company, Corporation, Incorporated)

  • WD: word (suppress word if it appears in Data Management parsing dictionaries)

Because TOKENs can be ambiguous, some SYMBOLs can be "overloaded" to indicate multiple possibilities. These compound symbols indicate that a token can be any one of the referenced name parts. For example, the SYMBOL FNLNLP indicates a TOKEN that may be a first name, a last name, or a last name prefix (such as Della or Santa).

The SYMBOLs FI (firm indicator) and WD (word) must be used on their own; they cannot be part of a compound symbol.

The compound symbols recognized by the macro are:

  • FNFP

  • FNFPLN

  • FNFPLNLT

  • FNFPLT

  • FNLN

  • FNLNLP

  • FNLNLS

  • FNLNLT

  • FNLP

  • FNLS

  • FNLT

  • FPLN

  • FPLNLT

  • FPLS

  • FPLT

  • LNLP

  • LNLS

  • LNLT

  • LPLS
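Before loading a supplemental data source, it can help to check each row against the symbol lists above. The following is a minimal sketch, assuming the rows have already been exported from the DLD file into plain Python strings; the function names `check_row` and `symbol_parts` are hypothetical and are not part of Data Management.

```python
# Symbols copied from the lists above.
BASE_SYMBOLS = {"FN", "FP", "LN", "LP", "LS", "LT", "FI", "WD"}
COMPOUND_SYMBOLS = {
    "FNFP", "FNFPLN", "FNFPLNLT", "FNFPLT", "FNLN", "FNLNLP", "FNLNLS",
    "FNLNLT", "FNLP", "FNLS", "FNLT", "FPLN", "FPLNLT", "FPLS", "FPLT",
    "LNLP", "LNLS", "LNLT", "LPLS",
}
VALID_GENDERS = {"M", "F", ""}


def check_row(token: str, symbol: str, gender: str = "") -> list[str]:
    """Return a list of problems found in one TOKEN/SYMBOL/GENDER row."""
    problems = []
    if not token.strip():
        problems.append("TOKEN is empty")
    # FI and WD never appear in COMPOUND_SYMBOLS, so a value such as
    # "FIWD" is rejected here without a separate rule.
    if symbol not in BASE_SYMBOLS and symbol not in COMPOUND_SYMBOLS:
        problems.append(f"SYMBOL {symbol!r} is not a recognized symbol")
    if gender not in VALID_GENDERS:
        problems.append(f"GENDER {gender!r} must be M, F, or blank")
    return problems


def symbol_parts(symbol: str) -> list[str]:
    """Split a symbol such as 'FNLNLP' into its two-letter parts."""
    return [symbol[i:i + 2] for i in range(0, len(symbol), 2)]
```

For example, `check_row("Della", "FNLNLP")` returns an empty list, and `symbol_parts("FNLNLP")` returns `["FN", "LN", "LP"]`.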

AO Classifier configuration parameters

AO Classifier has two sets of configuration parameters in addition to the standard execution options: Fields and Options.

AO Classifier Fields tab

Select input field

Parameter

Description

Input field

The input field whose contents are to be identified.

  • Default: none

Select output fields

Parameter

Description

Name

Select to output full name field.

  • Default: no

Name1 components

Select to parse first found name into components.

  • Default: no

Name2 components

Select to parse second found name into components.

  • Default: no

Firm

Select to output company name field.

  • Default: no

DBA

Select to output any DBA (Doing Business As) names.

  • Default: no

Address

Select to output address line.

  • Default: no

Email1

Select to output first found email address.

  • Default: no

Email2

Select to output second found email address.

  • Default: no

Phone1

Select to output first found phone number.

  • Default: no

Phone1 components

Select to parse and validate first found phone number into components.

  • Default: no

Phone2

Select to output second found phone number.

  • Default: no

Phone2 components

Select to parse and validate second found phone number into components.

  • Default: no

SSN1

Select to output first found Social Security Number.

  • Default: no

SSN1 components/validation

Select to parse first found Social Security Number into components: Area, Group, Sequence. Also outputs a Valid SSN Flag.

  • Default: no

SSN2

Select to output second found Social Security Number.

  • Default: no

SSN2 components/validation

Select to parse second found Social Security Number into components: Area, Group, Sequence. Also outputs a Valid SSN Flag.

  • Default: no
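The SSN component options split a nine-digit number into Area (first three digits), Group (next two), and Sequence (last four). The sketch below illustrates that split in Python. The validity check is an assumption: it applies the commonly cited structural rules for Social Security Numbers (Area is not 000, 666, or 900 to 999; Group is not 00; Sequence is not 0000), which may differ from the rules Data Management uses for its Valid SSN Flag. The function name and dictionary keys are hypothetical.

```python
import re


def parse_ssn(raw: str) -> dict:
    """Split an SSN into area, group, and sequence, with a validity flag."""
    digits = re.sub(r"\D", "", raw)  # drop dashes, spaces, and other non-digits
    if len(digits) != 9:
        return {"area": "", "group": "", "sequence": "", "valid": False}
    area, group, sequence = digits[:3], digits[3:5], digits[5:]
    # Assumed structural rules; not taken from the product documentation.
    valid = (
        area not in ("000", "666")
        and not area.startswith("9")
        and group != "00"
        and sequence != "0000"
    )
    return {"area": area, "group": group, "sequence": sequence, "valid": valid}
```

For example, `parse_ssn("123-45-6789")` yields Area `123`, Group `45`, and Sequence `6789`.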

Other

Select to output unidentified data.

  • Default: no

AO Classifier Options tab

Default if unclassified

Parameter

Description

Letters only

Specifies how to categorize data that cannot be otherwise classified. Data cannot contain numbers. Options are:

  • Name

  • Firm

  • Address

The default is Blank.

Letters and digits

Specifies how to categorize data that cannot be otherwise classified. Data may contain numbers. Options are:

  • Firm

  • Address

The default is Blank.

Compound (and)

Specifies how to categorize data that cannot be otherwise classified. Data may contain AND, &, OR. Options are:

  • Name

  • Firm

The default is Blank.
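The three Default if unclassified settings can be read as a fallback lookup keyed on what the leftover text contains. The following is a rough sketch of that idea, not the product's logic: the order in which the three cases are tested is an assumption, and the function and argument names are hypothetical. A blank setting is represented by an empty string.

```python
import re

# Matches the connectives named in the Compound (and) setting.
_COMPOUND = re.compile(r"\bAND\b|\bOR\b|&", re.IGNORECASE)


def fallback_category(text: str,
                      letters_only: str = "",
                      letters_and_digits: str = "",
                      compound: str = "") -> str:
    """Pick the configured fallback category for otherwise unclassified text."""
    if _COMPOUND.search(text):
        return compound            # text contains AND, &, or OR
    if any(ch.isdigit() for ch in text):
        return letters_and_digits  # text contains at least one digit
    return letters_only            # text contains no digits
```

With all three settings left at Blank, the function returns an empty string, meaning the data stays unclassified.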

Classifier data source

You may specify an optional Classifier data source. This is a table in DLD format containing either two or three columns: TOKEN, SYMBOL, and (optionally) GENDER.

Parsing/gender data source

You may specify an optional Parsing/gender data source. This is a table in DLD format containing either two or three columns: TOKEN, SYMBOL, and (optionally) GENDER.

Parsing behavior

Parameter

Description

Use large table

Select to use Data Management's comprehensive parsing lookup table. If you are resource-limited, you should leave this off.

  • Default: no

Treat "/" as AND

If the name field might contain two names separated by a slash ("/"), select this option to ensure that the name is parsed correctly.

  • Default: no
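As an illustration of what treating "/" as AND means for a name field, the sketch below splits a field into separate names when the option is on and leaves it untouched otherwise. This is a simplified model with a hypothetical function name; the actual parser handles many more cases.

```python
import re


def split_names(name_field: str, slash_as_and: bool = False) -> list[str]:
    """Split a name field into individual names, optionally treating '/' as AND."""
    text = name_field.replace("/", " AND ") if slash_as_and else name_field
    parts = re.split(r"\s+AND\s+", text, flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]
```

With the option on, `split_names("John Smith/Mary Jones", slash_as_and=True)` gives two names; with it off, the field comes back as a single string containing the slash.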

Fix reversed First/Last

Select this if you suspect that your records may have First name and Last name reversed.

  • Default: no

Preserve dual LastName

If the name field might contain names with two last names, you can select this option to put both in a single last name field. If you have the name Mary Andrews Smith, selecting this option will write Andrews Smith to the OUT_LNAME1 field. If this option isn't selected, Andrews will be written to the OUT_MIDNAME1 field and Smith will be written to the OUT_LNAME1 field.

  • Default: no
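The effect of Preserve dual LastName on the Mary Andrews Smith example can be sketched as follows. This is a simplified model that assumes the first whitespace-separated token is the first name; only the two output fields named above are shown, and the function name is hypothetical.

```python
def assign_last_name_fields(full_name: str, preserve_dual_last: bool = False) -> dict:
    """Show how the tokens after the first name map to OUT_MIDNAME1 and OUT_LNAME1."""
    rest = full_name.split()[1:]  # everything after the assumed first name
    if not rest:
        return {"OUT_MIDNAME1": "", "OUT_LNAME1": ""}
    if preserve_dual_last or len(rest) == 1:
        # Keep all remaining tokens together as the last name.
        return {"OUT_MIDNAME1": "", "OUT_LNAME1": " ".join(rest)}
    # Otherwise the final token is the last name and the rest is the middle name.
    return {"OUT_MIDNAME1": " ".join(rest[:-1]), "OUT_LNAME1": rest[-1]}
```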

Split hyphenated LastNames

If the name field might contain names with hyphenated last names, you can select this option to store hyphenated last names in separate fields. If you have a last name of "Watson-Jones", selecting this option will write "Watson" to the OUT_LNAME1 field and "Jones" to the OUT_LNAME1_2 field. If this option isn't selected, then "Watson-Jones" will be written to the OUT_LNAME1 field.

  • Default: no
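The Watson-Jones example can be expressed as a small sketch. It assumes the last name has already been isolated and splits on the first hyphen only; the function name is hypothetical.

```python
def split_hyphenated_last(last_name: str, split: bool = False) -> dict:
    """Map a last name to OUT_LNAME1 and OUT_LNAME1_2 depending on the option."""
    if split and "-" in last_name:
        first_part, second_part = last_name.split("-", 1)
        return {"OUT_LNAME1": first_part.strip(), "OUT_LNAME1_2": second_part.strip()}
    # Option off, or nothing to split: the whole value stays in OUT_LNAME1.
    return {"OUT_LNAME1": last_name, "OUT_LNAME1_2": ""}
```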

Parse suffix types

Select this option to distinguish between generational suffixes such as Jr and III, suffixes such as DR and PhD, and professional titles such as Finance Manager. An input name of the form "James Smith III, MD" will be output with "III" in the OUT_POSTNAME1 field and "MD" in the OUT_SUFFIX1 field. The name "Janice Jones, PhD, VP of Development" will be output with "PhD" in the OUT_SUFFIX1 field and "VP of Development" in the OUT_PROFTITLE1 field. If this option isn't selected, "PhD" and "VP of Development" will both be written to the OUT_SUFFIX1 field.

  • Default: yes

No punctuation in titles

Select to remove the punctuation from honorary titles. This will strip the periods in titles like M.D.

  • Default: no

Capitalization

Choose capitalization style of the output.

  • Default: original

Treat "President" as

You can treat the word "President" as either Title or Prefix. If you select Title, then "President" will be put in the OUT_PROFTITLE1 field.

  • Default: title

Treat "C O" as "C/O"

Select to interpret the string "C O" as "Care Of".

  • Default: yes
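One way to picture the "C O" option is as a text substitution applied before parsing. The sketch below is an assumption about the behavior rather than the product's implementation: it rewrites a standalone "C O" (bounded by whitespace or the ends of the string) as "C/O" and leaves everything else alone. The function name is hypothetical.

```python
import re

# "C" and "O" separated by whitespace, with no other characters attached.
_C_O = re.compile(r"(?<!\S)C\s+O(?!\S)", re.IGNORECASE)


def normalize_care_of(text: str, treat_c_o_as_care_of: bool = True) -> str:
    """Rewrite a standalone 'C O' as 'C/O' when the option is enabled."""
    if not treat_c_o_as_care_of:
        return text
    return _C_O.sub("C/O", text)
```

Requiring whitespace on both sides keeps names such as "MAC O'NEIL" from being rewritten by accident.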

Prefix options

Parameter

Description

Add prefix if none present

Select to add a prefix such as "Mr" or "Mrs" to names that don't have one. Use the other prefix options (below) to specify the default prefix.

  • Default: no

Default male prefix

Select a default male prefix from the list.

  • Default: MR

Default female prefix

Select a default female prefix from the list.

  • Default: MS

If multi-name use alt female prefix

Select to assign the alternate female prefix to the female name when a pair of parsed names includes a female.

  • Default: yes

Alt female prefix

If you selected If multi-name use alt female prefix, select the alternate female prefix to apply.

  • Default: MRS
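The prefix options interact, so it may help to see them as a single decision. The sketch below is one plausible reading of the table above, with the defaults taken from it; how the product determines gender and what it does when gender is unknown are assumptions, and the function and argument names are hypothetical.

```python
def choose_prefix(existing_prefix: str, gender: str, *,
                  add_if_missing: bool = False,
                  male_prefix: str = "MR",
                  female_prefix: str = "MS",
                  multi_name: bool = False,
                  use_alt_female: bool = True,
                  alt_female_prefix: str = "MRS") -> str:
    """Return the prefix for one parsed name under the given prefix options."""
    # A prefix already present is kept; nothing is added unless the option is on.
    if existing_prefix or not add_if_missing:
        return existing_prefix
    if gender == "M":
        return male_prefix
    if gender == "F":
        # Assumed: the alternate prefix applies only when two names were parsed.
        if multi_name and use_alt_female:
            return alt_female_prefix
        return female_prefix
    return ""  # assumed: no prefix is added when gender is unknown
```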

Configure AO Classifier

  1. Select AO Classifier.

  2. Go to the Fields tab on the Properties pane.

  3. Select Input field and choose the field to parse.

  4. In the Select output fields section, choose each field that you want to include on output.

  5. Select the Options tab, and then configure parsing behavior.

  6. Optionally, specify one or more additional parsing data sources. See Add a classifier data source and Parsing/gender data source.

  7. Optionally, go to the Execution tab, and then set Web service options.
