Redpoint's approach to hygiene, matching, and identity resolution
Redpoint’s approach to Identity Resolution (IDR) uses sophisticated and comprehensive hygiene procedures and probabilistic techniques to match individuals and households using a variety of available data. This enables the unique identification of individuals and the detection of their corresponding associations. IDR processes new and updated records against previously identified matches, and it can process multiple disparate data sources simultaneously.
The Redpoint IDR process flow involves:
Data collection
Hygiene, enhancement, and standardization of Personally Identifiable Information (PII) data
Match process
Match ID assignment
Output / storage
Redpoint’s IDR process is non-destructive, meaning that it:
Maintains the original source input data
Supplements and enhances the PII data
Assigns a persistent INDIVIDUAL_ID and HOUSEHOLD_ID to each customer account
These keys are of paramount importance to your business; they allow you to:
Identify an individual or household
Reduce marketing spend by avoiding duplicate messages
Identify any cross-channel / cross-brand activity
Support marketing compliance
Data collection
Your data may come from a variety of sources, each with its own layout. Redpoint therefore collates the data into several distinct entities, each allowing for multiple sets of PII data per customer account (such as multiple addresses per account):
Customer accounts
Names
Addresses
Phone numbers
Email addresses
Social accounts / IDs
Custom attributes
Hygiene
For matches to be performed efficiently and accurately, hygiene procedures are run against the data sets.
Phone hygiene
Phone numbers are adjusted to the E164.3 international standard. If country codes are not available, the country names/codes of the address are used. Unnecessary characters and data are removed, including:
+ as a dialing prefix
Spaces and any non-numeric characters (brackets, parentheses, hyphens, periods, etc.)
Extension numbers
Prefixing group characters when international numbers are used (often 0 or 1)
Country | Source Phone Number | E164.3 international number |
---|---|---|
US | | |
UK | | |
FR | | |
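The cleanup steps above can be sketched in a few lines of Python. This is a minimal illustration, not Redpoint's implementation: the function name `normalize_phone`, the tiny country tables, and the sample numbers in the usage note are all assumptions made for the example, and the output is simply the country calling code followed by the national number, with no `+`.

```python
import re

# Illustrative lookup tables (assumed for this sketch, deliberately tiny).
CALLING_CODES = {"US": "1", "UK": "44", "FR": "33"}
TRUNK_PREFIXES = {"US": "1", "UK": "0", "FR": "0"}

# Trailing extension such as "x123", "ext. 123" or "extension 123".
_EXTENSION = re.compile(r"\s*(?:ext\.?|extension|x)\s*\d+\s*$", re.IGNORECASE)


def normalize_phone(raw: str, country: str) -> str:
    """Return digits only: country calling code + national number."""
    code = CALLING_CODES[country]
    trunk = TRUNK_PREFIXES[country]

    text = _EXTENSION.sub("", raw.strip())   # drop extension numbers
    has_plus = text.startswith("+")          # "+" marks an international number
    digits = re.sub(r"\D", "", text)         # drop "+", spaces, punctuation

    # An international number already carries the country code; remove it so
    # the trunk prefix (for example the "(0)" in "+44 (0)20 ...") is exposed.
    if has_plus and digits.startswith(code):
        digits = digits[len(code):]
    # Drop the national trunk prefix, if present.
    if digits.startswith(trunk):
        digits = digits[len(trunk):]
    return code + digits
```

With made-up inputs, `normalize_phone("+44 (0)20 7946 0958", "UK")` and `normalize_phone("020 7946 0958", "UK")` both reduce to the same digit string, which is what makes the cleaned value usable as a match key. Numbers that carry a country code without a leading `+` are ambiguous and are not handled by this sketch.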
Email hygiene
Email addresses are validated for proper format and syntax, and common errors are corrected:
Incorrect characters (e.g., comma instead of period)
Embedded spaces
Misspellings (e.g., yahooo.com → yahoo.com, .con → .com)
Source Email Address | Output Address |
---|---|
| |
| |
| |
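As a rough illustration of the corrections listed above, the following Python sketch removes spaces, swaps commas for periods, and applies a small lookup of known misspellings. The function names, the lookup tables, and the simple syntax check are assumptions for the example; the only corrections taken from this page are yahooo.com → yahoo.com and .con → .com.

```python
import re

# Known misspellings; a real list would be far longer (assumed structure).
DOMAIN_FIXES = {"yahooo.com": "yahoo.com"}
TLD_FIXES = {".con": ".com"}

# Deliberately loose syntax check: something@something.something
_SYNTAX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_email(raw: str) -> str:
    """Apply the common-error corrections to one email address."""
    email = "".join(raw.split())          # remove all whitespace
    email = email.replace(",", ".")       # comma typed instead of period
    local, sep, domain = email.rpartition("@")
    if not sep:
        return email                      # no "@": nothing more to fix here
    domain = domain.lower()               # domains are case-insensitive
    for bad, good in TLD_FIXES.items():
        if domain.endswith(bad):
            domain = domain[: -len(bad)] + good
    domain = DOMAIN_FIXES.get(domain, domain)
    return f"{local}@{domain}"


def looks_valid(email: str) -> bool:
    """Format check only; deliverability is not tested."""
    return bool(_SYNTAX.match(email))
```

For example, the made-up input `"jane doe@yahooo,con"` is cleaned to `"janedoe@yahoo.com"`, which then passes `looks_valid`.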
For matching purposes only, superfluous characters and strings are removed from Gmail addresses, such as:
Source Email Address | Output Address |
---|---|
| |
| |
| |
Email verification (deliverability and domain availability) is not performed.
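The Gmail handling described above can be sketched as a separate match key that leaves the stored address untouched. This page does not list exactly which characters and strings are removed, so the sketch assumes the two behaviors Gmail itself ignores on delivery: periods in the username and anything after a `+`. The function name `gmail_match_key` is hypothetical.

```python
def gmail_match_key(email: str) -> str:
    """Return a key for matching only; the original address is not changed."""
    cleaned = email.strip()
    local, sep, domain = cleaned.rpartition("@")
    if not sep or domain.lower() != "gmail.com":
        return cleaned                     # non-Gmail addresses pass through
    local = local.split("+", 1)[0]         # assumed: drop "+tag" suffix
    local = local.replace(".", "").lower() # assumed: drop periods
    return f"{local}@gmail.com"
```

With made-up inputs, `"Jane.Doe+news@gmail.com"` and `"janedoe@gmail.com"` produce the same key, while `"jane.doe@example.com"` is returned as-is.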
Address hygiene
Addresses are parsed, corrected, completed, and standardized against country-specific postal databases. The benefit of performing hygiene on addresses is two-fold: improved address quality (and subsequently improved deliverability) and improved match quality.
Suppression lists
PII attributes often contain values that lead to erroneous matches or incorrectly collapse large sets of records into a single match group. These values usually result from test data, system defaults, or bad data provided by the customer. Redpoint IDR lets you use “suppressions” to exclude such values from the matching process, preventing the erroneous matches. Frequently recurring values are the most common way to identify possible suppressions, but suppression lists can also exclude values that are less widespread yet still lead to invalid results. Suppression lists consist of:
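Attribute | Suppression type | Example |
---|---|---|
Phone | Phone number | |
Phone | Pattern | %-555-% |
Email | Email address | noemail@noemail.com |
Email | Domain | fakeemail.com |
Email | Username | sales, contact, info, admin |
Address | Exact | |
Address | Suppression is contained at the beginning of the address | |
Address | Suppression is contained anywhere in the address | |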
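A suppression check along these lines could look like the Python sketch below. It is illustrative only: it assumes that `%` in a phone pattern behaves like a SQL LIKE wildcard, the function and constant names are hypothetical, and the address entries are invented placeholders because this page gives no address examples.

```python
import re

PHONE_PATTERNS = ["%-555-%"]
SUPPRESSED_EMAILS = {"noemail@noemail.com"}
SUPPRESSED_DOMAINS = {"fakeemail.com"}
SUPPRESSED_USERNAMES = {"sales", "contact", "info", "admin"}
# Hypothetical address entries, invented for this example.
ADDRESS_EXACT = {"123 TEST STREET"}
ADDRESS_STARTS_WITH = ["GENERAL DELIVERY"]
ADDRESS_CONTAINS = ["DO NOT MAIL"]


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a LIKE-style pattern ('%' = any run of characters)."""
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$")


def phone_suppressed(phone: str) -> bool:
    return any(like_to_regex(p).match(phone) for p in PHONE_PATTERNS)


def email_suppressed(email: str) -> bool:
    email = email.strip().lower()
    username, _, domain = email.rpartition("@")
    return (
        email in SUPPRESSED_EMAILS
        or domain in SUPPRESSED_DOMAINS
        or username in SUPPRESSED_USERNAMES
    )


def address_suppressed(address: str) -> bool:
    addr = " ".join(address.upper().split())   # normalize case and spacing
    return (
        addr in ADDRESS_EXACT
        or any(addr.startswith(p) for p in ADDRESS_STARTS_WITH)
        or any(s in addr for s in ADDRESS_CONTAINS)
    )
```

A value that is suppressed would simply be left out of the match keys for that record; the record itself is kept, in line with the non-destructive approach described earlier.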
Matching process
Matching involves comparing records both within the source itself and against the customer master in order to determine match groupings. While comparing every incoming source record to all other source records and to every customer master record would be ideal, it is not practical. As the customer master grows, the number of pairwise comparisons grows quadratically, resulting in a huge number of comparisons that yield little to no benefit, as most would never result in a match.
Redpoint’s IDR optimizes the process by utilizing loose groupings to identify match candidates rather than processing incoming records against the entire customer master.
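Loose grouping of this kind is commonly called blocking. The Python sketch below shows the general idea under stated assumptions: the choice of grouping keys (email, phone, postal code plus surname initial), the record field names, and the function names are all hypothetical, since this page does not specify which groupings Redpoint uses.

```python
from collections import defaultdict


def blocking_keys(record: dict) -> set:
    """Cheap, loose keys; records sharing any key become match candidates."""
    keys = set()
    if record.get("email"):
        keys.add(("email", record["email"].lower()))
    if record.get("phone"):
        keys.add(("phone", record["phone"]))
    if record.get("postal_code") and record.get("last_name"):
        initial = record["last_name"][:1].upper()
        keys.add(("postal_initial", record["postal_code"], initial))
    return keys


def build_index(master: dict) -> dict:
    """Map each blocking key to the IDs of master records that carry it."""
    index = defaultdict(set)
    for record_id, record in master.items():
        for key in blocking_keys(record):
            index[key].add(record_id)
    return index


def find_candidates(record: dict, index: dict) -> set:
    """Union of master record IDs that share at least one key with `record`."""
    found = set()
    for key in blocking_keys(record):
        found |= index.get(key, set())
    return found
```

Only the candidates returned by `find_candidates` would then go through the stricter match rules, so the cost per incoming record depends on the size of its groups rather than on the size of the whole customer master.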
Once a full set of match candidates is found, a tighter set of match rules is applied. The match rules employed in the IDR process are configurable; the exhaustive set of rules is:
Full name / address
Full name / phone
Full name / email
Full name / social
Full name / custom attribute
Email only (in the case that no other PII element is populated)
Phone only (in the case that no other PII element is populated)
First name + email + mobile phone
First name + email + address line 1 + postal code 1
First name + mobile phone + address line 1 + postal code 1
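One way to picture a configurable rule set is as a table of field combinations that must all agree. The Python sketch below is a simplification under stated assumptions: it uses exact equality on already-cleaned values rather than the probabilistic comparison described at the top of this page, it covers only a few of the rules, and the field names and function names are hypothetical.

```python
FULL_NAME = ("first_name", "last_name")

# A subset of the rules above, expressed as the fields that must agree.
RULES = {
    "full_name_address": FULL_NAME + ("address_line_1", "postal_code"),
    "full_name_phone": FULL_NAME + ("phone",),
    "full_name_email": FULL_NAME + ("email",),
    "first_name_email_mobile": ("first_name", "email", "mobile_phone"),
}


def rule_matches(a: dict, b: dict, fields: tuple) -> bool:
    """True when every field is populated in `a` and equal in `b`."""
    return all(a.get(f) and a.get(f) == b.get(f) for f in fields)


def matching_rules(a: dict, b: dict, enabled: dict = RULES) -> list:
    """Names of the enabled rules under which the two records match."""
    return [name for name, fields in enabled.items() if rule_matches(a, b, fields)]
```

Disabling a rule in this sketch amounts to passing a smaller `enabled` dictionary; two records would be treated as a match when `matching_rules` returns at least one rule name.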
The example below shows:
Expansion of the new customer set to match against
Identification of possible customer data groups to allow the loose selection of match candidates
Matching candidates applied against the new match sets
Identity resolution results
Match ID assignment and management
Persistence
The persistence of match IDs is critical to the IDR process and to tracking the journey an individual/household takes over time. As match groups are formed, a match ID is assigned, and that ID does not change unless data or an event warrants a change to the match group. Match IDs are assigned to all records, regardless of whether they are part of a match group or are a singleton. Match IDs become the master keys representing people, households, or any other matchable entities.
Reassignment and merging of individuals/households
IDR can intelligently merge separate individuals and/or households when new records provide information linking separate match groups (e.g., a National Change of Address (NCOA) update or a record augmented with new data). IDR also supports manual intervention, so that separate customer accounts can be forced to merge into a single individual.
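Merging match groups while keeping IDs stable can be modeled with a union-find (disjoint-set) structure. The sketch below is one possible model, not Redpoint's documented behavior: the class name `MatchGroups`, the integer IDs, and the policy that the lowest ID survives a merge are all assumptions made for the example.

```python
class MatchGroups:
    """Tracks which match IDs have been merged into which surviving ID."""

    def __init__(self):
        self._parent = {}

    def add(self, match_id: int) -> None:
        """Register a new match ID as its own group (a singleton)."""
        self._parent.setdefault(match_id, match_id)

    def find(self, match_id: int) -> int:
        """Return the surviving ID for the group containing `match_id`."""
        root = match_id
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited ID straight at the survivor.
        while self._parent[match_id] != root:
            next_id = self._parent[match_id]
            self._parent[match_id] = root
            match_id = next_id
        return root

    def merge(self, a: int, b: int) -> int:
        """Merge two groups; assumed policy: the lowest ID survives."""
        root_a, root_b = self.find(a), self.find(b)
        survivor, retired = min(root_a, root_b), max(root_a, root_b)
        self._parent[retired] = survivor
        return survivor
```

Because retired IDs still resolve to the survivor through `find`, downstream data keyed on an older ID can be mapped to the current one. Splitting a group apart is not covered by this structure and would need a re-evaluation of the affected records.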
Break-aparts and forced matches
Similarly, IDR can split an individual/household into separate individuals/households when new records are loaded or existing data changes, causing ambiguity within the match group. IDR also supports manual splitting.
Output
The final stage of the IDR process is to add/update the customer master tables and primary tables within the CDP, ensuring each customer account is assigned an individual ID and household ID.