Data ingestion basics

Overview

Data ingestion follows the same general pattern regardless of the data being loaded. This page introduces the basics of that pattern, and subsequent sections elaborate on them. We start with how data is represented in the platform, then cover what feed layouts are and how they are used when configuring sources and feeds to load data.

How is the data represented within the platform?

There is a standardized data model organized into subject areas.
Subject areas may be common or specific to industries/use cases/client requirements.
All Redpoint CDP deployments contain the core subject areas, which include Core feed layouts as well as additional industry- or client-specific data structures (refer to Retail feed layouts).
This data model is standardized but extensible. Below are a few groupings of objects that make up the data model along with a visual that shows how the objects support each other to materialize the CDP for a retail client. This pattern would look similar for other verticals, with the elements in green reflecting those industry specifics.
- Base and extensions tables
- Identity resolution
- Lookup tables
- Aggregations

What is a feed layout?

A feed layout is a template for importing data for a specific subject area (customers, transactions, loyalty, etc.). Each template defines:
- All expected fields
- Required/optional indicators
- Keys
There is a predefined set of feed layouts that include commonly used subject areas.
Many feed layouts include associated extension tables to support extensibility.
Custom feed layouts can be enabled for vertical-specific or client-specific use cases.
Feed layouts support ingestion by letting you map source data into the standardized input format that Redpoint CDP expects. Data provided in that format initiates a standard processing flow:
- Validation
- Profiling
- Identity resolution
- Transformation
- Archival
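
As a rough illustration of what a feed layout carries (expected fields, required/optional indicators, and keys), the following Python sketch models a layout and checks one record against it. The class names, the field names, and the `validate` method are hypothetical and exist only for this example; they are not the Redpoint CDP layout definitions or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """One expected field in a (hypothetical) feed layout."""
    name: str
    required: bool = False
    is_key: bool = False


@dataclass(frozen=True)
class FeedLayout:
    """A template for one subject area: all expected fields, in order."""
    subject_area: str
    fields: Tuple[FieldSpec, ...]

    def validate(self, record: Dict[str, str]) -> List[str]:
        """Return the problems found in one record (empty list if none)."""
        problems = []
        known = {spec.name for spec in self.fields}
        # Required fields and key fields must be present and non-blank.
        for spec in self.fields:
            value = record.get(spec.name, "")
            if (spec.required or spec.is_key) and not value.strip():
                problems.append(f"missing required field: {spec.name}")
        # Fields that the layout does not define are reported, not dropped.
        for name in record:
            if name not in known:
                problems.append(f"unexpected field: {name}")
        return problems


# Illustrative layout only; these field names are assumptions.
customer_layout = FeedLayout(
    subject_area="customers",
    fields=(
        FieldSpec("source_customer_id", required=True, is_key=True),
        FieldSpec("email"),
        FieldSpec("last_name", required=True),
    ),
)
```

Calling `customer_layout.validate(record)` on a record with a blank key or an undefined column returns a list of messages describing each problem, which is the kind of check a small sample file helps you exercise early.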

Sources and feeds

Sources typically represent individual source systems (e.g., CRM, Ecommerce, POS).
Each source may provide data through one or more feeds (e.g., customers, transactions, locations).
All feed data must be provided based on a specified feed layout.
Conduct an inventory of all desired sources and feeds as part of the Redpoint CDP implementation process.

Source and feed inventory

The following table is an example of how you can start to document your list of sources, feeds, and associated descriptions.

Source name	Feed	Source description
Salesforce	Customer Data	Customer account information, including PII (Personally Identifiable Information).
Salesforce	Transaction Data	Transactional data for known customers (ecommerce).
Salesforce	Marketing Preference Data	Email `Opt_in` and `Opt_Out` data associated with customer accounts.
Internal Warehouse	Loyalty Data	Loyalty account information.
Internal Warehouse	Transaction Data	Transaction data for physical stores (POS).
Customer Service	Account and Preference Data	Updated customer information or preference information related to a customer service interaction (e.g., “Please remove me from the Direct Mail communications”).

Source and feed inventory with feed layout alignment

The next step is to review each source and determine which data topic(s) and feed layout(s) its data aligns with.

Additional details on how to determine this alignment as well as next steps are discussed in the section Data Ingestion Details.

Source name	Feed	Feed layout target 1	Feed layout target 2
Salesforce	Customer Data	Party Profile	Customer
Salesforce	Transaction Data	Transaction	N/A
Salesforce	Marketing Preference Data	Contact Authorization	N/A
Internal Warehouse	Loyalty Data	Account	Party Profile
Internal Warehouse	Transaction Data	Transaction	Tender
Customer Service	Account and Preference Data	Account	Contact Authorization
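
If you keep the inventory in a machine-readable form, it becomes easy to see which feeds will populate each feed layout. The sketch below encodes the example rows from the table above as plain Python data and groups them by feed layout target. The data structure and the `feeds_by_layout` helper are assumptions made for illustration, not a format that Redpoint CDP requires.

```python
from collections import defaultdict

# Example rows from the inventory table: (source, feed, feed layout targets).
# Targets listed as "N/A" in the table are simply left out.
INVENTORY = [
    ("Salesforce", "Customer Data", ["Party Profile", "Customer"]),
    ("Salesforce", "Transaction Data", ["Transaction"]),
    ("Salesforce", "Marketing Preference Data", ["Contact Authorization"]),
    ("Internal Warehouse", "Loyalty Data", ["Account", "Party Profile"]),
    ("Internal Warehouse", "Transaction Data", ["Transaction", "Tender"]),
    ("Customer Service", "Account and Preference Data",
     ["Account", "Contact Authorization"]),
]


def feeds_by_layout(inventory):
    """Group (source, feed) pairs under each feed layout they target."""
    grouped = defaultdict(list)
    for source, feed, layouts in inventory:
        for layout in layouts:
            grouped[layout].append((source, feed))
    return dict(grouped)


if __name__ == "__main__":
    for layout, feeds in sorted(feeds_by_layout(INVENTORY).items()):
        print(layout, "<-", feeds)
```

Grouping this way highlights layouts fed by more than one source, such as Transaction in this example, where field mappings from different systems need to agree.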

At a high level, these are your next steps:

Using the list of sources and target feed layouts, define a detailed mapping from each source data field to the corresponding feed layout field.
Convert your source data and feeds into the format of the target feed layout.
In the Redpoint CDP Web UI, define each source and feed, and provide additional details for each feed, such as the file names and the cadence at which the source feed will be sent to Redpoint CDP.
Create a sample file and provide it to the platform for data ingestion. We recommend starting with a small sample file of 5-10 records, choosing records that represent several different types of values for each field provided, so that you can establish a baseline validation for each feed layout.
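
The mapping and sample-file steps above can be sketched in a few lines of Python. This example assumes the feed is delivered as comma-delimited text with a header row, which this page does not specify, and the source column names, layout field names, and helper functions are all hypothetical.

```python
import csv
import io

# Hypothetical mapping: source system column -> feed layout field.
FIELD_MAP = {
    "Id": "source_customer_id",
    "Email": "email",
    "LastName": "last_name",
}


def to_layout_row(source_row, field_map):
    """Rename mapped source columns; unmapped columns are dropped and
    source columns that are absent become empty strings."""
    return {target: source_row.get(source, "")
            for source, target in field_map.items()}


def build_sample(source_rows, field_map, limit=10):
    """Return delimited text holding at most `limit` converted records."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(field_map.values()),
                            lineterminator="\n")
    writer.writeheader()
    for row in source_rows[:limit]:
        writer.writerow(to_layout_row(row, field_map))
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [
        {"Id": "C1", "Email": "a@example.com", "LastName": "Ng"},
        {"Id": "C2", "LastName": "Diaz"},  # no email: tests an optional field
    ]
    print(build_sample(rows, FIELD_MAP))
```

When choosing the sample records, include rows like the second one that leave optional fields blank, so the baseline validation covers more than the fully populated case.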

Customer-specific lookup tables and associated values will need to be updated before data can be ingested for a feed layout that requires customer-specific lookups.

Additional details related to testing feed layouts and loading data will be provided in other pages in this onboarding section.

The following diagram outlines the steps that we have discussed up to this point.

Data Ingestion Basics - High Level Diagram

Data archival is completed after the transformation process has finished. We archive the original file, and if the original file is encrypted, the archive is encrypted as well. We also archive the match inputs and match outputs, which follow the encryption status of the original file. These three files are archived in the cloud storage location configured for archival. Any records or files that do not pass validation are returned to the /output folder of their original location and also follow the encryption status of the original file.
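
The routing rule in the paragraph above can be summarized as a small decision function. This is only a sketch of the described behavior, not the platform's implementation: the `Disposition` type, the `plan_disposition` function, the example paths, and the reading of "the /output folder of their original location" as an `output` folder beside the original file are all assumptions.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Disposition:
    """Where a file ends up and whether it is stored encrypted."""
    destination: str
    encrypted: bool


def plan_disposition(original_path, encrypted, passed_validation, archive_root):
    """Decide where a processed file goes.

    Files that pass validation go to the configured archive location;
    files that fail go to an `output` folder next to the original file.
    In both cases the result keeps the original file's encryption status.
    """
    original = PurePosixPath(original_path)
    if passed_validation:
        destination = PurePosixPath(archive_root) / original.name
    else:
        destination = original.parent / "output" / original.name
    return Disposition(str(destination), encrypted)
```

For example, `plan_disposition("/inbound/crm/customers.csv", False, False, "/archive")` plans a return to `/inbound/crm/output/customers.csv`, unencrypted because the original was unencrypted.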