Glossary
Industry terms
The tables below describe terms that are relevant to Redpoint’s industry and the industries of our clients.
General
The table below describes general terms that are relevant to Redpoint’s industry and the industries of our clients.
Term | Definition |
---|---|
Ad suppression | An approach to marketing that intentionally limits or prevents advertisements from displaying to certain users or in specific contexts. Examples:
|
Anonymous visitor | A user who visits a website, but doesn’t sign up to receive information or purchase anything from the website. Anonymous visitors often account for more than 95% of website traffic. |
Append | In marketing technology, refers to the process of enriching or supplementing existing customer data by adding new information from another source. It's essentially like adding missing pieces to a puzzle to create a more complete picture of your customers. |
CBOR (Concise Binary Object Representation) | A binary data serialization format that’s loosely based on JSON. Like JSON, it allows the transmission of data objects that contain name–value pairs, but more concisely. This increases processing and transfer speeds at the cost of human readability. |
CCPA (California Consumer Privacy Act) | A statute that enhances privacy rights and consumer protection for residents of the state of California. The CCPA provides California residents with the right to:
|
CDN (Content Delivery Network) | A network of servers placed around the world that makes downloading files faster for the user by reducing data transit time. Servers in a CDN are, on average, much closer to the end user. |
Channel | In marketing, refers to a specific avenue or platform you use to communicate your marketing message and reach your target audience. |
Column headings | The first row of a spreadsheet often contains column headings. In many applications, the user can identify this row and specify that the headings be used as field names. |
Cookies | Small pieces of text that the browser stores on behalf of a website. Cookies have a name and a value. Every time you make a request to a website, your browser sends along the cookies that it has stored for that website. This is how the website recognizes that you are logged in, similar to showing a passport as a form of ID. When you first enter your username and password, the website generates a secure cookie and tells your browser to store it. From then on, your browser sends along the cookie as a way of identifying you to the website. |
Customer Data Platform (CDP) | While multiple definitions exist, a CDP is commonly understood as a software solution that collects and organizes customer data so that it is accessible in a single place. Specifically, this packaged software is designed to collect, integrate, manage, and store customer data from disparate internal and external systems, sources, and applications (for example, a CRM, ESP, or POS). Once organized, the data is used to create a unified view of the customer that can be used for insights, analytics, and reporting, and to activate or orchestrate relevant omnichannel experiences. The benefit of deploying a CDP is that it helps improve personalized interactions with customers (at a 1:1 level). Learn more about Redpoint’s CDP solution. |
Data dictionary | A reference document that provides definitions and details about the data elements used within a specific dataset or project. |
Data model | A visual representation of the solution’s database:
For simplification, data models are divided by logical subject areas, each represented by a separate ERD (entity–relationship diagram). By helping to define and structure data in the context of relevant business processes, data models support the development of effective information systems. They enable business and technical resources to collaboratively decide how data will be stored, accessed, shared, updated, and leveraged. |
Data structure | An organizational scheme, such as a record or array, that can be applied to data to facilitate interpreting the data or performing operations on it. |
DDL (Data Definition Language) | A standard for commands that define the different structures in a database. DDL statements create, modify, and remove database objects such as tables, indexes, partitions, and users. Common DDL statements are CREATE, ALTER, and DROP. |
Demographics | Statistics describing aspects of a population, such as age, sex, race, religion, income, and geographic location. |
Direct mail | Marketing strategy that sends offers and advertising through print mail. |
Field size | The maximum length of a data field. |
Field type | The kind of data a field can contain. Examples:
|
Field value | The data contained in one field of a record. If no data is present, the field is considered blank. |
FIPS (Federal Information Processing Standards) | Publicly announced standards developed by the United States federal government for use by all non-military government agencies and by government contractors. FIPS state codes are numeric and two-letter alphabetic codes identifying U.S. states and certain other associated areas. The FIPS county code is a five-digit code uniquely identifying counties and county equivalents in the United States and possessions. The first two digits are the FIPS state code and the last three are the county code within the state or possession. County FIPS codes are usually in the same sequence as alphabetized county names within the state. They’re usually odd numbers, so that new or changed county names can be easily accommodated. |
GDPR (General Data Protection Regulation) | A data protection and privacy law in the European Union that gives individuals control over their personal data. |
IDFA (Identifier for advertisers) | A random device identifier used by Apple that tracks and identifies a user without revealing personally identifiable information (PII). |
JDBC (Java Database Connectivity) | A Java API that defines how a client may access a database. |
JWT (JSON Web Token) | A compact, URL-safe means of representing claims to be transferred between two parties. |
Merge/purge | To combine two files into one such that duplicates are recognized and eliminated. |
MSRP (Manufacturer’s Suggested Retail Price) | The price before shipping costs, taxes, and/or discounts have been applied. MSRP is sometimes referred to as the base price. |
OAuth (Open Authorization) | An open standard for access delegation, commonly used to grant websites or applications access to information on other websites. |
Parent-child relationship |
|
Personally identifiable information (PII) | Any data that can be used to identify someone. All information that directly or indirectly links to a person is considered PII. |
RFM (Recency, Frequency, Monetary) | A marketing assessment of the quality of a customer, used to evaluate sales potential.
|
Sort | Arrange records in a file by alphabetical or numerical sequence. |
SQL (Structured Query Language) | Commonly pronounced "sequel", it’s the standard database language used in querying, updating, and managing data in relational databases. |
Suppression file | A file containing a list of contactable entities (addresses, phone numbers, emails, etc.) that you shouldn’t contact. |
UTM (Urchin Tracking Module) parameters | Metadata (URL parameters) at the end of a link to allow for better attribution, assessment of campaign performance, and personalization. |
WSDL (Web Services Description Language) | An XML-based interface definition language for describing a web service in terms of the messages it sends and receives. A WSDL 2.0 service description indicates how potential clients are intended to interact with the described service. |
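
As an illustration of the FIPS entry in the table above, the following minimal sketch splits a five-digit FIPS county code into its state and county parts. The function name `split_fips_county_code` is hypothetical and is not part of any Redpoint product.

```python
def split_fips_county_code(code: str) -> tuple[str, str]:
    """Split a five-digit FIPS county code into (state_code, county_code).

    Hypothetical helper for illustration only.
    """
    if len(code) != 5 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected exactly five digits, got {code!r}")
    # The first two digits are the FIPS state code; the last three are
    # the county code within that state or possession.
    return code[:2], code[2:]


state_code, county_code = split_fips_county_code("06037")
print(state_code, county_code)  # prints "06 037"
```

The codes are kept as strings so that leading zeros are preserved.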
Third-party companies and products
The table below describes third-party companies and products used alongside Redpoint products.
Company/product | Description |
---|---|
Acxiom | A third-party company that collects, analyzes, and sells information about customers and businesses for use with targeted advertising campaigns. |
Adobe Magento | Magento is an open-source e-commerce platform written in PHP. Magento source code is distributed under the Open Software License. |
Amazon Kinesis Data Firehose | A fully managed service for delivering real-time streaming data to Amazon S3. |
Amazon Marketing Cloud | A secure, privacy-safe, and cloud-based clean room solution. Advertisers can perform analytics and build audiences across pseudonymized signals, including Amazon Ads signals as well as their own inputs. |
Amazon Pinpoint | A flexible and scalable communications service for inbound and outbound marketing. Use this product to connect with customers over a variety of channels, such as email, SMS, push, voice, and in-app messaging. |
Amazon Redshift | Amazon's hosted data warehouse product. |
Amazon S3 | Amazon Simple Storage Service (S3) is a hosted cloud data storage service accessed via web services interfaces. The Amazon S3 basic storage unit is an object. Objects are organized into buckets. |
Apache Avro | Avro is a data serialization and data exchange framework. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. |
Apache Parquet | A columnar storage format used by many query engines for analytics workloads. Parquet features per-column compression and encoding schemes that offer significant performance benefits compared to a traditional row-oriented format. |
Aurora | Aurora is a hosted MySQL- and PostgreSQL-compatible relational database service offered by Amazon. |
AWS (Amazon Web Services) | An Amazon subsidiary that provides on-demand cloud computing platforms. AWS hosts numerous products and services, including the Aurora relational database, the Redshift data warehouse product, and the Simple Storage Service (S3). |
Azure | Microsoft Azure is a cloud computing platform. It offers access, management, and the development of applications and services through global data centers. |
Azure Blob Storage | An object storage solution for the cloud that is optimized for storing massive amounts of unstructured data. |
Azure Data Factory | A data integration service in Azure that can convert data from one supported format into another, such as converting Apache Parquet to CSV. |
Braze | A leading marketing automation platform that allows users to create custom experiences based on sophisticated customer attributes and segments, then map those experiences to campaigns. |
Databricks | Databricks, Inc. is a global data, analytics and artificial intelligence company founded by the original creators of Apache Spark. The company provides a cloud-based platform to help enterprises build, scale, and govern data and AI, including generative AI and other machine learning models. |
Google Ads | Search-based advertising that can be run across the Google advertising network and is shown to web users. Use search-based advertising to promote your brand, help sell products or services, raise awareness, and increase traffic to your website or stores. |
Google Pub/Sub | A low-latency messaging service that can be configured within Google Cloud to stream data (including real-time) to Google Cloud Storage. |
Listrak | Enables personalized cross-channel interactions that help automate campaigns, build customer loyalty, and increase conversion rates. |
LiveRamp | Allows clients to combine customer data from various online and offline sources, centering around the use of web cookies that allow websites to remember visitors. |
Loqate | A real-time address validation and verification solution that captures and validates postal addresses and address changes, and provides insight about a specific location. |
Mailchimp | A cloud-based marketing automation platform and an email marketing service that provides an API for integrating with third-party systems and a web UI for managing email contacts, templates, and lists. |
Meta Ads Manager | A unified ad creation tool that your brand can use to create and publish ads to Facebook, Messenger, Instagram, and the Meta Audience Network. |
Microsoft Advertising | A pay-per-click advertising platform that displays ads based on keywords in a user’s search query. |
Reddit | A social media platform home to unique communities, engaged conversations, and more. Reddit Ads allows brands to find their community on Reddit, and then engage with their customers within the 100K+ active communities on Reddit using targeted ads and promoted posts. |
Salesforce | A cloud-based software company that provides customer relationship management software and applications focused on sales, customer service, marketing automation, e-commerce, analytics, and application development, including:
|
Snappy | A compression library that aims for high speeds and reasonable compression instead of maximum compression. Files that are compressed with Snappy tend to be larger, but compressing and decompressing them is significantly faster. |
TikTok | The world’s leading destination for short-form mobile videos. TikTok’s mission is to capture and present the world’s creativity, knowledge, and moments that matter in everyday life. |
X12 | An Electronic Data Interchange (EDI) standard developed by the Accredited Standards Committee (ASC) of the American National Standards Institute (ANSI). It is commonly used in the USA, while most of the rest of the world uses the EDIFACT (United Nations Electronic Data Interchange for Administration, Commerce and Transport) transaction sets. |
YouTube | An online video platform on which your brand can run in-stream, bumper, video, and discovery ads to build interest, brand awareness, and inspire your customers to take action. |
Mailing terms
This table describes terms associated with mailing and personal addresses.
Term | Definition |
---|---|
Address standardization | The process of taking an address and verifying that each component meets United States Postal Service guidelines for addresses. For example, "123 Main Avenue" should be abbreviated as "123 MAIN AVE". During standardization, minor misspellings, dropped components, and abbreviations are all corrected. The correct city, state, and ZIP code are also provided. |
Bulk mail | Second-class, third-class, and fourth-class mail, serviced on a non-preferential basis by the United States Postal Service. |
Carrier Route | A 4-byte code assigned to a United States Postal Service mail delivery or collection route within a 5-digit ZIP Code. The first character of this identification is alphabetical, and the last three are numeric. The alphabetical character has the following meanings:
|
CASS (Coding Accuracy Support System) | A service offered by USPS to mailers, service bureaus, and software vendors that improves the accuracy of DPV codes, ZIP+4 codes, 5-digit ZIP Codes, and carrier route information on mail. CASS Certified mailings qualify for substantial postage discounts. |
DPV (Delivery Point Validation) | A United States Postal Service data product that checks whether a ZIP+4 coded address is a known and deliverable address record. |
DPV false positive addresses | The United States Postal Service includes false positive addresses in their DPV directories as a security measure to prevent DPV abuse. |
eLOT (Extended Line of Travel) | A United States Postal Service data product used to sort mailings in approximate carrier-casing sequence. eLOT contains a sequence number field and an ascending/descending code. The sequence number indicates the first occurrence of delivery made to the add-on range within the carrier route, and the ascending/descending code indicates the approximate delivery order within the sequence number. eLOT processing may be used by mailers to qualify for enhanced carrier route presort discounts. |
Finance number | A code assigned to United States Postal Service (USPS) facilities to collect cost and statistical data and compile revenue and expense data. The state number comprises the first two positions of the finance number. The finance number can be used to match to records in other USPS files. By sorting these files by finance number, sequence matches can be made to use other street-level address information. |
LACS (Locatable Address Conversion Service) status indicator | Records that have been converted to the LACS system, a United States Postal Service product that allows mailers to identify and convert a rural route address to a city-style address.
|
PMB (Private Mail Box) | A non-USPS mailbox. Distinct from a PO Box, a term that the United States Postal Service reserves for the boxes located at USPS post offices. |
Urbanization | An area, sector, or development within a city (Puerto Rico only). |
ZIP+4 | United States Postal Service nine-digit code for a particular block, building, apartment, or business location. An average ZIP+4 area contains 10-15 households. |
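
The Carrier Route format described in the table above (one alphabetical character followed by three numeric characters) can be checked with a short sketch like the one below. The function name is hypothetical, and the sketch deliberately does not interpret the meaning of the leading letter.

```python
import re

# One letter followed by three digits, as described in the Carrier Route entry.
CARRIER_ROUTE_PATTERN = re.compile(r"[A-Z][0-9]{3}")


def parse_carrier_route(code: str) -> tuple[str, int]:
    """Return (route_type_letter, route_number) for a 4-character code.

    Hypothetical helper for illustration only.
    """
    normalized = code.strip().upper()
    if CARRIER_ROUTE_PATTERN.fullmatch(normalized) is None:
        raise ValueError(f"not a valid carrier route code: {code!r}")
    return normalized[0], int(normalized[1:])


print(parse_carrier_route("C012"))  # prints "('C', 12)"
```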
Redpoint product terms
The tables below describe terms that are specific to Redpoint products.
All products
The table below describes terms used across multiple Redpoint products.
Term | Definition |
---|---|
Audience | Composed of a series of segments, audiences wrap up the logic involved in determining to whom messages are to be delivered when executing your interactions. |
Campaigns | Activation workflows executed in Redpoint Orchestration that utilize data for targeted marketing. These campaigns consist of triggers, audiences and segments, resolution, and suppression rules. Examples include an offer for 25% off or a birthday email. |
Cap value | Allows the user to limit the number of records pulled in a given split, either by volume (a set number, like 500) or by percentage (for example, only 10% of the total selection for that split). Typically, this is done in the context of A/B testing or where there is limited inventory (for example, if only 500 coupon codes are available, the selection can be limited to 500 people). |
Golden Record | The Golden Record is an aggregate of information related to an individual. Records containing personally identifiable information (PII) are assigned a unique identifier, Individual ID, through a matching procedure. Consequently, multiple records may correlate to one person (Individual), reflecting various PII details, such as an individual having two distinct email addresses linked to them. The Golden Record consolidates data from multiple sources into a singular, authoritative record for an individual or individual and business unit. This master record contains the "Best Data" selected based on predefined rules. For instance, if a person has two email addresses, only one is chosen for the Golden Record, potentially using criteria such as the most recent email with an opt-in status. The Golden Record links various aggregates in the system to get a 360-degree view of the customer data and actions for the individual. |
Split | A way to uniquely identify and edit down a subset of your audience, whether for downstream reporting/analytics, different treatment, or both. |
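
The two ways of capping a split described in the Cap value entry above (by volume or by percentage) can be sketched as follows. This is an illustration only, not how Redpoint products implement caps: the function name `apply_cap`, keeping the first records in order, and rounding percentages down are all assumptions.

```python
import math
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def apply_cap(
    records: Sequence[T],
    volume: Optional[int] = None,
    percent: Optional[float] = None,
) -> list[T]:
    """Limit a split's records by a fixed volume or by a percentage.

    Hypothetical helper: exactly one of `volume` or `percent` must be given.
    """
    if (volume is None) == (percent is None):
        raise ValueError("specify exactly one of volume or percent")
    if volume is not None:
        limit = volume
    else:
        # Assumption: a percentage cap rounds down to a whole record count.
        limit = math.floor(len(records) * percent / 100)
    # Assumption: the first `limit` records are kept, in their existing order.
    return list(records[: max(limit, 0)])


selection = list(range(2000))
print(len(apply_cap(selection, volume=500)))  # prints "500"
print(len(apply_cap(selection, percent=10)))  # prints "200"
```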
Redpoint CDP
The table below describes terms from the Redpoint CDP product.
Term | Definition |
---|---|
Aggregates | The calculations specific to core and each vertical that define the data that resides in the summary tables and is available for segmentation; the aggregates all have specific data elements passed in feed layouts required for processing. Examples:
|
Base tables | The main tables in the database that contain the base values used for aggregate processing. |
Core data model | Contains objects grouped in logical subject areas that are not specific to any industry vertical and can be deployed on their own, such as Identity Resolution, Customer, Contact Authorization, Campaign & Response Event data, etc. Learn more about the Data model. |
Data Inventory | A comprehensive list of the data required to construct the CDP.
|
Database quality metrics |
|
Extension Tables | Tables where additional data can be ingested. This data can be exposed in orchestration but is not part of the aggregate processing without customization. |
Feed | A logical grouping of data in an accepted format (e.g., file, database table, queue, etc.). Feeds are subsets of sources particular to a data grouping. A feed is the atomic unit of Redpoint CDP, the smallest unit of data that the CDP handles. Think of a feed as:
Examples include:
|
Feed frequency | A value assigned to a feed that sets the time interval between feed run starts. If a feed starts execution after its feed frequency has expired, the feed is stale.
|
Feed layout | A standard file layout that is used for data ingestion. Each feed layout is designed to capture data related to the specific business subject area, such as customer profile, loyalty account, retail product, retail transaction, etc. A feed layout contains a list of standard data elements that should be consumed as well as the mapping of each data element to the table and column within the database along with the transformation rules, validation rules, data population rules, and valid values.
Refer to Data ingestion basics and Data ingestion details for more information. |
Feed run quality metrics |
|
Feed state | Explicitly set by a user. Possible values are:
|
Feed status | A process-generated value based on the last instance of a feed run. Possible values are:
|
Feed staleness | A feed is stale if the next feed run has not started within the specified frequency. This is calculated on the fly based upon the There are two causes for a stale feed:
|
Format validity: Address | Redpoint CDP:
For US addresses, Redpoint CDP provides a reason why an exact match was not found (for example, a missing suite number). |
Format validity: Email | For Redpoint CDP, the valid email format is Redpoint CDP:
The overall length of the email address may not exceed 320 bytes. |
Format validity: Phone | Redpoint CDP uses a third-party phone format validator to determine if the number is valid. Note that the validator does not determine if the number is in service. The format is based on E.164. |
Ignore | This term refers to records that have custom rules to "ignore" them during data processing. Such records may be perfectly valid but the customer does not want them loaded. For example, an ignore rule could be attached to a set of records such that if the DOB field indicates that a customer is under 18, the associated record will not be loaded. |
Industry vertical data model | Contains objects grouped in logical subject areas that are specific to an industry vertical.
Learn more about the Data model. |
Lookup tables | Tables in the CDP that contain all of the lookup values. Examples:
|
Lookup values | The values that you need to provide in your feeds so that they are consistent and can be used in aggregate calculations. If a value that you provide in a feed is not found in the lookup tables, the record will be rejected. |
Match candidates | The matching process is predicated on initially finding match candidates (possible matches) for the incoming records. Frequently this number will be significantly higher than the number of input records depending on the quality and consistency of the input data. For example, an incoming record with just a first initial, last name, and email address may have more match candidates than a record with a full first and last name, street address, and email address. |
Match field | Indicates that the attribute provided will be used in the Identity Resolution process. |
Matching: Data quality address scores | Review address hygiene valid format for an explanation of the term.
|
Matching: Data quality email score | Review valid email format for an explanation of the term.
|
Matching: Data quality phone score | Review valid phone format for an explanation of the term.
|
Matching: Data quality score |
|
Matching: Data quality scores | The process of assessing, monitoring, and improving the fitness of an organization's data for business use. Fitness is generally a measure of the accuracy, completeness, accessibility, and timeliness of data. Data quality includes capabilities for processing data (integration, parsing, cleansing, matching, and so on) and for assessing data (monitoring, profiling, reporting). Data Quality is also closely related to data governance (metadata, security/privacy/compliance, performance, and so on). Data quality is used by an organization to understand and improve outcomes that depend on high-quality data, such as customer experience or supply-chain management. |
Priority | When you define a source, you set the source's display priority, 1 (highest priority) to 999 (lowest priority). When source names are displayed by priority (such as on the CDP Summary chart) they are listed by priority number (highest priority number to lowest priority number). |
Quarantined | A record had one or more validation issues and has been removed from the process and written to a file in the quarantine directory. If the incoming feed has duplicates based on the defined primary key, the "extra" records are removed from the process, counted, and reported via the UI. Records that are removed from the process based on a business rule are ignored. |
Reference data | Used to ingest lookup values that are specific to the client (Business Unit, Brand, Payment Type, Sources, etc.). These values will be included in SAF validation rules, loaded into appropriate lookup tables, and might require business rule adjustments. |
Segment | The most basic building block for targeting customers. |
Source | A way to group and categorize feeds (i.e., a source system); it is comprised of one or more feeds. A source could consist of a single feed or a hundred feeds. Examples:
|
Source status | This value is an indicator of how well your source is performing. The status of an individual source is based upon the statuses of its included feeds.
|
Source to feed layout | The process during which all required data attributes are mapped from their source to a relevant feed layout. This is the first deliverable in the onboarding process. For example:
Refer to Data ingestion basics for more information about this process. |
Summary tables | The tables where the data summaries (aggregations) reside. |
Suppression | Allows you to edit your audience down to your essential customers. A suppression is a predefined segment added to the audience during audience definition, but in this case the segment rules exclude customers from the audience. |
Table metadata | Information about table schemas and names, column names, data types, nullability, primary, foreign, and alternate keys, referential integrity constraints, etc. Produced by a data architect using the Erwin data modeling tool. |
Tables and table detail count metrics | Calculated from all the feed runs using the following time values and table change counts:
|
Target | Used in segmentation and activation screens, with regard to the recipient(s) of a given offer or records being targeted in an AdTech campaign, for example. When building a segment, you define the target as an Individual, a Household, a Phone Number, etc.—this gives you the granular control needed to do something like only send one catalogue per household (using Household targets), or one SMS to a phone number (using Phone Number target, since an individual could have multiple phones). |
Valid ranges | Similar to valid values, but instead of a given number or percentage, they reference a range of values. For example, for the Credit Score field the range is 300-850 and any value outside that range indicates a problem. |
Valid values | The database field values that can be used for a particular column (given in the lookup table). For example, a field like Business Unit would have valid values to prevent bad data from being loaded since this is a key field. Using valid values reduces data entry and keeps things consistent. |
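
The staleness rule given in the Feed frequency and Feed staleness entries above can be sketched roughly as shown below. This is a simplified illustration under stated assumptions, not the CDP's actual calculation: the function name `is_feed_stale` and its parameters are hypothetical, and staleness is measured from the start of the most recent feed run.

```python
from datetime import datetime, timedelta


def is_feed_stale(
    last_run_start: datetime,
    frequency: timedelta,
    now: datetime,
) -> bool:
    """Return True if no new run has started within the feed frequency.

    Hypothetical helper: a feed is treated as stale when more time than
    `frequency` has passed since the last feed run started.
    """
    return now - last_run_start > frequency


last_start = datetime(2024, 1, 1, 0, 0)
daily = timedelta(days=1)

print(is_feed_stale(last_start, daily, datetime(2024, 1, 1, 12, 0)))  # prints "False"
print(is_feed_stale(last_start, daily, datetime(2024, 1, 2, 6, 0)))   # prints "True"
```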
Redpoint Data Management (RPDM)
The table below describes terms from the Redpoint Data Management product.
Term | Definition |
---|---|
Automations | Data Management "meta-projects" that use stepwise execution to integrate Data Management projects with other steps such as executing external programs, waiting for user review, transferring files via FTP, and determining the format of a file. |
DM repository | A hierarchical storage mechanism for managing Data Management objects, including Data Management projects, macros, formulas, schemas, data files, and connections. It is similar to Windows Explorer or other file systems. |
Site server | Designed for centralized management of Data Management-based processing. It hosts the repository and supports user/group object security, version control, scheduling and unattended operation, scalability, reporting, and logging. The Site Server monitors the Execution Server on each computer within the site or cluster, querying it for load, and choosing appropriate computers on which to spawn project servers. The Server's object-sharing capabilities support reuse of projects and formulas among a group of Data Management users. |
Tools | Each icon on the Data Management Palette represents a tool that performs a specialized operation. By configuring and connecting these tools, you create a custom data processing "engine."
|
Redpoint Interaction (RPI)
The table below describes terms from the Redpoint Interaction product.
Term | Definition |
---|---|
Alignment | The alignment of data within a field. Alignment can be left, right, center, or on a decimal point. |
Assets | Text, HTML, image, and web form content. Use assets to build offers, which deliver the content specified within the assets to your intended audience. |
Cell list | Defines a series of cells and is used to configure a cell list block in an audience. Cell lists are stored as files in the RPI file system, the same way as any other file type. A cell list block can be used to split an audience’s output into a series of discrete cells, each of which may then be handled separately in an interaction workflow. Each cell list block is based on the cell structure defined by its cell list. In this way, the same cell list can be used to configure many audiences. |
Dashboards | Allow you to collate a series of widgets, and display them within a single tab in the RPI user interface. |
Export template | Allows you to define the structure of data exported from RPI. This might occur during execution of an interaction workflow, or in the Rule Designer. Using the Export Template Designer gives you considerable flexibility in the manner in which export files are structured. |
Intelligent Orchestration | The consistent delivery of relevant, contextually aware, and hyper-personalized next-best-actions across all customer journey stages and all enterprise touchpoints. |
Interactions | Allow you to deliver targeted messages to an audience selected from the records within your data warehouse and maintain an ongoing dialogue with that audience through time and across multiple channels. |
Jobs | Certain actions undertaken in RPI result in the creation and execution of jobs on the RPI server. Examples include running a selection rule’s count, downloading an export file, and synchronizing the catalog with the data warehouse. |
Landing pages | Hosted in a website for the purpose of capturing data from both anonymous visitors and those directed to the site via outbound RPI communications. |
Realtime Decisions | Used to decide which content is displayed in a smart asset. They determine the applicability of content elements in a smart asset: when the criteria within a content element’s Realtime decision are met, the visitor is served the associated content. Realtime decisions use a memory cache that persists a visitor profile for each known and unknown website visitor. The data stored in the cache can be a combination of properties determined from the visitor’s browser session, including goals achieved at landing pages, information advised by Facebook, knowledge gleaned through traversing a customized link in an RPI outbound offer, and attribute values retrieved from the data warehouse or an auxiliary database when the user chooses to identify themselves (e.g., by submitting an email address in a web form). |
Realtime Layouts | Allow RPI users to create a visual representation of a web page (a layout), and then define elements within it, known as areas, such as “banner”, “sidebar”, and “carousel”. Each area can be associated with smart assets. Layouts and areas can be used to access these smart assets via the RPI Realtime API, serving content appropriate to a visitor’s specific requirements. |
Reporting Hub | Provides a single, consolidated context from which to undertake the following actions:
|
Selection Rules |
|
Single Customer View | Allows you to search for records from the data warehouse (typically customers) and then choose one of the records to view in detail. |
Smart Asset | Smart Assets intrinsically facilitate dynamic content personalization, unlike Asset files, in which content must be personalized by the inclusion of attributes or nested smart assets. |
Subscription Groups | Subscription groups allow subscribers to join a group. |
Webforms | Support various form elements that can be customized to capture different types of data. |
Developer terms
General
The table below describes helpful terms for developers or anyone reading developer content.
Term | Definition |
---|---|
Algorithm | A sequence of instructions that describe how to solve a particular problem. |
Alphanumeric | Consisting of letters or digits, or both, and sometimes including control characters, space characters, and other special characters. |
API (Application Programming Interface) | A way for two or more computer programs or components to communicate with each other. It is a type of software interface, offering a service to other pieces of software. |
Arithmetic operators | Symbols or other characters indicating operations that act on one or more elements. The +, -, *, /, and ( ) characters are operators used to construct arithmetic expressions. |
Ascending order | The arrangement of a sequence of items from lowest to highest, such as from 1 to 10 or from A to Z. The rules for determining ascending order in a particular application can sometimes be very complicated (for example, capital letters before lowercase letters, or extended ASCII characters in ASCII order). |
ASCII (American Standard Code for Information Interchange) | Standard code, and sorting sequence, for representing characters as binary numbers used in microcomputers. |
ASCII data | A document file in ASCII format, containing characters, spaces, punctuation, carriage returns, and sometimes tabs and an end-of-file marker, but no formatting information. ASCII data may be either delimited or fixed. |
Binary | Number system that uses only the digits "0" and "1". |
Binary data | Fixed record length data containing arbitrary bytes or words, as opposed to a text file containing only printable characters (for example, ASCII characters with codes 10, 13, and 32-126). |
Bit | Smallest unit of binary data; can be "on" or "off" ("1" or "0"). |
BOM (Byte-order mark) | A Unicode character prepended to a text stream to signal byte order and the presence of Unicode characters. |
Boolean | A data type having only two possible values: True or False (Yes/No, 0/1). |
BSON (Binary JSON) | A binary form developed by MongoDB for representing JSON-like documents. Like JSON, BSON supports the embedding of documents and arrays within other documents and arrays. BSON also contains extensions that allow representation of data types that are not part of the JSON spec. |
Byte order (or endianness) | Refers to the convention used to interpret the bytes making up a data word when those bytes are stored in computer memory.
|
Code pages | Tables of values that describe the character set for a particular language. Check out the table that lists the code pages supported by International Components for Unicode (ICU). |
Cardinality | The number of unique elements in a dataset. A higher cardinality indicates a larger percentage of unique values, whereas a lower cardinality indicates a higher percentage of repeat values. |
Concatenate | To join sequentially (for example, to combine the two strings "good" and "morning" into the single string "good morning"). |
Constant | A specific, unchanging value. |
CSV (comma-separated values) | A text file format that stores data in a table-like structure. Each row of the file is a data record, and the columns represent the fields. |
Data type | The kind of data a field can contain, for example: Text, Integer, Boolean, and Date. |
Database | An organized collection of information, stored as fields and records in tables and/or files. |
De-dupe or deduplication | To eliminate duplicate records within one or more files. |
Delimited ASCII data | Variable length ASCII data in which fields are separated by a special character (usually a comma or tab). Field entries are often surrounded by double quotation marks (" "), and records separated by a carriage return-line feed. |
Delimiter | A special character that separates individual items in a set of data. In the following example, commas separate the fields in a database record (each non-numeric field is enclosed by double quotation marks). "Armstrong", "123 Pine Street", "Toledo", "OH", 12345. |
Descending order | The arrangement of a sequence of items from highest to lowest, for example with Z preceding A and higher numbers preceding lower ones. |
ETL (Extract, Transform, Load) | A process that extracts data from outside sources, transforms it to fit operational requirements, and loads it into the end target (usually a database). |
Enhancement | The use of computer data to upgrade information contained in a customer or prospect list. Often referred to as "appending" data. |
Fixed ASCII file | An ASCII data file that has fixed field and record sizes, but no delimiters except possibly a record separator. |
Fixed-length field | Data file format in which each field is allocated to a fixed number of bytes, regardless of its actual length. |
Fixed-length record | Data file format in which each record is allocated to a fixed number of bytes, regardless of its actual length. All records are the same length, and there is neither a size prefix before records, nor a newline terminator after records. |
Flat file | A single table of data stored in a plain text format. |
Hash | A number generated from a string of text. Sometimes called a message digest, hashes are frequently used to ensure the security of transmitted data or messages. |
Header | An information structure that appears at the beginning of a data file and identifies the information that follows. |
HMAC (keyed-Hash Message Authentication Code) | This is a specific type of message authentication code that uses a cryptographic hash function in combination with a secret cryptographic key. |
Joda-Time | The de facto standard date and time library for Java prior to Java SE 8. Users are now asked to migrate to java.time (JSR-310). |
JSON (JavaScript Object Notation) | A language-independent data format that is derived from (and structured similarly to) JavaScript. |
Leading blanks or zeros | Blanks or zeros that precede the left-most digit of a number. One or more leading blanks or zeros may be used as fill characters in a numeric field. |
Mainframe formatted sequential file | A binary image of a mainframe file with variable length records. Learn more about mainframe formatted sequential files. |
MB (Megabyte) | While a megabyte is technically 1,000,000 bytes, Data Management uses the mebibyte convention, with megabyte containing 1,048,576 bytes (2^20 or 1,024 x 1,024 bytes). |
Mean | The arithmetic average for a group of items is the sum of the values of the items divided by the number of items. It is frequently used as a measure of location for a frequency or probability distribution. |
Median | The value of the middle item in a group when all the items are arranged in either ascending or descending order of magnitude. It is frequently used as a measure of location for a frequency or probability distribution. |
Merge | To combine two information files into one in a logical fashion (that is, according to certain sequencing requirements). |
Newline (also line ending, end of line, or line break) | A special character or sequence of characters signifying the end of a line of text. The character codes representing a newline vary across operating systems. |
Nth name selection | A fractional select unit that is repeated in sampling a mailing list. For example, "every 10th" would be a selection of records #1, #11, #21, and so on. |
Operator | A symbol or other character indicating an operation to be performed on a value or values. For example, the + operator represents addition, and the * operator represents multiplication. |
ODBC (Open Database Connectivity) | A driver-based system to define how any client may access any database. |
Record | A collection of database fields, each with its own name and type. |
Record layout | The organization of data fields within a record. |
Record number | A unique number identifying each record in a database or table. |
Redefines | This clause defines alternate data entities for the same data location in COBOL. |
Regular expression | A string of characters that defines a set of rules for matching character strings found in fields. |
Relational database | A database or database management system (RDBMS) that stores information in tables—rows and columns of data—and conducts searches by using data in specified columns of one table to find additional data in another table. In a relational database, the rows of a table represent records (collections of information about separate items) and the columns represent fields (particular attributes of a record). This method requires less storage space than a sequential or "flat" database, but can be slow to access. |
Schema | Formal description of the structure and organization of data within a database system. |
SDK (Software Development Kit) | A collection of libraries and related development tools, used mostly in the context of building mobile or native apps. |
SFTP (SSH File Transfer Protocol, also called Secure File Transfer Protocol) | A network protocol that provides file access, file transfer, and file management over any reliable data stream. |
SHA (Secure Hash Algorithm) | A family of cryptographic hash functions published by the National Institute of Standards and Technology (NIST). Data Management functions use two Secure Hash Algorithms:
|
String | A data structure composed of a sequence of characters usually representing human-readable text. |
SOM (Self-organizing feature map) | A self-organized projection of high-dimensional data onto a typically 2-dimensional (2-D) feature map, wherein vector similarity is implicitly translated into topological closeness in the 2-D projection. This produces a regular grid that can be used to visualize and explore properties of the data. |
Syntax error | An error resulting from an incorrectly expressed statement. |
Table | A data structure characterized by rows and columns, with data occupying each cell formed by a row-column intersection. |
Trailing blanks or zeros | Fields in a data file may be larger in size than the value in that field so the value does not fill the entire field. There may be blanks or zeros occupying the bytes not occupied by the actual data. |
Unix time (also Epoch time) | A system for describing instants in time as the number of seconds before (negative values) or after (positive values) the baseline date/time of 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970. |
UUID (Universally Unique Identifier) | A 128-bit number that uniquely identifies information in a computer system. It is represented by 32 hexadecimal digits and four hyphens, displayed in five groups in the form 8-4-4-4-12. For example: 123e4567-e89b-12d3-a456-426614174000. |
Variable-length field | A field that can vary in length according to how much data it contains. Fields are identified by sequence, rather than specific position. |
Variable-length record | A record that can vary in length because it contains variable-length fields, a variable number of fields, or both. There is neither a size prefix before records, nor a newline terminator after records. |
Virtual memory | A memory management capability of an operating system (OS) which uses hardware and software to allow a computer to compensate for physical memory shortages, by temporarily transferring data from random access memory (RAM) to disk storage. |
Windows time | A system for describing instants in time as the number of 100-nanosecond ticks since the baseline date/time of 00:00:00 Coordinated Universal Time (UTC), 1 January 1601. |
XML (eXtensible Markup Language) | A supported data format for customer data sources. |
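
The Unix time and Windows time entries above count from different baselines (1 January 1970 and 1 January 1601, both 00:00:00 UTC) and in different units (seconds versus 100-nanosecond ticks). The following is a minimal sketch of converting between the two, assuming whole-second precision. The function and constant names are illustrative only and are not part of any Redpoint API.

```python
from datetime import datetime, timezone

# Baselines taken from the glossary definitions above.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
WINDOWS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# One Windows tick is 100 nanoseconds, so there are 10 million per second.
TICKS_PER_SECOND = 10_000_000

# Whole seconds between the Windows baseline and the Unix baseline.
EPOCH_OFFSET_SECONDS = int((UNIX_EPOCH - WINDOWS_EPOCH).total_seconds())


def unix_to_windows_ticks(unix_seconds: int) -> int:
    """Convert whole Unix seconds to Windows 100-nanosecond ticks."""
    return (unix_seconds + EPOCH_OFFSET_SECONDS) * TICKS_PER_SECOND


def windows_ticks_to_unix(ticks: int) -> int:
    """Convert Windows ticks to whole Unix seconds, discarding sub-second ticks."""
    return ticks // TICKS_PER_SECOND - EPOCH_OFFSET_SECONDS
```

Under these assumptions, a Unix time of 0 corresponds to 116,444,736,000,000,000 Windows ticks, and a Windows time of 0 corresponds to a negative Unix time because it falls before the Unix baseline.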