Redpoint Data Management™ Documentation

Complex file formats

Overview

Many (mostly legacy) file formats represent multiple record types as a nested hierarchy in a single file. These file formats evolved to support Electronic Data Interchange (EDI) in both standard and custom forms. Data Management supports reading and writing many of these file formats, including delimited, fixed-width, pattern-based, and XML variations.

Common EDI standards include:

  • ASC X12

  • EDIFACT

  • TRADACOMS

  • HIPAA / HL7

Custom formats typically support data transmission for bill printing, financial reconciliation, inventory reporting, and more general database transmission.

Because a single complex-format file contains multiple logical record types, such files often carry an explicit record tag at the beginning of each line or segment. For example, the first column of a delimited file may contain a record type code that identifies the schema of the record on that line:

100,Some text for a header
101,Ralph Schmidt,123 Main St,Anytown New York
110,Credit,123.45
110,Debit,23.10
200,Summary note

In the sample above, the record type code in the first column (100, 101, 110, or 200) identifies the type of record on each line. The hypothetical file format consists of a Header section, followed by any number of Transaction records (code 110), followed by a single Trailer record (code 200). All of the EDI standards listed above follow this paradigm.
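
To illustrate how a record type code selects a schema, the following is a minimal Python sketch that is independent of Data Management. The field names in SCHEMAS are hypothetical; the sample format does not name its fields.

```python
import csv
import io

# Hypothetical field names per record type code (assumed for illustration only).
SCHEMAS = {
    "100": ["text"],
    "101": ["name", "street", "city_state"],
    "110": ["kind", "amount"],
    "200": ["text"],
}

def parse_delimited(text):
    """Yield (record_type, fields) for each line of a tagged delimited file."""
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        code, values = row[0], row[1:]
        names = SCHEMAS[code]  # raises KeyError for an unknown record type
        yield code, dict(zip(names, values))

SAMPLE = """100,Some text for a header
101,Ralph Schmidt,123 Main St,Anytown New York
110,Credit,123.45
110,Debit,23.10
200,Summary note
"""

records = list(parse_delimited(SAMPLE))
```

Each line is routed to its own schema by the first column, so the two 110 lines share one field layout while the 101 line uses another.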

Another variation is to use fixed-width formatting:

100 Some text for a header  
101 Ralph Schmidt      123 Main St    Anytown New York
110 Credit   123.45
110 Debit    23.10
200 Summary note
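
In a fixed-width file, the record tag selects a layout of column widths rather than a list of delimited fields. The sketch below, in Python and independent of Data Management, shows one way this could work. The widths and field names in LAYOUTS are assumptions chosen to resemble the sample, not a published specification.

```python
TAG_WIDTH = 4  # assumed: a 3-character code plus one space

# Hypothetical layouts: (field name, width) pairs per record type code.
LAYOUTS = {
    "100": [("text", 40)],
    "101": [("name", 19), ("street", 15), ("city_state", 20)],
    "110": [("kind", 9), ("amount", 10)],
    "200": [("text", 40)],
}

def format_fixed(code, record):
    """Render one record as a fixed-width line, padding or truncating each field."""
    body = "".join(
        str(record[name]).ljust(width)[:width] for name, width in LAYOUTS[code]
    )
    return (code.ljust(TAG_WIDTH) + body).rstrip()

def parse_fixed(line):
    """Split one fixed-width line into (record_type, fields) using its layout."""
    code = line[:TAG_WIDTH].strip()
    fields, pos = {}, TAG_WIDTH
    for name, width in LAYOUTS[code]:
        fields[name] = line[pos:pos + width].strip()  # short lines yield ""
        pos += width
    return code, fields

customer = {"name": "Ralph Schmidt", "street": "123 Main St", "city_state": "Anytown New York"}
line = format_fixed("101", customer)
parsed = parse_fixed(line)
```

Writing and reading share the same layout table, so a record that is formatted and then parsed comes back unchanged as long as no field exceeds its width.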

In some cases, there are no clear record type indicators. Instead, record types are inferred through relative position and patterns:

Some text for a header
    Ralph Schmidt
    123 Main St, Anytown New York
Credit   123.45
Debit    23.10
Summary note
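
When no tags are present, a reader has to apply rules about position and content. The Python sketch below shows one possible rule set for the sample above. The rules themselves (first line is the header, indented lines are detail, a Credit or Debit keyword followed by an amount is a transaction) are assumptions for illustration, not how Data Management defines its patterns.

```python
import re

# Assumed pattern: a keyword, whitespace, then an amount with two decimals.
TRANSACTION = re.compile(r"^(Credit|Debit)\s+(\d+\.\d{2})$")

def classify(lines):
    """Assign a record type to each line using relative position and patterns."""
    result = []
    for index, line in enumerate(lines):
        if index == 0:
            kind = "header"        # position: the first line of the file
        elif line.startswith("    "):
            kind = "detail"        # pattern: indented continuation line
        elif TRANSACTION.match(line):
            kind = "transaction"   # pattern: keyword followed by an amount
        else:
            kind = "trailer"       # any other line after the header
        result.append((kind, line.strip()))
    return result

LINES = [
    "Some text for a header",
    "    Ralph Schmidt",
    "    123 Main St, Anytown New York",
    "Credit   123.45",
    "Debit    23.10",
    "Summary note",
]

classified = classify(LINES)
```

The order of the checks matters: position is tested before content, so a first line that happened to look like a transaction would still be treated as the header.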

XML is similar to these EDI formats, as it can contain text-encoded nested record structures. Data Management supports conversion between all of these formats as well as direct analysis and manipulation of the discrete record types contained within the hierarchy.
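
As a rough illustration of how flat tagged records map onto a nested XML structure, the Python sketch below uses the standard library rather than Data Management. The element names and the nesting rule (each 110 record belongs to the most recent 101 record) are assumptions; the sample format does not define them.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names per record type code.
ELEMENT_NAMES = {"100": "Header", "101": "Customer", "110": "Transaction", "200": "Trailer"}

def records_to_xml(records):
    """Build an element tree from (code, fields) pairs."""
    root = ET.Element("File")
    customer = None
    for code, fields in records:
        # Assumed nesting: transactions attach to the most recent customer.
        parent = customer if code == "110" and customer is not None else root
        node = ET.SubElement(parent, ELEMENT_NAMES[code])
        for name, value in fields.items():
            ET.SubElement(node, name).text = value
        if code == "101":
            customer = node
    return root

RECORDS = [
    ("100", {"text": "Some text for a header"}),
    ("101", {"name": "Ralph Schmidt", "street": "123 Main St", "city_state": "Anytown New York"}),
    ("110", {"kind": "Credit", "amount": "123.45"}),
    ("110", {"kind": "Debit", "amount": "23.10"}),
    ("200", {"text": "Summary note"}),
]

root = records_to_xml(RECORDS)
xml_text = ET.tostring(root, encoding="unicode")
```

The hierarchy that the flat file implies through record order becomes explicit in the XML: both Transaction elements end up as children of the Customer element.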