Concepts and terms
This section introduces many of the concepts you'll encounter when working with the Redpoint Data Management SDK.
DataType
Data Management supports the following data types:
- Integer
- Float
- Decimal
- Boolean
- Variable Text
- Fixed-length Text
- Unicode
- Binary
- Date
- Time
- DateTime
- Spatial
- Document (version 8.2 and later)
A note about the Spatial data type
Currently, the SDK recognizes the RPDM Spatial data type, but SDK tools can only manipulate Spatial fields as byte[]. This necessarily limits SDK tools to copying values. The ability to work directly with Spatial fields will be added in a future release.
The SDK maps Data Management data types to Java types as follows.
| Data Management type | Java type |
|---|---|
| | Not yet implemented in version 8.0 |
FieldSpec
A FieldSpec describes a single Field in a record, containing its name and DataType.
Schema
A Schema defines the structure of a record by specifying the fields it can contain.
A Schema is created from a List<FieldSpec>.
Schemas are immutable; once created, they cannot be changed.
Given a Schema, you can get a List<FieldSpec> that describes the Schema.
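The relationship between FieldSpec and Schema described above can be sketched in plain Java. The classes below are hypothetical stand-ins written for illustration only; they are not the SDK's own FieldSpec, Schema, or DataType types, and the SDK's constructors and method names may differ. The sketch only shows what "created from a List<FieldSpec>" and "immutable" mean in practice.

```java
import java.util.ArrayList;
import java.util.List;

public class SchemaSketch {
    // Hypothetical stand-in for a few of the data types (not the SDK's DataType).
    enum DataType { INTEGER, FLOAT, VARIABLE_TEXT, DATE }

    // Hypothetical stand-in for FieldSpec: a field's name and data type.
    record FieldSpec(String name, DataType type) {}

    // Hypothetical stand-in for Schema: a fixed, ordered list of FieldSpecs.
    static final class Schema {
        private final List<FieldSpec> specs;

        Schema(List<FieldSpec> specs) {
            // Defensive, unmodifiable copy: later changes to the caller's list
            // cannot alter this schema.
            this.specs = List.copyOf(specs);
        }

        // Returns the (unmodifiable) list of FieldSpecs describing this schema.
        List<FieldSpec> fieldSpecs() {
            return specs;
        }
    }

    public static void main(String[] args) {
        List<FieldSpec> specs = new ArrayList<>();
        specs.add(new FieldSpec("id", DataType.INTEGER));
        specs.add(new FieldSpec("name", DataType.VARIABLE_TEXT));

        Schema schema = new Schema(specs);

        // Changing the original list after construction does not change the schema.
        specs.add(new FieldSpec("created", DataType.DATE));
        System.out.println(schema.fieldSpecs().size()); // prints 2

        // Attempting to modify the schema's own list fails.
        try {
            schema.fieldSpecs().add(new FieldSpec("extra", DataType.FLOAT));
        } catch (UnsupportedOperationException e) {
            System.out.println("schema is immutable");
        }
    }
}
```

To derive a modified schema in this sketch, you would copy `fieldSpecs()` into a new list, change the copy, and construct a new `Schema` from it.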
Field
A Field holds a value of a particular type and length, as specified by its FieldSpec.
Record
A Record is composed of Fields conforming to a Schema. Records are the mechanism by which information is exchanged between Tools of a project.
Records are constructed from Schemas. Your tool may build new Records using a supplied factory object, or it may be handed Records to populate. Once created, a Record's field values can be changed but its underlying Schema cannot be altered.
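As a rough illustration of the rule that a Record's values can change while its Schema cannot, here is a minimal sketch. `SketchRecord` is a hypothetical class invented for this example (the schema is reduced to a list of field names); it does not reproduce the SDK's Record type or its factory object.

```java
import java.util.List;

public class RecordSketch {
    // Hypothetical record: the field layout is fixed at construction,
    // while the values stored in each field remain changeable.
    static final class SketchRecord {
        private final List<String> fieldNames; // stands in for the Schema
        private final Object[] values;

        SketchRecord(List<String> fieldNames) {
            this.fieldNames = List.copyOf(fieldNames);
            this.values = new Object[this.fieldNames.size()];
        }

        void set(String name, Object value) {
            values[indexOf(name)] = value;
        }

        Object get(String name) {
            return values[indexOf(name)];
        }

        // Fields outside the schema are rejected rather than added.
        private int indexOf(String name) {
            int i = fieldNames.indexOf(name);
            if (i < 0) {
                throw new IllegalArgumentException("Field not in schema: " + name);
            }
            return i;
        }
    }

    public static void main(String[] args) {
        SketchRecord rec = new SketchRecord(List.of("id", "name"));
        rec.set("id", 1);
        rec.set("id", 2); // values may be overwritten
        System.out.println(rec.get("id")); // prints 2

        try {
            rec.set("email", "a@example.com"); // not part of the schema
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```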
RecordCopier
When building a TransformTool or GeneralTool, the output Schema is often either identical to an input Schema or based on an input Schema with some additions and modifications. This is because output Records are usually mostly copies of the input Records, with some fields appended or modified. Use a RecordCopier to easily and efficiently copy a set of fields from an input Record to an output Record.
RecordCopier is much faster than copying field values one at a time!
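One plausible reason a prebuilt copier can outperform field-by-field copying is that it resolves field positions once, up front, and then reuses them for every record. The sketch below illustrates that idea only. `Copier` is a hypothetical class, records are simplified to `Object[]` arrays, and schemas to lists of field names; none of this is the SDK's RecordCopier API or a description of its internals.

```java
import java.util.Arrays;
import java.util.List;

public class CopierSketch {
    // Hypothetical copier: looks up field positions once at construction,
    // then copies each record using only precomputed array indexes.
    static final class Copier {
        private final int[] from;
        private final int[] to;

        Copier(List<String> inputFields, List<String> outputFields, List<String> fieldsToCopy) {
            from = new int[fieldsToCopy.size()];
            to = new int[fieldsToCopy.size()];
            for (int i = 0; i < fieldsToCopy.size(); i++) {
                String name = fieldsToCopy.get(i);
                from[i] = inputFields.indexOf(name);
                to[i] = outputFields.indexOf(name);
                if (from[i] < 0 || to[i] < 0) {
                    throw new IllegalArgumentException("Unknown field: " + name);
                }
            }
        }

        // Per-record work: no name lookups, just indexed assignments.
        void copy(Object[] input, Object[] output) {
            for (int i = 0; i < from.length; i++) {
                output[to[i]] = input[from[i]];
            }
        }
    }

    public static void main(String[] args) {
        List<String> inputSchema = List.of("id", "name");
        // Output schema: the input fields plus one appended field.
        List<String> outputSchema = List.of("id", "name", "name_length");

        Copier copier = new Copier(inputSchema, outputSchema, inputSchema);

        Object[] input = {42, "Ada"};
        Object[] output = new Object[outputSchema.size()];
        copier.copy(input, output);
        output[2] = ((String) input[1]).length(); // fill the appended field

        System.out.println(Arrays.toString(output)); // prints [42, Ada, 3]
    }
}
```

The copier is built once per pair of schemas and reused for every record that flows through the tool, so the name lookups are paid only at setup time.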
Field values
A Field may have a valid value (for example, an Integer field may contain the value 1234). In addition, a Field may hold one of two special values:
- Null designates a missing value. This corresponds to the NULL value used by relational databases.
- Error designates an error value. This special value may be set to indicate conversion errors, calculation overflows, and so on. It is an alternative to throwing exceptions or logging errors.
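The three possible states of a field (valid, Null, Error) can be modeled with a small sketch. `IntField` and `State` below are hypothetical names made up for this example; the SDK's Field class has its own methods for setting and testing Null and Error, which are not shown here. The point is the pattern: a failed conversion marks the field as Error instead of throwing.

```java
public class FieldValueSketch {
    // The three states a field can be in.
    enum State { VALID, NULL, ERROR }

    // Hypothetical integer field that tracks its state alongside its value.
    static final class IntField {
        private State state = State.NULL; // starts out as a missing value
        private int value;

        void set(int v) { value = v; state = State.VALID; }
        void setNull() { state = State.NULL; }
        void setError() { state = State.ERROR; }
        State state() { return state; }

        int get() {
            if (state != State.VALID) {
                throw new IllegalStateException("Field has no valid value: " + state);
            }
            return value;
        }

        // Converts text to an integer. Bad or out-of-range input marks the
        // field as Error rather than throwing an exception to the caller.
        void setFromText(String text) {
            if (text == null) {
                setNull();
                return;
            }
            try {
                set(Integer.parseInt(text.trim()));
            } catch (NumberFormatException e) {
                setError();
            }
        }
    }

    public static void main(String[] args) {
        IntField field = new IntField();

        field.setFromText("1234");
        System.out.println(field.state() + " " + field.get()); // prints VALID 1234

        field.setFromText("12x4");        // conversion error
        System.out.println(field.state()); // prints ERROR

        field.setFromText("99999999999"); // overflows int
        System.out.println(field.state()); // prints ERROR

        field.setFromText(null);           // missing value
        System.out.println(field.state()); // prints NULL
    }
}
```

Downstream code can then test the state of each field and decide how to handle Error values, without a single bad record aborting the whole run.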
Tool
In Data Management, a Tool is a software component that reads, transforms, or writes data. Tools extend net.redpoint.dataflow.transform.Tool and implement one of the supported tool types specified by the net.redpoint.dataflow.transform.*Tool interfaces.
Configuration
Tools may expose user-settable properties (for example, a file input Tool may expose a file path property). Tools may configure themselves by processing these user-settable properties.
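As an illustration of a tool configuring itself from user-settable properties, the sketch below validates a set of name/value pairs and turns them into typed settings. Everything here is an assumption made for the example: the `FileInputSettings` class, the property names `filePath` and `hasHeader`, and the use of a `Map<String, String>` are all hypothetical, and the SDK's actual configuration mechanism may look quite different.

```java
import java.util.Map;

public class ConfigSketch {
    // Hypothetical typed settings for a file input tool.
    static final class FileInputSettings {
        final String filePath;
        final boolean hasHeader;

        private FileInputSettings(String filePath, boolean hasHeader) {
            this.filePath = filePath;
            this.hasHeader = hasHeader;
        }

        // Builds settings from user-supplied properties, rejecting missing
        // required values and applying a default for optional ones.
        static FileInputSettings fromProperties(Map<String, String> props) {
            String path = props.get("filePath"); // hypothetical property name
            if (path == null || path.isBlank()) {
                throw new IllegalArgumentException("Property 'filePath' is required");
            }
            // Hypothetical optional property; defaults to true when absent.
            boolean hasHeader = Boolean.parseBoolean(props.getOrDefault("hasHeader", "true"));
            return new FileInputSettings(path, hasHeader);
        }
    }

    public static void main(String[] args) {
        FileInputSettings settings = FileInputSettings.fromProperties(
                Map.of("filePath", "input.csv", "hasHeader", "false"));
        System.out.println(settings.filePath + " header=" + settings.hasHeader);
        // prints input.csv header=false
    }
}
```

Validating properties at configuration time, rather than when the first record arrives, lets a tool report a misconfiguration before any data is processed.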