Concepts and terms
This section introduces many of the concepts you'll encounter when working with the Redpoint Data Management SDK.
DataType
Data Management supports the following data types:
Integer
Float
Decimal
Boolean
Variable Text
Fixed-length Text
Unicode
Binary
Date
Time
DateTime
Spatial
Document (version 8.2 and later)
A note about the Spatial data type
Currently, the SDK recognizes the RPDM Spatial data type, but SDK tools can only manipulate Spatial fields as byte[]. This necessarily limits SDK tools to copying values. The ability to work directly with Spatial fields will be added in a future release.
The SDK maps Data Management data types to Java types as follows.
Data Management type | Java type |
---|---|
| | Not yet implemented in version 8.0 |
FieldSpec
A FieldSpec describes a single Field in a record, containing its name and DataType.
Schema
A Schema defines the structure of a record by specifying the fields it can contain.
A Schema is created from a List<FieldSpec>.
Schemas are immutable; once created, they cannot be changed.
Given a Schema, you can get a List<FieldSpec> that describes the Schema.
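The immutability described above can be illustrated with a minimal sketch. SketchFieldSpec and SketchSchema are hypothetical stand-ins written for this example; they are not the SDK's FieldSpec and Schema classes, whose constructors and method names may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT the SDK's FieldSpec.
final class SketchFieldSpec {
    final String name;
    final String dataType; // for example "Integer" or "Date"

    SketchFieldSpec(String name, String dataType) {
        this.name = name;
        this.dataType = dataType;
    }
}

// Hypothetical stand-in for illustration only; NOT the SDK's Schema.
final class SketchSchema {
    private final List<SketchFieldSpec> specs;

    SketchSchema(List<SketchFieldSpec> specs) {
        // Defensive copy: later changes to the caller's list do not reach the schema.
        this.specs = Collections.unmodifiableList(new ArrayList<>(specs));
    }

    // Read-only list of the field specs that describe this schema.
    List<SketchFieldSpec> fieldSpecs() {
        return specs;
    }
}

class SchemaSketchDemo {
    public static void main(String[] args) {
        List<SketchFieldSpec> specs = new ArrayList<>();
        specs.add(new SketchFieldSpec("id", "Integer"));
        specs.add(new SketchFieldSpec("name", "Unicode"));

        SketchSchema schema = new SketchSchema(specs);
        specs.clear(); // the schema still holds both field specs

        System.out.println(schema.fieldSpecs().size());
    }
}
```

In this sketch, changing a schema means building a new one from a modified list of field specs.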
Field
A Field
holds a value of a particular type and length, as specified by its FieldSpec
.
Record
A Record is composed of Fields conforming to a Schema. Records are the mechanism by which information is exchanged between Tools of a project.
Records are constructed from Schemas. Your tool may build new Records using a supplied factory object, or it may be handed Records to populate. Once created, a Record's field values can be changed but its underlying Schema cannot be altered.
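The split between a fixed structure and changeable values might look like the following sketch. SketchRecord is a hypothetical class invented for this example; it is not the SDK's Record, and the SDK's factory object is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT the SDK's Record.
final class SketchRecord {
    private final List<String> fieldNames; // fixed at construction (the "schema")
    private final Object[] values;         // field values, which may change

    SketchRecord(List<String> fieldNames) {
        this.fieldNames = Collections.unmodifiableList(new ArrayList<>(fieldNames));
        this.values = new Object[this.fieldNames.size()];
    }

    List<String> fieldNames() {
        return fieldNames;
    }

    void set(String name, Object value) {
        values[indexOf(name)] = value;
    }

    Object get(String name) {
        return values[indexOf(name)];
    }

    // Fields outside the fixed structure are rejected rather than added.
    private int indexOf(String name) {
        int i = fieldNames.indexOf(name);
        if (i < 0) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        return i;
    }
}

class RecordSketchDemo {
    public static void main(String[] args) {
        SketchRecord record = new SketchRecord(Arrays.asList("id", "name"));
        record.set("id", 1234);
        record.set("id", 5678); // values may be overwritten
        System.out.println(record.get("id"));
    }
}
```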
RecordCopier
When building a TransformTool or GeneralTool, it is common to build an output Schema that is either identical to an input Schema, or based on an input Schema with some additions and modifications. This is because output Records are often mostly copies of the input Records, with some fields appended or modified. Use a RecordCopier to easily and efficiently copy a set of fields from an input Record to an output Record.
RecordCopier is much faster than copying field values one at a time!
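One plausible way a bulk copier can beat per-field copying is to resolve field positions once and reuse them for every record. The sketch below illustrates that idea only; SketchCopier is a hypothetical class, records are modeled as plain Object[] arrays, and nothing here is taken from the SDK's actual RecordCopier implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; the SDK's RecordCopier may work differently.
final class SketchCopier {
    private final int[] from; // positions in the input record
    private final int[] to;   // matching positions in the output record

    SketchCopier(List<String> inputFields, List<String> outputFields, List<String> fieldsToCopy) {
        from = new int[fieldsToCopy.size()];
        to = new int[fieldsToCopy.size()];
        // Name lookups happen once here, not once per record.
        for (int i = 0; i < fieldsToCopy.size(); i++) {
            String name = fieldsToCopy.get(i);
            from[i] = inputFields.indexOf(name);
            to[i] = outputFields.indexOf(name);
            if (from[i] < 0 || to[i] < 0) {
                throw new IllegalArgumentException("Field missing from a schema: " + name);
            }
        }
    }

    // Per-record work is a plain indexed copy.
    void copy(Object[] input, Object[] output) {
        for (int i = 0; i < from.length; i++) {
            output[to[i]] = input[from[i]];
        }
    }
}

class CopierSketchDemo {
    public static void main(String[] args) {
        List<String> in = Arrays.asList("id", "name");
        List<String> out = Arrays.asList("id", "name", "score"); // input plus one appended field
        SketchCopier copier = new SketchCopier(in, out, in);

        Object[] outputRecord = new Object[out.size()];
        copier.copy(new Object[] {7, "Ann"}, outputRecord);
        outputRecord[2] = 0.5; // the tool fills the appended field itself
        System.out.println(Arrays.toString(outputRecord));
    }
}
```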
Field values
A Field may have a valid value (for example, an Integer field may contain the value 1234). In addition, a Field may have one of two special values:
Null designates a missing value. This corresponds to the NULL value used by relational databases.
Error designates an error value. This special value may be set to indicate conversion errors, calculation overflows, and similar failures. It is an alternative to throwing exceptions or logging errors.
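The three states can be modeled as in the sketch below, which marks a field as Error when text cannot be converted to an integer instead of throwing. SketchIntField and its method names are assumptions made for this example and do not reflect the SDK's Field API.

```java
// Hypothetical stand-in for illustration only; NOT the SDK's Field.
final class SketchIntField {
    enum State { VALID, NULL, ERROR }

    private State state = State.NULL; // a new field starts out with no value
    private int value;

    void setValue(int v) {
        value = v;
        state = State.VALID;
    }

    void setNull() {
        state = State.NULL;
    }

    void setError() {
        state = State.ERROR;
    }

    State state() {
        return state;
    }

    int value() {
        if (state != State.VALID) {
            throw new IllegalStateException("Field has no valid value: " + state);
        }
        return value;
    }

    // Converts text to an integer; a failed conversion sets Error instead of throwing.
    void setFromText(String text) {
        if (text == null) {
            setNull();
            return;
        }
        try {
            setValue(Integer.parseInt(text.trim()));
        } catch (NumberFormatException e) {
            setError(); // covers non-numeric text and values outside the int range
        }
    }
}

class FieldValueSketchDemo {
    public static void main(String[] args) {
        SketchIntField field = new SketchIntField();
        for (String text : new String[] {"1234", "abc", null}) {
            field.setFromText(text);
            System.out.println(text + " -> " + field.state());
        }
    }
}
```

Downstream code in this sketch checks state() before reading value(), so a bad input row can flow through a pipeline without interrupting it.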
Tool
In Data Management, a Tool is a software component that reads, transforms, or writes data. Tools extend net.redpoint.dataflow.transform.Tool and implement one of the supported tool types specified by net.redpoint.dataflow.transform.*Tool interfaces.
Configuration
Tools may expose user-settable properties (for example, a file input Tool may expose a file path property). Tools may configure themselves by processing these user-settable properties.