Stream to Table and Table to Stream

Stream to Table

The Stream to Table (StreamToTable) tool is the logical complement of Table to Stream (TableToStream). It reads data in "cell value" form where each datum has its own record, and produces a structured and typed output record. It processes an input stream containing three fields:

ID: identifies each logical record.
FIELD: is the name of the field that the data belongs to.
VALUE: is the field value expressed as text.

For example, it can accept the following input data.

ID	FIELD	VALUE
1	NAME	FRED SMITH
1	ADDRESS	123 MAIN ST
1	PHONE	(101)111-2222
2	NAME	MARY ROGERS
2	ADDRESS	PO BOX 888
2	PHONE	123-456-7890

And produce this output data.

ID	NAME	ADDRESS	PHONE
1	FRED SMITH	123 MAIN ST	(101)111-2222
2	MARY ROGERS	PO BOX 888	123-456-7890

To have the record definition mirror that of an existing data connection, connect the connection to the "M" (meta) input of the Stream to Table tool.

The Stream to Table tool, used in conjunction with TableToStream, is very useful for performing meta-operations, where the exact schema is not known until runtime. It allows operations to be performed across many or all fields. See the sample project table_to_stream_and_back for an example of how to use this tool.

Stream to Table tool configuration parameters

The Stream to Table tool has a single set of configuration parameters in addition to the standard execution options.

Parameter	Description
ID field	The input field containing the record ID.
Field-name field	The input field containing the field name.
Value field	The input field containing the field data.
Include ID field on output	If selected, copies the input ID field value to an output field of the same name. There must not already be an output field of that name.
Default numbers to zero instead of null	If selected, sets missing numeric field values to zero.
Do not sort by ID field	If selected, disables sorting on ID field. The Stream to Table tool requires sorted input to work correctly, and by default sorts input by the record ID. If your input is grouped by record ID but not globally sorted by the record ID, this option may offer a minor speed improvement. However, you should generally leave this off.
Schema from	Specifies how the output schema is defined. One of: Defined below: the schema is defined in the Fields section of the configuration. This is the default in version 7.2 and later. Meta input: the schema is defined by the schema of the connection on the Meta ("M") input. Respository: the schema is defined in the repository schema object specified by Schema. Layout file: the schema is defined in the layout defined by Layout file.
Fields	The Fields grid displays the schema definition: Name uniquely names the field. Type described the data type and size.

Configure the Stream to Table tool

Connect the ID/FIELD/VALUE input stream to the left (data) connector.
Select the Stream to Table tool.
Go to the Configuration tab on the Properties pane.
Specify the ID field, Field-name field, and Value input fields.
Optionally, select Include ID field on output to copy the input ID field to an output field of the same name. If this option is enabled, there must not already be an output field of that name defined.
Optionally, select Default numbers to zero instead of null if some numeric field values may be missing, and you want those to be zero.
Optionally, you may select Do not sort by ID field if your input is grouped by ID field but not globally sorted by ID field. You should generally leave this turned off.
Select Schema from and select a method to define the output record Fields.

If you have...	Do this
An existing data connection with the desired schema.	Select Meta input and select Commit to show the M input, and then connect the data source to the M (meta) input.
A CSV or TXT schema for this data already defined in the repository.	Select Use repository schema and choose the schema.
A layout file.	Select Layout file and browse to the desired layout file.
None of the above.	Select Defined below and use the grid to define Field names and Type manually.

Optionally, go to the Execution tab, and then set Web service options.

Notes:

The ID field values need not be unique, sequential, or numeric.
Only adjacent runs of the same ID field value are counted as the same record. If you want noncontiguous sets to be treated globally, sort by ID first.
Missing values will result in either Null or zero values on the output record, depending on the setting of the Default numbers to zero instead of null option.
If there is duplicate ID/FIELD data, the last value prevails.

For example, if the input stream is:

ID	FIELD	VALUE
1	NAME	FRED SMITH
1	NAME	JOHN BROWN
1	PHONE	(101)111-2222

The output record will be:

ID	NAME	PHONE
1	JOHN BROWN	(101)111-2222

Table to Stream

The Table to Stream tool is the logical complement of Stream To Table tool. It converts each cell of an input table into a new record containing three columns:

ID: unique identifier of the record. This can either be an existing unique ID, or a unique number generated by the Table To Stream tool. You can specify this to be something other than ID.
FIELD: the name of the field.
VALUE: the value of the field, converted to text.

This tool is particularly useful for the following tasks.

Performing the same calculation on every field

Because all columns are represented identically, a single operation on the VALUE field can change all columns at once. However, you are limited to computations that make sense for the original data type. See the sample project table_to_stream_and_back for an example of how to use this tool in conjunction with Stream To Table to perform calculations on multiple columns at once.

Data profiling

The Table to Stream tool can be used with the Summarize tool to generate profiles (such as counts, null counts, top N most common values) of multiple columns simultaneously.

Table to Stream tool configuration parameters

The Table to Stream tool has a single configuration parameter in addition to the standard execution options.

Parameter	Description
ID field	If specified, the field to use as the record ID. If not specified, a sequential number will be generated.

Configure the Table to Stream tool

Select the Table to Stream tool.
Go to the Configuration tab on the Properties pane.
Optionally, select ID field, and then select the field you wish to use as the record ID.
Optionally, go to the Execution tab, and then set Web service options.