DBF Input/Output

DBF Input

The DBF Input tool reads a DBF file and outputs records from the file to the "downstream" tools. DBF files share the performance advantages of flat files and CSV files. However, DBF files are "self-configuring"—record layout, including field sizes, field types, and field names, is defined in the file header.

For ease of configuration, high performance, and portability, DBF is the preferred file format. However, if you are creating a file to be read only by Data Management, consider using the DLD (Data Management Data) file format instead. It is self-configuring, faster than DBF, and handles all Data Management data types without conversions.

DBF Input tool configuration parameters

The DBF Input tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter	Description
Input file	The file containing the records, or a wildcard pattern matching multiple files.
Text fields are variable length	Clear this if you want text fields within the input file to be treated as fixed-length (retaining trailing spaces).

Options

Parameter	Description
Limit records	If selected, limits the number of records read.
Read only the first	If Limit records is selected, specifies the number of records to be read.
Produce file name field	If selected, the file name will be output as a record field.
Output full path	If Produce file name field is selected, optionally outputs the entire path to the record file name field.
Output URI path	If Output full path is selected, express path as a Uniform Resource Identifier (URI).
Field name	If Produce file name field is selected, name of the column to be used for the file name. This is optional and defaults to `FILENAME`.
Field size	If Produce file name field is selected, size of the field to be used for the file name. This is optional and defaults to `255`.

Configure the DBF Input tool

Select the DBF Input tool.
Choose the Configuration tab.
Specify the Input file. Once you specify a file, Data Management displays a sample of the input data, automatically detecting field definitions and record layout. Field definitions and details are read-only; you cannot edit them.
If you want text fields within the input file to be treated as fixed-length (retaining trailing spaces), clear the Text fields are variable-length box.
Optionally, select the Options tab and configure advanced options:
- If you don't want to process the entire file, select Limit records and type the desired number of records to process.
- To include the name of the input file as a new field, select Produce file name field and specify a Field name and Field size. Select Output full path to include the complete file specification. This can be useful when reading a wildcarded set of files. Select Output URI path to express the complete file specification as a Uniform Resource Identifier.
Optionally, go to the Execution tab and Enable trigger input, configure reporting options, or set Web service options.

DBF Output

The DBF Output tool writes data into DBF format. DBF files are strongly-typed and are "self-configuring"—record layout, including field sizes, field types, and field names, is defined in the file header. The DBF format is a good choice for moving data between applications.

DBF Output tool configuration parameters

The DBF Output tool has one set of configuration parameters in addition to the standard execution options.

Parameter	Description
Output file	The output file name.
Open file time	Specifies when the output file will be opened: Default Uses the site/execution server setting When project is started When first record is read When last record is read
Write empty file if no records are read	If selected, writes an output file even when no records are read. This is unavailable if Open file time is When project is started.
Split files	If selected, splits the output file by Size, Record count, or Data.
Split size (MB)	If Split files by size is selected, specifies the maximum size of the split files. Output file names are appended with a sequence number between the file root and the extension. Defaults to 1024 MB (1Gb).
Split count	If Split files by record count is selected, specifies the maximum number of records in the split files. Output file names are appended with a sequence number between the file root and the extension. Defaults to 1,000,000.
Split field	If Split files by data is selected, name of the field to be used to split the data. A separate file will be created for each unique value of the specified field. Data must be grouped by the split field.
Suppress split field	If selected, the Split field is omitted from files created using Split files by data.
Treat output file as folder	If selected, the Split field value is used as file name for files created using Split files by data.
Replication factor	Number of copies of each block that will be stored (on different nodes) in the distributed file system. The default is 1.
Block size (MB)	The minimum size of a file division. The default is 128 MB.

Configure the DBF Output tool

Select the DBF Output tool.
Go to the Configuration tab on the Properties pane.
Specify the Output file.
Optionally, specify Open file time.

Parameter	Description
Default	Use the site/execution server setting. If you select this, you can optionally select Write empty file if no records are read. A warning will be issued if the tool setting conflicts with the site/execution server setting.
When project is started	Open output file when the project is run.
When the first record is read	Open output file when the first record is read. If you select this, you can optionally select Write empty file if no records are read.
After the last record is read	Output records are cached and not written to the output file until the tool receives the final record. If you select this, you can optionally select Write empty file if no records are read.

Optionally, you can split the output file into smaller, more manageable pieces. Check the Split files box, and then select By size or By data.
- If you select Split files by size, select Split size and then specify the maximum file size (in megabytes). The resulting output files will have the name you specified, with a sequential number appended.
- If you select Split files by data, select the desired Split field name from the drop-down list. Data must be grouped by the split field. The resulting output files will have the name you specified, augmented by inserting the value of the specified field before the extension. For example, splitting output by ZIP Code produces file names of the form output_file01234.csv.
- To generate file names where the entire name is determined by the value of the specified field, select Treat output file as folder and specify the output directory in the Output file box, using the form: F:\output_directory.
- If you do not want the specified field to appear in the output, select Suppress split field on output.
Optionally, go to the Execution tab and Enable trigger input, configure reporting options, or set Web service options.

The Split into several files option is useful when you want to write a file to media (ZIP disk, CD-ROM, DVD) for physical transport to someone else and the raw file is too large for a single disk. Set the maximum file size to equal the capacity of the target media.

You can configure a single DBF Input tool to read a sequence of files with the same layout and format. This is an easy way to reassemble a file that's been split as described above. Since Data Management always reads wildcard input files in alphabetical file name order, they are guaranteed to be reassembled in order.