Definition Tab – File Analysis Panel
Overview
The Definition tab’s File Analysis panel shows the results of RPI’s analysis of the high-level properties of the file you uploaded.
Within the panel, you can:
Change the initial analysis results and invoke Re-analyze to see how your changes affect the data project’s fields.
For a delimited file, view a raw or parsed preview of the file’s contents.
For a fixed-width file, open a dialog in which you can specify field boundaries.
Validate the file against the data project’s definition, and load data into the data warehouse.
The File Analysis panel consists of the following elements:
File Analysis Section
This section contains a high-level synopsis of the file that was analyzed.

It contains the following:
File analyzed: the name of the file. Read-only.
File type: a drop-down list, set to either Fixed-width or Delimited in accordance with RPI’s initial file analysis. Depending on its value, either Delimited File Options or Fixed-width File Options are displayed. If RPI determined the file’s type incorrectly, you can override this setting manually. If you do so, you must invoke Re-analyze for the effects of your change to be reflected in the File Analysis and Field Analysis panels. Note that a single-column file can be treated as either fixed-width or delimited.
Delimited File Options Section
This section presents a series of fields used to describe the high-level qualities of a delimited file.

Use delimiter: updateable. Set automatically during initial file analysis. Note that RPI will not be able to determine the delimiter if a custom character is used. A drop-down list exposes the following values:
Comma
Pipe
Tilde
Slash
Backslash
Semicolon
Colon
Space
Tab
Other
When you select Other, you must specify a single custom delimiter character in the field provided.
Has header row: an updateable checkbox; checked if all of the first row's fields contain string values, and at least one other row contains a non-string value.
Skip [n] row(s): updateable; this integer field is set to 1 if a header row is present, and to 0 otherwise. It defines the number of initial rows within the file that are to be disregarded during file analysis.
Post-initial load: this updateable drop-down field defines how RPI will handle the loading of a data project’s second and subsequent files. Available options are:
Perform a complete table refresh
Insert Only (Ignore Duplicates)
Perform inserts and update existing records (default)
Only perform updates
Always Insert
Perform deletes & insert records
Note that if either “update” option, or “Perform deletes…” is selected and no key is defined within the Field Analysis panel, a validation error is raised.
Override table prefix: this checkbox is unchecked by default. When checked, the supplied Table prefix will be used. If unchecked, the value provided at system configuration setting DataProjectTablePrefixDefault will be used instead.
Table prefix: this mandatory property can be a maximum of 5 characters in length. It defaults to the value of system configuration setting DataProjectTablePrefixDefault, which, in turn, defaults to the value 'UD'. It allows you to specify a prefix to be applied to the name of the table to be created at data project execution.
Table name: the name of the table into which data is to be loaded defaults to the name of the data project. It can be a maximum of 50 characters in length and cannot contain database-invalid characters. A validation error is raised if these conditions are not met.
Enable field width extension: this checkbox is unchecked by default. When checked, it allows editing of the data project's fields' Data Type, Size, and Scale properties on subsequent loads.
The following data type changes are supported:
Time to Datetime
Date to Datetime
Integer to Decimal
Decimal to Integer
Integer to String
Decimal to String
A validation error is raised when attempting to convert a data type to an incompatible value. A validation error is also raised when decreasing a field's length on a subsequent load.
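The Post-initial load note above (a key must be defined for the “update” options and for “Perform deletes…”) can be illustrated with a minimal sketch. This is not RPI's implementation: the function name, the error wording, and the representation of key fields as a list are all assumptions made for illustration. Only the option labels are taken from the list above.

```python
# Option labels copied from the Post-initial load list in this section.
# Per the note above, these are the options that require a key.
KEY_REQUIRED_OPTIONS = {
    "Perform inserts and update existing records",
    "Only perform updates",
    "Perform deletes & insert records",
}


def validate_post_initial_load(option: str, key_fields: list[str]) -> list[str]:
    """Return validation error messages; an empty list means valid.

    Hypothetical helper: `key_fields` stands in for whatever keys are
    defined in the Field Analysis panel.
    """
    errors = []
    if option in KEY_REQUIRED_OPTIONS and not key_fields:
        errors.append(
            f"'{option}' requires a key to be defined in the Field Analysis panel."
        )
    return errors


no_key = validate_post_initial_load("Only perform updates", [])
with_key = validate_post_initial_load("Only perform updates", ["customer_id"])
insert_only = validate_post_initial_load("Always Insert", [])
```

In this sketch, `no_key` contains one error message, while `with_key` and `insert_only` are empty.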
If you change one or more of Use delimiter, Has header row or Skip [n] row(s), you must invoke Re-analyze to see the effects of your changes within the File Analysis and Field Analysis panels.
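The Has header row rule described above (every field in the first row is a string, and at least one other row contains a non-string value) and the resulting Skip [n] row(s) default can be sketched as follows. This is only an illustration under stated assumptions, not RPI's actual detection logic: here a value counts as “non-string” if it parses as a number, and the function names and sample data are hypothetical.

```python
import csv
import io


def is_non_string(value: str) -> bool:
    """Assumption: a value is 'non-string' if it parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def guess_has_header(rows: list[list[str]]) -> bool:
    """Apply the rule: first row all strings, some later value non-string."""
    if not rows:
        return False
    first, rest = rows[0], rows[1:]
    first_all_strings = all(not is_non_string(v) for v in first)
    other_has_non_string = any(is_non_string(v) for row in rest for v in row)
    return first_all_strings and other_has_non_string


def default_skip_rows(has_header: bool) -> int:
    """Skip [n] row(s) defaults to 1 with a header row, otherwise 0."""
    return 1 if has_header else 0


# Hypothetical comma-delimited sample file.
sample = "name,age\nAda,36\nGrace,45\n"
rows = list(csv.reader(io.StringIO(sample), delimiter=","))
has_header = guess_has_header(rows)
skip_rows = default_skip_rows(has_header)
```

For the sample above, the first row contains only strings and later rows contain numbers, so the sketch treats the first row as a header and sets the skip count to 1. A file whose rows are all strings would not be flagged as having a header, which is one reason the checkbox is left updateable.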
Fixed-width File Options Section
This section presents a series of fields used to describe the high-level qualities of a fixed-width file.

Immediately following initial file analysis, the orange message shown above is displayed at the top of the section. The message is removed from display after invocation of Re-analyze. It will subsequently be redisplayed should field boundaries be changed again.
The following options are all populated during initial analysis:
Allow short lines: a checkbox, checked if RPI determines during initial file analysis that the final field in a fixed-width file contains data of differing lengths. It may be overridden manually if required.
If you change Allow short lines, you must invoke Re-analyze to see the effects of your change within the File Analysis and Field Analysis panels.
Post-initial load: this drop-down field is used to define how RPI will handle the loading of a data project’s second and subsequent files. Available options are:
Perform a complete table refresh
Insert Only (Ignore Duplicates)
Perform inserts and update existing records (default)
Only perform updates
Always Insert
Perform deletes & insert records
Note that if either “update” option is selected and no key is defined within the Field Analysis panel, a validation error will be raised.
Table name: the name of the table into which data is to be loaded defaults to the name of the data project. It can be a maximum of 50 characters in length and cannot contain database-invalid characters. A validation error is raised if these conditions are not met.
Enable field width extension: this checkbox is unchecked by default. When checked, it allows editing of the data project's fields' Data Type, Size, and Scale properties on subsequent loads.
The following data type changes are supported:
Time to Datetime
Date to Datetime
Integer to Decimal
Decimal to Integer
Integer to String
Decimal to String
A validation error is raised when attempting to convert a data type to an incompatible value. A validation error is also raised when decreasing a field's length on a subsequent load.
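The two validation rules above (only the listed data type changes are supported, and a field's length cannot be decreased on a subsequent load) could be checked along these lines. This is a hedged sketch rather than RPI's implementation: the function name, the error wording, and the assumption that leaving a type unchanged is always allowed are illustrative. The type pairs are copied from the list above.

```python
# Supported (from, to) data type changes, as listed in this section.
SUPPORTED_TYPE_CHANGES = {
    ("Time", "Datetime"),
    ("Date", "Datetime"),
    ("Integer", "Decimal"),
    ("Decimal", "Integer"),
    ("Integer", "String"),
    ("Decimal", "String"),
}


def validate_field_change(
    old_type: str, old_size: int, new_type: str, new_size: int
) -> list[str]:
    """Return validation error messages; an empty list means valid.

    Assumption: keeping the same data type is always permitted.
    """
    errors = []
    if new_type != old_type and (old_type, new_type) not in SUPPORTED_TYPE_CHANGES:
        errors.append(f"Cannot convert {old_type} to {new_type}.")
    if new_size < old_size:
        errors.append(f"Cannot decrease length from {old_size} to {new_size}.")
    return errors


widened = validate_field_change("Integer", 10, "String", 20)
incompatible = validate_field_change("String", 20, "Integer", 20)
shrunk = validate_field_change("String", 20, "String", 10)
```

Here `widened` is empty, while `incompatible` and `shrunk` each contain one error message.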
Update Existing Table Options Section
Update column: this drop-down property exposes the following values:
None (the default)
Add Column
Remove Column
It is disabled initially, and is enabled after the data project's first execution when the File type is Delimited and Has header row is set to true.
On second or subsequent data project execution:
When Update column is set to 'Add Column', any new columns discovered in the source file will be added if they do not already exist in the data project table.
When Update column is set to 'Remove Column', [TBD – Mervin/Aira – any data project table columns that do not exist in the source file will be dropped]
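The 'Add Column' behavior described above amounts to comparing the source file's header against the existing table's columns. A minimal sketch follows, assuming exact, case-sensitive name matching (RPI's actual matching rules are not stated here) and a hypothetical function name. The 'Remove Column' behavior is not sketched, as it is not yet confirmed above.

```python
def columns_to_add(source_columns: list[str], table_columns: list[str]) -> list[str]:
    """Return source columns missing from the table, in source order.

    Assumption: column names are compared exactly (case-sensitive).
    """
    existing = set(table_columns)
    return [col for col in source_columns if col not in existing]


# Hypothetical example: the second file introduces an 'email' column.
new_columns = columns_to_add(["id", "name", "email"], ["id", "name"])
```

In this example, `new_columns` contains only `email`, the one column present in the source file but absent from the table.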
Actions Panel
The Actions panel is shown to the right.

The panel contains the following buttons, which are enabled unless otherwise noted below.
Preview: shown only for delimited files. Displays the Preview dialog, which shows the raw or parsed contents of the file (dialog covered separately). A warning is displayed at invocation if the current data project is not valid or is running.
Preview & Define: shown only for fixed-width files. Displays the Preview dialog, in which you can define the file’s field boundaries (dialog covered separately). A warning is displayed at invocation if the current data project is not valid or is running.
Re-analyze: for a delimited file, re-analyzes the file’s schema and content, taking into account any changes made within the File Analysis panel. A warning is displayed at invocation if the current data project is not valid, contains unsaved changes, or is running.
For a fixed-width file, Re-analyze also takes into account any changes made to field boundaries in the Preview dialog, displays all columns in the Field Analysis panel, and enables Execution.Validation. This option is protected by an “Are you sure?” dialog.
Validate and Load: validates that the file conforms to the data project’s definition and then loads it. For a fixed-width file, the button is disabled after any subsequent field boundary change, and is enabled again after re-analysis. A warning is displayed at invocation if the current data project is not valid, contains unsaved changes, or is running.
Navigation Buttons
The navigation buttons provide an alternative way of navigating through the wizard-style sequence of data project definition process steps. They are displayed at the bottom right of the panel.

In the Definition tab’s File Analysis panel, the Back button is enabled, and displays the Acquisition panel. Clicking Next displays the Field Analysis panel.