Definition Tab – File Analysis Panel

Overview

The Definition tab’s File Analysis panel displays the results of RPI’s analysis of the high-level properties of the file you uploaded.

Within the panel, you can:

  • Change the initial analysis results, then invoke Re-analyze to see how your changes affect the data project’s fields.

  • For a delimited file, view a raw or parsed preview of the file’s contents.

  • For a fixed-width file, invoke a dialog in which you can specify field boundaries.

  • Initiate validation of the file against the data project’s definition, and load data into the data warehouse.

The File Analysis panel consists of the following elements:

File Analysis Section

This section contains a high-level synopsis of the file that was analyzed.

It contains the following:

  • File analyzed: the name of the file. Read-only.

  • File type: a drop-down list, set to one of Fixed-width or Delimited in accordance with RPI’s initial file analysis. Depending on the value to which File type is set, Delimited File Options or Fixed-width File Options are displayed. If RPI’s attempts to determine the file’s type proved inaccurate, you can override this setting manually. If you do so, you will need to invoke Re-analyze to cause the ramifications of your change to be displayed across the File and Field Analysis panels. Note that a single-column file can be treated as fixed-width or delimited.

Delimited File Options Section

This section presents a series of fields used to describe the high-level qualities of a delimited file.

  • Use delimiter: updateable. Set automatically during initial file analysis. Note that RPI will not be able to determine the delimiter if a custom character is used. A drop-down list exposes the following values:

    • Comma

    • Pipe

    • Tilde

    • Slash

    • Backslash

    • Semicolon

    • Colon

    • Space

    • Tab

    • Other

When you select Other, you must specify a single custom delimiter character in the field provided.
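As an illustration only, the delimiter choices above could be resolved to characters and applied to a file's text as sketched below. The names `DELIMITERS`, `resolve_delimiter` and `parse_rows` are hypothetical and are not part of RPI; the sketch uses Python's standard `csv` module and does not represent how RPI parses files internally.

```python
import csv
import io

# Hypothetical mapping from the drop-down labels to delimiter characters.
DELIMITERS = {
    "Comma": ",",
    "Pipe": "|",
    "Tilde": "~",
    "Slash": "/",
    "Backslash": "\\",
    "Semicolon": ";",
    "Colon": ":",
    "Space": " ",
    "Tab": "\t",
}


def resolve_delimiter(choice, custom=None):
    """Return the delimiter character for a drop-down choice."""
    if choice == "Other":
        # Other requires exactly one custom delimiter character.
        if custom is None or len(custom) != 1:
            raise ValueError("Other requires a single custom delimiter character")
        return custom
    return DELIMITERS[choice]


def parse_rows(text, choice, custom=None):
    """Split delimited text into rows of fields using the chosen delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=resolve_delimiter(choice, custom))
    return [row for row in reader]
```

For example, `parse_rows("a|b\n1|2\n", "Pipe")` splits each line on the pipe character, while `resolve_delimiter("Other", "^")` returns the custom character supplied.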

  • Has header row: an updateable checkbox; checked if all of the first row's fields contain string values, and at least one other row contains a non-string value.

  • Skip [n] row(s): updateable; this integer field is set to 1 if a header row is present, and to 0 otherwise. It defines the number of initial rows within the file that are to be disregarded during file analysis.

  • Post-initial load: this updateable drop-down field is used to define how RPI will handle the loading of a data project’s second or subsequent files. Available options are:

    • Perform a complete table refresh

    • Insert Only (Ignore Duplicates)

    • Perform inserts and update existing records (default)

    • Only perform updates

    • Always Insert

    • Perform deletes & insert records

Note that if either “update” option, or “Perform deletes…” is selected and no key is defined within the Field Analysis panel, a validation error is raised.
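The key requirement in the note above can be expressed as a simple check. This is a minimal sketch under stated assumptions: the function name `validate_post_initial_load` and the error wording are hypothetical, and the option labels are copied from the list above.

```python
# Options that, per the note above, require a key in the Field Analysis panel.
KEY_REQUIRED_OPTIONS = {
    "Perform inserts and update existing records",
    "Only perform updates",
    "Perform deletes & insert records",
}


def validate_post_initial_load(option, key_fields):
    """Return a list of validation errors for the chosen Post-initial load option.

    key_fields is the (possibly empty) list of field names marked as keys.
    """
    errors = []
    if option in KEY_REQUIRED_OPTIONS and not key_fields:
        # Hypothetical message; the real validation text is not documented here.
        errors.append(f"'{option}' requires at least one key field")
    return errors
```

With no key defined, `validate_post_initial_load("Only perform updates", [])` returns one error, whereas "Always Insert" passes without a key.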

  • Override table prefix: this checkbox is unchecked by default. When checked, the supplied Table prefix will be used. If unchecked, the value provided at system configuration setting DataProjectTablePrefixDefault will be used instead.

  • Table prefix: this mandatory property can be a maximum of 5 characters in length. It defaults to the value of system configuration setting DataProjectTablePrefixDefault, which, in turn, defaults to the value 'UD'. It allows you to specify a prefix to be applied to the name of the table to be created at data project execution.

  • Table name: the name of the table into which data is to be loaded defaults to the name of the data project. It can be a maximum of 50 characters in length and cannot contain database-invalid characters. A validation error is raised if these conditions are not met.

  • Enable field width extension: this checkbox is unchecked by default. When checked, it allows the data project's fields' Data Type, Size, and Scale properties to be edited on subsequent loads.

The following data type changes are supported:

  • Time to Datetime

  • Date to Datetime

  • Integer to Decimal

  • Decimal to Integer

  • Integer to String

  • Decimal to String

A validation error is raised when attempting to convert a data type to an incompatible value. A validation error is also raised when decreasing a field's length on a subsequent load.
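The two rules above (only the listed type changes are allowed, and a field's length cannot decrease) can be sketched as follows. The function name `check_field_change` and the error wording are hypothetical; only the set of supported conversions is taken from the list above.

```python
# Supported (from_type, to_type) pairs, as listed above.
SUPPORTED_TYPE_CHANGES = {
    ("Time", "Datetime"),
    ("Date", "Datetime"),
    ("Integer", "Decimal"),
    ("Decimal", "Integer"),
    ("Integer", "String"),
    ("Decimal", "String"),
}


def check_field_change(old_type, new_type, old_size, new_size):
    """Return validation errors for a field edit made on a subsequent load."""
    errors = []
    # A type change is valid only if it appears in the supported list.
    if old_type != new_type and (old_type, new_type) not in SUPPORTED_TYPE_CHANGES:
        errors.append(f"Cannot convert {old_type} to {new_type}")
    # A field's length may grow or stay the same, but not shrink.
    if new_size < old_size:
        errors.append(f"Cannot decrease field length from {old_size} to {new_size}")
    return errors
```

For instance, `check_field_change("Integer", "String", 10, 20)` returns no errors, while `check_field_change("String", "Integer", 10, 5)` returns two.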

If you make changes to one or more of Use delimiter, Has header row or Skip [n] row(s), you will need to invoke Re-analyze to observe the ramifications of your modifications within the File and Field Analysis panels.
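The Has header row rule and the Skip [n] row(s) default described in this section can be sketched as below. This is an approximation, not RPI's implementation: it assumes that a "non-string" value is one that parses as a number, and the function names are hypothetical.

```python
def is_non_string(value):
    """Assumption: a value counts as non-string if it parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def detect_header_row(rows):
    """Header is assumed when every first-row value is a string and
    at least one value in any later row is a non-string."""
    if len(rows) < 2:
        return False
    first_all_strings = all(not is_non_string(v) for v in rows[0])
    other_has_non_string = any(is_non_string(v) for row in rows[1:] for v in row)
    return first_all_strings and other_has_non_string


def default_skip_rows(has_header):
    """Skip [n] row(s) defaults to 1 with a header row, and to 0 otherwise."""
    return 1 if has_header else 0
```

A file in which every row contains only strings would not be flagged as having a header under this rule, which is one case where a manual override followed by Re-analyze may be needed.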

Fixed-width File Options Section

This section presents a series of fields used to describe the high-level qualities of a fixed-width file.

Immediately following initial file analysis, an orange message is displayed at the top of the section. The message is removed when Re-analyze is invoked, and is redisplayed if field boundaries are changed again.

The following options are all populated during initial analysis:

  • Allow short lines: a checkbox, checked if RPI determines during initial file analysis that the final field in a fixed-width file contains data of differing lengths. It may be overridden manually if required.

If you make changes to Allow short lines, you will need to invoke Re-analyze to observe the ramifications of your modifications within the File and Field Analysis panels.
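To illustrate the effect of Allow short lines, the sketch below splits one line of a fixed-width file at given field boundaries. It is a simplified model, not RPI's parser: the function name `split_fixed_width`, the representation of boundaries as (start, end) offsets, and the rule that a short line must still reach the start of the final field are all assumptions.

```python
def split_fixed_width(line, boundaries, allow_short_lines=False):
    """Split a line into fields using (start, end) character offsets.

    end is exclusive. When allow_short_lines is True, the final field
    may be shorter than its declared width.
    """
    last_start, last_end = boundaries[-1]
    if len(line) < last_end:
        # Assumption: only the final field may be short, so the line
        # must at least reach the start of that field.
        if not allow_short_lines or len(line) < last_start:
            raise ValueError("line is shorter than the defined field boundaries")
    # Slicing past the end of the line simply yields the available characters.
    return [line[start:end].strip() for start, end in boundaries]
```

With boundaries `[(0, 3), (3, 8)]`, the line `"abc12"` is rejected by default but is split into `["abc", "12"]` when `allow_short_lines=True`.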

  • Post-initial load: this drop-down field is used to define how RPI will handle the loading of a data project’s second or subsequent files. Available options are:

    • Perform a complete table refresh

    • Insert Only (Ignore Duplicates)

    • Perform inserts and update existing records (default)

    • Only perform updates

    • Always Insert

    • Perform deletes & insert records

Note that if either “update” option is selected and no key is defined within the Field Analysis panel, a validation error will be raised.

  • Table name: the name of the table into which data is to be loaded defaults to the name of the data project. It can be a maximum of 50 characters in length and cannot contain database-invalid characters. A validation error is raised if these conditions are not met.

  • Enable field width extension: this checkbox is unchecked by default. When checked, it allows the data project's fields' Data Type, Size, and Scale properties to be edited on subsequent loads.

The following data type changes are supported:

  • Time to Datetime

  • Date to Datetime

  • Integer to Decimal

  • Decimal to Integer

  • Integer to String

  • Decimal to String

A validation error is raised when attempting to convert a data type to an incompatible value. A validation error is also raised when decreasing a field's length on a subsequent load.
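The Table name constraints listed in this section (a maximum of 50 characters, and no database-invalid characters) can be sketched as a check like the one below. The set of invalid characters is not specified here, so this example assumes that only letters, digits and underscores are valid; that rule and the function name `validate_table_name` are hypothetical.

```python
import re

MAX_TABLE_NAME_LENGTH = 50

# Assumption: anything other than letters, digits and underscore is invalid.
INVALID_CHARACTER = re.compile(r"[^A-Za-z0-9_]")


def validate_table_name(name):
    """Return a list of validation errors for a proposed table name."""
    errors = []
    if len(name) > MAX_TABLE_NAME_LENGTH:
        errors.append(f"Table name exceeds {MAX_TABLE_NAME_LENGTH} characters")
    if INVALID_CHARACTER.search(name):
        errors.append("Table name contains invalid characters")
    return errors
```

The actual set of permitted characters depends on the target database, so the regular expression should be treated as a placeholder.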

Update Existing Table Options Section

  • Update column: this dropdown property exposes the following values:

    • None (the default)

    • Add Column

    • Remove Column

It is disabled initially, and is enabled after the data project's first execution when File type is Delimited and Has header row is checked.

On second or subsequent data project execution:

  • When Update column is set to 'Add Column', any new columns discovered in the source file will be added if they do not already exist in the data project table.

  • When Update column is set to 'Remove Column', [TBD – Mervin/Aira – any data project table columns that don't exist in the source file will be dropped]
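The 'Add Column' behavior can be pictured as comparing the source file's header row with the columns already present in the data project table. The sketch below is illustrative only: the function name `columns_to_add` is hypothetical, and the case-insensitive comparison of column names is an assumption.

```python
def columns_to_add(table_columns, file_header):
    """Return header columns not yet present in the data project table.

    Assumption: column names are compared case-insensitively.
    The order of the new columns follows the file's header row.
    """
    existing = {column.lower() for column in table_columns}
    return [column for column in file_header if column.lower() not in existing]
```

For a table with columns `id` and `name`, a file headed `id, Name, email` would yield `email` as the only column to add under these assumptions.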

Actions Panel

The Actions panel is shown to the right.

The panel contains the following buttons, which are always enabled.

  • Preview: shown only for delimited files. Displays the Preview dialog to provide visibility of the raw or parsed contents of the file (dialog covered separately). A warning is displayed at invocation if the current data project is not valid or is running.

  • Preview & Define: shown only for fixed-width files. Displays the Preview dialog to facilitate definition of the file’s field boundaries (dialog covered separately). A warning is displayed at invocation if the current data project is not valid or is running.

  • Re-analyze: if a delimited file, re-analyzes the file’s schema and content, taking into account any changes made within the File Analysis panel. A warning is displayed at invocation if the current data project is not valid, contains unsaved changes, or is running.

If a fixed-width file, also takes into account any changes made to field boundaries in the Preview dialog, and displays all columns in the Field Analysis panel, as well as enabling Execution.Validation. This option is protected by an “Are you sure?” dialog.

  • Validate and Load: validates that the file conforms to the data project’s definition and then loads it. For a fixed-width file, it is disabled after subsequent field boundary changes, and enabled again after re-analysis. A warning is displayed at invocation if the current data project is not valid, contains unsaved changes, or is running.

Navigation Buttons

The navigation buttons provide an alternative way of navigating through the wizard-style sequence of data project definition process steps. They are displayed at the bottom right of the panel.

In the Definition tab’s File Analysis panel, the Back button is enabled, and displays the Acquisition panel. Clicking Next displays the Field Analysis panel.
