Skip to main content
Skip table of contents

Excel tools

Overview

Data Management version 9.4 replaced the deprecated Microsoft Excel OleDB/ODBC data providers with dedicated Excel Input and Excel Output tools. If you have projects containing RDBMS tools that are configured to use an Excel data provider, you should manually edit these projects to use the Excel Input and Output tools.

Limitations:

  • The Excel Input and Output tools can read and write .xlsx format files generated by Excel 2007 and later.

  • NEW IN 9.6.1 Pre-2007 Excel .xls format is only supported by the Excel Input tool for reading; it is not supported by the Excel Output tool for writing.

  • The Excel Input and Output tools support a small subset of the functionality available in Excel, and are not suitable for generating reports in Excel.

  • The Excel Output tool only supports overwriting workbooks. You cannot append to an existing worksheet or add a new worksheet to an existing workbook.

Excel data types

Data Management data types translate to the following Microsoft Excel data types.

Data Management type

Excel type

Notes

Boolean

Boolean

N/A

Date

Numeric

See Inferred date and time types.

Time

Numeric

See Inferred date and time types.

DateTime

Numeric

See Inferred date and time types.

Decimal

Numeric

See Fixed and floating point conversions.

Float

Numeric

N/A

Integer

Numeric

See Fixed and floating point conversions.

Text

String

See Excel's General format.

When the cell contains a formula, the record schema should specify a data type compatible with the resulting cell type.

Inferred date and time types

Excel does not define dedicated cell types for dates, times, and datetimes. Instead, all date/time values are stored as doubles. Date/time values are inferred from the cell format.

Fixed and floating point conversions

When configuring fixed-point numeric types like Decimal in Data Management, be aware that all numeric cells in Excel are stored as double-precision floating-point numbers. All the standard caveats when converting between fixed and floating point numbers apply.

Excel's General format

The General format is Excel's default format for all cells. Numbers entered into Excel are formatted as General and displayed just the way you type them. However, all numeric cells in Excel are stored as double-precision floating-point numbers. If you map an Excel field formatted as General to a Text type in an Excel Input tool, Data Management will coerce numeric data into strings, adding decimals and using scientific (exponential) notation for large numbers (12 or more digits).

Excel Input

To configure an Excel Input tool, you need the following information about the input data:

  • Name of the Workbook file and Worksheet.

  • The list of fields and their data types.

  • The number of lines (usually header rows) to skip, if any.

When you configure the Excel Input tool, you define a record schema that describes the field names and field types contained in the selected worksheet. When the workbook is read, an <<ERROR>> value will be emitted for any cells that cannot be converted to the specified data type.

Excel Input tool configuration parameters

The Excel Input tool has a single set of configuration parameters in addition to the standard execution options.

Parameter

Description

Workbook

The file containing the Microsoft Excel workbook.

Worksheet

The worksheet containing the records.

Skip rows

If defined, skip leading lines in the file, such as field names contained in a first line. This is optional and defaults to 0.

Field

Name that uniquely names the field among all records in the worksheet.

Type

Type of the field. See Excel data types.

Enable trigger input

See Trigger input and output.

Starting cell

NEW IN 9.6.1

The cell from which the Excel Input tool begins reading; any preceding rows/columns are ignored. The default is A1; accepts strings in the format [A-Z]{1,}[0-9]{1,}.

Large file support

NEW IN 9.6.1

When enabled, allows the Excel Input tool to read large workbooks without requiring additional memory via the project’s JVM heap property. However, consider the following limitations:

  • Any formula (designated by a leading equal sign character) encountered in a cell is emitted as an <<Error>> value

  • Only .xlsx files can be read

  • The Excel Input tool cannot convert Excel’s Date, Time, or DateTime cells to their corresponding Data Management types. Cell values must conform to the format specified in Excel’s Format Cells dialog for dates and times. If you cannot ensure cell values in your workbooks follow these formats, the Excel Input tool will emit them as <<Error>> values. In this case, set the field type to Text in the Excel Input tool, and parse the date in a downstream Calculate tool.

Large file support is disabled by default.

Configure the Excel Input tool

  1. Select the Excel Input tool.

  2. Select the Configuration tab.

  3. Specify the Workbook.

  4. Select Refresh worksheets.

  5. Select the desired Worksheet from the list.

  6. If the worksheet contains header rows, select Skip rows and enter the number of rows to skip.

  7. For each field in the worksheet:

    • Select the cell in the Field column, and type the desired name.

    • Select the cell in the Type column, and specify the field's data type.

      image-20240326-201344.png

  8. Optionally, Enable trigger input.

  9. Optionally, go to the Execution tab and configure reporting options or set Web service options.

Excel Output

The Excel Output tool writes records into a Microsoft Excel workbook. The tool only supports overwriting workbooks. You cannot append to an existing worksheet or add a new worksheet to an existing workbook.

On some Linux systems, you may encounter the following error when using the Excel Output tool.

CODE
Caused by: java.lang.NullPointerException
    sun.awt.FontConfiguration.getVersion(FontConfiguration.java:1264)
    sun.awt.FontConfiguration.readFontConfigFile(FontConfiguration.java:219)
    sun.awt.FontConfiguration.init(FontConfiguration.java:107)
    sun.awt.X11FontManager.createFontConfiguration(X11FontManager.java:774)
    ...


This is because:

  • Excel considers font geometry when calculating worksheet layouts

  • On Linux, the JRE requires Freetype and Fontconfig to use fonts

  • Font support is typically included only for Linux systems that install a desktop environment

Consult your Linux distribution’s documentation to install font support, but generally:

  • Ubuntu: apt-get install libfreetype6 fontconfig fonts-dejavu

  • CentOS: yum install freetype fontconfig dejavu-sans-fonts

Excel Output tool configuration parameters

The Excel Output tool has one set of configuration parameters in addition to the standard execution options.

Parameter

Description

Workbook

The Microsoft Excel file to write.

Write headers

If selected, use the field names of the input record to write a header row.

Enable trigger output

See Trigger input and output.

Write empty file if no records are read

NEW IN 9.6.1

Determines whether or not to create an empty file if no records are received by the Excel Output tool:

  • When enabled (default), an empty file will be created even if no records are received by the Excel Output tool’s input connector.

  • When disabled, a file will only be created when records are received by the Excel Output tool’s input connector.

This option is enabled by default to preserve backwards compatibility with existing projects.

Configure the Excel Output tool

  1. Select the Excel Output tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Specify the Workbook.

  4. Optionally, select Write headers to use the field names of the input record to write a header row.

  5. Optionally, Enable trigger input.

  6. Optionally, go to the Execution tab and configure reporting options or set Web service options.

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.