Excel tools
Data Management version 9.4 replaced the deprecated Microsoft Excel OleDB/ODBC data providers with dedicated Excel Input and Excel Output tools. If you have projects containing RDBMS tools that are configured to use an Excel data provider, you should manually edit these projects to use the Excel Input and Output tools.
Limitations:
The Excel Input and Output tools support a small subset of the functionality available in Excel, and are not suitable for generating reports in Excel.
The tools can read and write .xlsx format files generated by Excel 2007 and later. Older file formats are not supported.
The Excel Output tool only supports overwriting workbooks. You cannot append to an existing worksheet or add a new worksheet to an existing workbook.
Excel data types
Data Management data types translate to the following Microsoft Excel data types:
Data Management type | Excel type | Notes |
---|---|---|
Boolean | Boolean | |
Date | Numeric | |
Time | Numeric | |
DateTime | Numeric | |
Decimal | Numeric | |
Float | Numeric | |
Integer | Numeric | |
Text | String |
When the cell contains a formula, the record schema should specify a data type compatible with the resulting cell type.
Inferred date and time types
Excel does not define dedicated cell types for dates, times, and datetimes. Instead, all date/time values are stored as doubles. Date/time values are inferred from the cell format.
Fixed and floating point conversions
When configuring fixed-point numeric types like Decimal in Data Management, be aware that all numeric cells in Excel are stored as double-precision floating-point numbers. All the standard caveats when converting between fixed and floating point numbers apply.
Excel's General format
The General format is Excel's default format for all cells. Numbers entered into Excel are formatted as General and displayed just the way you type them. However, all numeric cells in Excel are stored as double-precision floating-point numbers. If you map an Excel field formatted as General to a Text type in an Excel Input tool, Data Management will coerce numeric data into strings, adding decimals and using scientific (exponential) notation for large numbers (12 or more digits).
Excel Input
To configure an Excel Input tool, you need the following information about the input data:
Name of the Workbook file and Worksheet.
The list of fields and their data types.
The number of lines (usually header rows) to skip, if any.
When you configure the Excel Input tool, you define a record schema that describes the field names and field types contained in the selected worksheet. When the workbook is read, an <<ERROR>>
value will be emitted for any cells that cannot be converted to the specified data type.
Excel Input tool configuration parameters
The Excel Input tool has a single set of configuration parameters in addition to the standard execution options:
Workbook | The file containing the Microsoft Excel workbook. |
Worksheet | The worksheet containing the records. |
Skip rows | If defined, skip leading lines in the file, such as field names contained in a first line. This is optional and defaults to 0. |
Field | Name that uniquely names the field among all records in the worksheet. |
Type | Type of the field. See Excel data types. |
Enable trigger input |
Configure the Excel Input tool
Select the Excel Input tool, then select the Configuration tab.
Specify the Workbook.
Select Refresh worksheets, and then select the desired Worksheet from the list.
If the worksheet contains header rows, select Skip rows and enter the number of rows to skip.
For each field in the worksheet:
Select the cell in the Field column, and type the desired name.
Select the cell in the Type column, and specify the field's data type.
Optionally, Enable trigger input.
Optionally, go to the Execution tab and configure reporting options or set Web service options.
Excel Output
The Excel Output tool writes records into a Microsoft Excel workbook. The tool only supports overwriting workbooks. You cannot append to an existing worksheet or add a new worksheet to an existing workbook.
On some Linux systems, you may encounter the following error when using the Excel Output tool:
Caused by: java.lang.NullPointerException
sun.awt.FontConfiguration.getVersion(FontConfiguration.java:1264)
sun.awt.FontConfiguration.readFontConfigFile(FontConfiguration.java:219)
sun.awt.FontConfiguration.init(FontConfiguration.java:107)
sun.awt.X11FontManager.createFontConfiguration(X11FontManager.java:774)
...
This is because:
Excel considers font geometry when calculating worksheet layouts
On Linux, the JRE requires Freetype and Fontconfig to use fonts
Font support is typically included only for Linux systems that install a desktop environment
Consult your Linux distribution’s documentation to install font support, but generally:
Ubuntu:
apt-get install libfreetype6 fontconfig fonts-dejavu
CentOS:
yum install freetype fontconfig dejavu-sans-fonts
Excel Output tool configuration parameters
The Excel Output tool has one set of configuration parameters in addition to the standard execution options:
Workbook | The Microsoft Excel file to write. |
Write headers | If selected, use the field names of the input record to write a header row. |
Enable trigger output |
Configure the Excel Output tool
Select the Excel Output tool, and then go to the Configuration tab on the Properties pane.
Specify the Workbook.
Optionally, select Write headers to use the field names of the input record to write a header row.
Optionally, Enable trigger input.
Optionally, go to the Execution tab and configure reporting options or set Web service options.