Configure RDBMS connection options
The RDBMS Input and Output tools both have Options tabs.
Not all options are available in both tools, and some options apply only to certain data connection types. In some cases you may need to select Override to view all options. The tables below lists the options and their meanings in each tool.
Option | Meaning |
---|
Use TRUNCATE TABLE to clear tables | If selected, clears tables using TRUNCATE TABLE instead of DELETE . In the RDBMS Input tool, this is only meaningful for temporary tables used in dynamic key query. In the RDBMS Output tool, this is used for the selected table as well as temporary tables used in update and delete. |
SQL quote | When a table or column name is a reserved word in SQL, or contains "special" characters, this delimiter character will be added when creating SQL scripts. Defaults to double-quote. |
SQL separator | When creating an SQL script in which both table and column are used, (or database and table), this character will be inserted between them. Defaults to period. |
Use scripting for create/drop operations | When automatically creating tables during execution (because they don't yet exist), use SQL scripting instead of the connector's programming interface. This is sometimes necessary because of flaws in the available database drivers. |
Quote mixed-case | Accommodates different database standards for handling names. Options are Default, Yes, No. |
Quote in API | Accommodates different database standards for handling table/column names. Options are Default, Yes, No. |
Option | Meaning |
---|
Read type (RDBMS Input) Insert type (RDBMS Output) | The optimal table loading interface may vary between connectors and databases.
For RDBMS Input tools, these settings apply only when temporary tables are being used for dynamic query.
Single record executes a single INSERT statement for every record. Block creates a parameterized INSERT statement, and executes that statement once for each block of records using an array binding. SQL Server fast load/SQL Bulk Operations is a special ODBC load/insertion method available for SQL Server, MySQL, Sybase, DB2 and PostgreSQL (with ODBC driver 0906 or later.) It is not compatible with the Use transaction block option. Note that ODBC providers have been deprecated in Data Management and ODBC drivers may not be supported in future versions. irowsetfastload is a special interface for SQL Server only. irowsetchange uses the OLDEB IRowChange interface, for Access only. Oracle Sql*Loader spawns the Sql*Loader program to perform inserts. The Oracle sqlldr.exe tool must be installed on the Execution Server. Typically this is available from an "Administrative" or "Custom" install of the Oracle client software, or from an Oracle server install. If available, this is typically the highest-performance loading option for Oracle. Oracle Direct Path uses the direct path loader interface for maximum load speed, especially when multiple RDBMS Output tools are used in parallel to load the table. Greenplum gpfdist uses Greenplum's parallel file distribution program to serve external table files to all Greenplum Database segments in parallel. The benefit of using gpfdist is that you are guaranteed maximum parallelism while reading from or writing to external tables, thereby offering the best performance as well as easier administration of external tables. Azure JDBC Loader writes records to one or more temporary data files that are uploaded to the configured Microsoft Azure Storage Area. Once all the bulk data files are uploaded (with or without optional compression), the data is inserted into the destination table as configured in the RDBMS Output tool. Temporary resources are cleaned up following successful completion of the data load, unless the Do not cleanup option is enabled. This is the preferred insert type for Azure Data Warehouse. Postgres JDBC Loader uses the psql utility's COPY DATA FROM STDIN command to send a character-stream version of the data to the RDBMS Output tool. This eliminates SQL overhead and results in faster table load times, but it is limited to a single operation. You cannot configure temp table delete and update operations within a single RDBMS Output tool that uses the bulk loader. Nor can you configure one RDBMS Output tool to perform temp table update using the bulk loader, and configure another RDBMS Output tool in the same project to use the bulk loader for temp table delete. In situations where you must use temp table update and delete within a single project, use the PostgreSQL bulk loader for the input source likely to have the higher volume of records, and use a conventional JDBC data connection for the other operation. Redshift JDBC Loader writes records in round-robin fashion to an array of temp files, then adds the files to a manifest which is used in the Redshift COPY operation to populate the specified database table. Snowflake JDBC Loader writes records to one or more temporary data files that are uploaded to the configured stage. Once all the bulk data files are uploaded, the data is inserted into the destination table as configured in the RDBMS Output tool. Temporary resources are cleaned up following successful completion of the data load, unless the Do not cleanup option is enabled. Teradata Fastload provides a method for quickly loading large amounts of data into an empty destination table in a Teradata database. JDBC FastLoad is recommended only for loading large amounts of data (at least 100,000 rows total). FastLoad requires that StatementInfo parcel support be enabled, and only supports batch inserts. Not all of the JDBC data types supported by the Teradata JDBC driver are supported by FastLoad; for example, BLOB and CLOB are not supported. Similarly, not all of the JDBC data type conversions supported by the Teradata JDBC Driver are supported by FastLoad. JDBC SQL Server Bulk Copy is an efficient bulk loader for inserting large files into tables or views in SQL Server databases. It requires the Microsoft JDBC driver for SQL Server version 7.4.1 or higher to work properly. In general, this option is not recommended; it is only suitable for MSSQL and Azure SQL.
|
Block size | The number of rows to retrieve in each Fetch operation, or the number of rows to INSERT with each transaction. Increasing Block Size may improve performance, but will increase the tool's memory requirements. With JDBC data connections, specifying a large block size for a table whose row size is large may require you to increase your JVM memory setting. |
JDBC block size | Improves internal efficiency of JDBC inserts and reads. Should be smaller than Block size. Increasing JDBC block size may improve performance, but will increase the tool's memory requirements. The incremental performance improvement diminishes sharply at larger values. The default setting of 10 provides good efficiency. |
Transaction per block | Surrounds a block INSERT statement with BEGIN TRANSACTION/COMMIT . This may speed up database inserts considerably, as well as updates and deletes when temporary files are used. This setting is ignored for table reads and non-dynamic queries. |
Record binding | Indicates whether the array of data exchanged between the database and Data Management is ordered by rows or by columns. This is determined by the database; each one has a preferred binding order. Using the predefined database provider templates is the best way to ensure that the correct insertion methods and binding orders are chosen. JDBC data connections ignore this setting. |
Max value size | Configure this option for ODBC connections, when reading unbounded field types such as BLOB , CLOB , TEXT , or varchar(max) , or if your database always shows large fields on query. If you don't want to enable Stream objects for performance reasons and you know that the largest possible data value is under some reasonable size, set this option to the upper bound of the expected data size. ODBC providers have been deprecated in Data Management, and the ODBC drivers may not be supported in future versions. |
Use threading | Allows some database processing to be done in the background, which can reduce the elapsed time for large insert operations. This option is ignored for chunk update/delete operations. |
Use transaction block | Surrounds a block INSERT statement with BEGIN TRANSACTION/COMMIT . This may speed up database inserts considerably. |
Option | Meaning |
---|
Treat Unicode as Latin-1 | RDBMS Input: Converts Unicode (for example, nvarchar ) text stored in database to type TextVar , and replaces non-representable characters with a question mark character (? ). RDBMS Output: When a table is created or a table creation script is generated, columns are created with type varchar instead of nvarchar . When the table is loaded, any varchar columns that are loaded with Data Management Unicode fields will have non-convertible characters replaced with a question mark character (? ). |
Treat empty text as null when loading | If selected, empty text strings will insert as NULL , otherwise they will insert as empty strings. Some databases don't allow the insertion of empty strings that are not NULL . |
Stream objects | Some database objects (such as BLOB s and CLOB s) can be so large that bulk load interfaces fail. If this option is selected, records containing objects larger than the size specified in Larger than are streamed rather than bulk-loaded. |
Larger than | Threshold for streaming large database objects. |
Option | Meaning |
---|
Connection string | Writes the connection string (command used to connect to a database) to the log. Useful for troubleshooting connection problems. |