
Advanced topics

Refreshing connections

This feature is available starting with DataManagement version 9.2.1.

This feature is designed to assist SDK tools that cache information from a service or database. For example, a database-loading tool may query the target table's schema and cache the result to avoid the overhead of re-reading it every time. However, this can lead to issues when the target table changes. To handle this situation, override the refreshConnection() method of the Tool base class and clear any such cached information.

CODE
	@Override 
	public void refreshConnection() {
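		// Discard the cached table schema so it is re-queried the next time it is needed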
		tableSchema = null;
	}

Writing tools that target Hadoop

The HDFS temp folder

If you want to write temp files into HDFS (or another Hadoop filesystem), use the URL returned by System.getProperty("rpdm.hadoop.tmpdir").
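
For example, a tool might resolve a scratch file under that folder as follows. This is a minimal sketch: the file name and the default Configuration are illustrative, not part of the SDK, and the classes come from org.apache.hadoop.fs and org.apache.hadoop.conf.

CODE
	// URL of the HDFS temp folder published by RPDM
	String tmpDirUrl = System.getProperty("rpdm.hadoop.tmpdir");
	Path tempFile = new Path(tmpDirUrl, "mytool-scratch.tmp");

	// Write through the Hadoop FileSystem API; a default Configuration
	// is used here purely for illustration
	FileSystem fs = tempFile.getFileSystem(new Configuration());
	try (FSDataOutputStream out = fs.create(tempFile)) {
		out.writeBytes("temporary data");
	}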

Tools that appear in the Parallel Section input and output bars


Normally, SDK tools appear in the palette and can be dragged onto the canvas at design time. However, you can create a special type of SDK tool that appears in the Parallel Section input or output bars instead. This requires some special design consideration to make everything work correctly, and is not a simple endeavor.

Input tool

To make an SDK tool that will appear in the Parallel Section’s input bar and menu, your tool class must implement IsParallelSectionInput and its methods.

CODE
class MyInputTool extends GeneralTool implements IsParallelSectionInput {
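	// Implement the methods required by IsParallelSectionInput here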

}

Integrating with RPDM’s Hadoop security

If you are writing a tool that interfaces with Hadoop, you might want to take advantage of the fact that RPDM has already dealt with the security issues of authenticating with Hadoop, renewing tokens, delegating tokens to YARN tasks, and dealing with the differences between Hadoop vendors and versions. This integration doesn't happen automatically, however, because by default every SDK tool jar exists in its own classloader space, meaning that it cannot share the UserGroupInformation singleton managed by RPDM.

Specify the Hadoop “base module”

To integrate with RPDM’s Hadoop security, you should first create a resource file in your project named /META-INF/plugin_properties/plugin.properties. In this file, add a single line containing base_module=rpdmhadoop1.
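
The complete file consists of just that single line:

CODE
	base_module=rpdmhadoop1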


Doing so makes your SDK tool’s ClassLoader “inherit” from the ClassLoader used by RPDM’s Hadoop module, making things like the Hadoop Configuration and FileSystem available.

Make all Hadoop dependencies “provided” scope in pom.xml

If your SDK jar is built using Maven shading, marking the Hadoop dependencies as provided makes your tool pick up the Hadoop jars of the defined cluster, rather than whatever jars are referenced as dependencies.

Example:

CODE
	<dependency>
		<groupId>org.apache.hive</groupId>
		<artifactId>hive-jdbc</artifactId>
		<version>${hive-jdbc-version}</version>
		<scope>provided</scope>
	</dependency>