Run projects

Overview

You can run Data Management projects in several modes: in the client, from the command line, or on a schedule.

Run projects in the client

To run an open project in the client, select the green arrow on the status bar.

You can also run a project by choosing Run from the Project menu, by pressing Ctrl+R, or from the Run Control.

To cancel a running project in the client, select the Cancel button on the status bar.

View embedded projects and macros

Projects and macros can be embedded within other projects. Use the View command on the project shortcut menu to monitor these sub-projects as they run.

To view embedded projects and macros as they run:

Open the project containing the embedded project or macro.
Right-click the embedded project or macro, and then select View Macro.

A new window displays the embedded project or macro. Note that the embedded project or macro view is read-only.

Data Viewer tools that are embedded in projects or macros are disabled by default. You can enable hidden Data Viewers in project settings.

Command-line processing

While Data Management's graphical user interface is an intuitive way to create data processing projects, there are times when you may want to run Data Management from the command line. For example, suppose you have a three-step process that you want to run every night:

Use the RDBMS Input tool to extract data from a Microsoft SQL Server database.
Use Data Management to process the records and perform duplicate removal.
Perform address standardization on the results using a third-party postal application.

Assuming that all of the applications can be run from the command line, these steps can be written as a simple batch file. You can use Data Management's command-line processing utility rpdm_cmd.exe to integrate your Data Management projects into batch processes and schedule them for automatic execution.
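As a rough illustration of chaining such steps, the sketch below runs a list of commands in order and stops at the first failure. It assumes that each tool, including rpdm_cmd, reports failure through a non-zero exit code, which this page does not confirm; the function name `run_steps` is hypothetical.

```python
import subprocess


def run_steps(steps):
    """Run each command in order and stop at the first non-zero exit code.

    `steps` is a list of argument lists, one per command.
    Returns the number of steps that completed successfully.
    Assumption: every tool signals failure with a non-zero exit code.
    """
    completed = 0
    for argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            break  # do not run later steps on partial results
        completed += 1
    return completed
```

Each entry in `steps` would be one full command, for example an rpdm_cmd invocation followed by the postal application's own command line.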

Run projects from the command line

You can run the Data Management command-line utility using either a project stored on a local computer or a project obtained from the repository. Projects that run non-interactively can be monitored in the Management dashboard.

Syntax

rpdm_cmd connect project login options adv_options test_options

Arguments

connect

Optional. Identifies the execution server. If missing, rpdm_cmd will attempt to connect to "localhost" using a default port of 20421. If present, must be one of:

`-server` address	Connect to address (IP address or computer name) using default port of `20421`.
`-server` address:port	Connect to address (IP address or computer name) using port.
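The two forms of the connect argument can be produced by a small helper. This is a minimal sketch; `connect_args` is a hypothetical name, and the host names used with it are placeholders.

```python
DEFAULT_PORT = 20421  # default port stated in the connect description


def connect_args(address=None, port=None):
    """Return the connect portion of an rpdm_cmd argument list.

    With no address, nothing is emitted and rpdm_cmd falls back to
    localhost on the default port.
    """
    if address is None:
        if port is not None:
            raise ValueError("a port can only be given together with an address")
        return []
    if port is None:
        return ["-server", address]
    return ["-server", f"{address}:{port}"]
```

For example, `connect_args("etl01", 20500)` yields `["-server", "etl01:20500"]`, where `etl01` is an invented server name.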

project

Required. The project or automation to run.

-project path

The URI path identifying a project or automation.

Path examples:

file:///path/.../file	Project on execution server file system.
local:///path/.../file	Project on client computer file system.
file://host/path/.../file	Project on network share.
`repository:///path/.../file`	Project in site repository.

If any part of the project name contains a space, surround the project name with single or double quotes.
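When a command line is generated as text (for a batch file, say), the quoting rule can be applied mechanically. The sketch below uses double quotes; `project_arg_text` is a hypothetical helper, and it deliberately rejects paths that already contain a double quote rather than guessing how they should be escaped.

```python
def project_arg_text(uri):
    """Return '-project <uri>' as command-line text, quoting when needed."""
    if '"' in uri:
        raise ValueError("double quotes in the project path are not handled here")
    # Quote only when the path contains a space, per the rule above.
    quoted = f'"{uri}"' if " " in uri else uri
    return f"-project {quoted}"
```

For instance, a path such as `repository:///Nightly/Dedupe Customers` (an invented name) would be emitted as `-project "repository:///Nightly/Dedupe Customers"`.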

For load-balancing, you must use the -project option.

login

Required in Advanced Security mode, otherwise optional. If missing, rpdm_cmd will use an internal login to run with Administrator rights. If present, it specifies the login and password. It must be of the form:

`-login` login	Login.
`-password` password	Password.
`-keyfile` keyfile	The Keyfile to use instead of password.

options

Optional. Can contain any of a list of optional parameters that control the execution of rpdm_cmd:

`-D`parameter value	Set parameter to value.
`-connect` `ssl`\|`plain`\|`auto`	Set connection type. Defaults to `auto`.

adv_options

Optional. Can contain any of a list of optional parameters that control the execution of rpdm_cmd:

`-logToFile` file name	Log execution to specified local file name.
`-resume`	Resume an automation from its saved state.
`-parameters` file name	The path and name of a delimited file containing variable values for the project (or automation). This file must be located on the execution server that will run the project. Each line should be of the format parameter_name,value. To pass a parameter to projects embedded within an automation, go to the Variables tab of the embedded project's Properties and map the automation variable to the project variable.
`-dynconfig` file name	The path and name of an XML file containing configuration directives that dynamically modify the project file settings. Any tool-configuration parameter specified in the project file can be changed at run-time. Examples: changing the input or output file names of a project; altering the selection of fields extracted to a file; specifying sort keys.
`-x32`	Run 32-bit (needed for some RDBMS drivers). Not available on Linux.
`-timestamp` label	Set repository view to this version label.
`-timestamp` timestamp	Set repository view to this timestamp; timestamp format is `MM/DD/YYYY-HH:MM:SS`, 24-hour time.
`-detach`	Run project in the background; return immediately.
`-balance`	Load balance on the cluster.
`-priority` number	Relative priority, when run on a cluster.
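Two of the options above take values with a fixed format: the `-parameters` file (one `parameter_name,value` pair per line) and the `-timestamp` value (`MM/DD/YYYY-HH:MM:SS`). The sketch below generates both. The function names are hypothetical, and the UTF-8 encoding and the refusal of commas in parameter names are assumptions, since the expected encoding and escaping rules are not described here.

```python
from datetime import datetime


def write_parameters_file(path, values):
    """Write one 'parameter_name,value' line per entry of `values`."""
    # Assumption: UTF-8 text with plain newline endings is acceptable.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        for name, value in values.items():
            line = f"{name},{value}"
            if "," in str(name) or "\n" in line:
                raise ValueError(f"unsupported character in parameter {name!r}")
            handle.write(line + "\n")


def format_timestamp(moment):
    """Format a datetime as MM/DD/YYYY-HH:MM:SS in 24-hour time."""
    return moment.strftime("%m/%d/%Y-%H:%M:%S")
```

Remember that the parameters file must be located on the execution server that runs the project, so a file written on a client computer still has to be copied there.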

test_options

Optional. Can contain any of a list of optional parameters that control the execution of rpdm_cmd:

`-traceLevel` level	Set level greater than `0` to get client-server trace log. A number between `1` and `5` determines the verbosity of the server trace logs, where `1` is "minimal", `3` is "chatty" and `5` is "debug only."
`-traceFile` file name	Set trace to specified file name instead of console.
`-repeat` count	Run the project count number of times.
`-nodetach`	Do not detach from project when finished.
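To show how the argument groups fit together, the sketch below assembles a full argument list in the order given by the Syntax line. Whether rpdm_cmd actually requires this order is not stated here, so following it is simply the cautious choice. `build_command` is a hypothetical helper, and every value passed to it in practice (server, login, paths) would be site-specific.

```python
def build_command(project, connect=(), login=(), options=(),
                  adv_options=(), test_options=()):
    """Assemble an rpdm_cmd argument list in the documented syntax order.

    Each group is a sequence of already-split arguments, for example
    connect=["-server", "host"] or adv_options=["-logToFile", "run.log"].
    """
    if not project:
        raise ValueError("the project argument is required")
    return [
        "rpdm_cmd",
        *connect,
        "-project", project,
        *login,
        *options,
        *adv_options,
        *test_options,
    ]
```

A list built this way can be passed directly to Python's `subprocess.run`, in which case arguments containing spaces need no additional quoting because no shell parsing is involved.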

Keyfiles

A keyfile is an encrypted alternative to a login and password. To make decryption difficult, the keyfile is encrypted using both a Data Management private key and an OS-specific computer-locked cryptography mechanism.

When running Data Management in Advanced Security mode, you can use keyfiles to authenticate users to the command-line processing utility rpdm_cmd.exe without entering a password. This avoids the insecure use of plaintext passwords.

To create a keyfile, run Data Management's command-line processing utility rpdm_shell.exe, and then type the command GetKeyFile -login login -password password -filepath filepath, where:

`login`	Login.
`password`	Password associated with the login.
`filepath`	Path to the location where the keyfile should be stored.

Once the keyfile is created, it may be used with rpdm_cmd to authenticate logins without using a password. Examples:
rpdm_cmd -login login -keyfile filepath -project project_file_path
rpdm_cmd -login login -keyfile filepath

Although the keyfile is encrypted, anyone able to read the file could impersonate the associated login to Data Management. Keyfiles should be readable only by those with legitimate access to the login.
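On a Linux execution server, one way to limit who can read a keyfile is to restrict its permission bits to the owning account. The sketch below uses only standard POSIX permissions and is not a Data Management feature; on Windows, file access is governed by ACLs, which this code does not manage. Both function names are hypothetical.

```python
import os
import stat


def restrict_to_owner(path):
    """Set POSIX permissions so only the file owner can read and write."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # equivalent to mode 0o600


def is_owner_only(path):
    """Return True when group and others have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A scheduled job could call `is_owner_only` before using a keyfile and refuse to run if the check fails.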

Use of rpdm_shell is not officially supported and functionality may change between releases.

Scheduled execution

You may want to schedule projects for execution on the server for several reasons:

You want access to central resources or performance of the server computer.
You want the project to run regularly on an unattended basis.

Schedule project execution

To schedule a project for automatic execution:

1. Save the project to the repository.
2. In the repository, right-click the project, and then select Schedule Project.
3. On the Properties pane, select the Schedule tab and configure the desired execution schedule:
- Check Enabled.
- Specify the Time and date of execution. For recurring execution, the date and time dictate the first execution of the project.
- Specify the Frequency, or execution interval:

Frequency	Description
Once	The project is run exactly once at the date and time indicated. This is also useful to run a project immediately. The schedule object will be deleted from the repository as soon as the project is run.
Daily	The project is run daily at the time indicated.
Hourly	The project is run at one-hour increments from the time indicated.
Weekly	The project is run on the day-of-week and at the time indicated.
Monthly	The project is run on the day-of-month and at the time indicated. If the day-of-month exceeds the number of days in a month (for example, February does not have a 30th), the last day of the month is used.

4. Optionally, select Run on and specify where you want the project to run: a Specific machine or a Cluster.
5. If you selected Specific machine in step 4, select Machine and select the desired server.
6. Select Logon and specify the User and Password (which may be a Password or Key Vault reference) under which the project will run.
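The Monthly rule in the frequency table (fall back to the last day when the month is too short) can be checked with a few lines of code. This sketch only models the rule as described; it is not the scheduler's actual implementation, and `monthly_run_day` is a hypothetical name.

```python
import calendar


def monthly_run_day(year, month, day_of_month):
    """Return the day a Monthly schedule would run in the given month.

    If the requested day does not exist in that month, the last day
    of the month is used instead.
    """
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return min(day_of_month, last_day)
```

For a schedule set to the 30th, this gives the 28th in February 2023 and the 29th in February 2024.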

To edit the schedule later, open the Schedules branch of the repository tree.

To turn off a scheduled project, right-click the project in the repository tree and then select Disable.