Run projects
Overview
You can run Data Management projects in several ways: interactively in the client, from the command line, or on a schedule.
Run projects in the client
To run an open project in the client, select the green arrow on the status bar.
You can also run a project by choosing Run from the Project menu, by pressing Ctrl+R, or from the Run Control.
To cancel a running project in the client, select the Cancel button on the status bar.
View embedded projects and macros
Projects and macros can be embedded within other projects. Use the View command on the project shortcut menu to monitor these sub-projects as they run.
To view embedded projects and macros as they run:
Open the project containing the embedded project or macro.
Right-click the embedded project or macro, and then select View Macro.
A new window displays the embedded project or macro. Note that the embedded project or macro view is read-only.
Data Viewer tools that are embedded in projects or macros are disabled by default. You can enable hidden Data Viewers in project settings.
Command-line processing
While Data Management's graphical user interface is an intuitive way to create data processing projects, there are times when you may want to run Data Management from the command line. For example, suppose you have a three-step process that you want to run every night:
Use the RDBMS Input tool to extract data from a Microsoft SQL Server database.
Use Data Management to process the records and perform duplicate removal.
Perform address standardization on the results using a third-party postal application.
Assuming that all of the applications can be run from the command line, you can write these steps as a simple batch file. You can use Data Management's command-line processing utility, rpdm_cmd.exe, to integrate your Data Management projects into batch processes and schedule them for automatic execution.
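As a rough illustration of the nightly sequence described above, the following Python sketch runs a list of commands in order and stops at the first failure. It is a sketch under stated assumptions, not a supported integration: the project path and the `postal_standardize` command are hypothetical placeholders, and the idea that each tool reports failure through a nonzero exit code is an assumption. Only the `rpdm_cmd` name and its `-project` option come from this page.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical nightly steps. The project URI and the third-party postal
# command are placeholders; substitute the real values for your site.
NIGHTLY_STEPS = [
    ["rpdm_cmd", "-project", "file:///projects/nightly_dedupe"],
    ["postal_standardize", "dedupe_output.csv"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_steps(
    steps: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = _run,
) -> int:
    """Run each command in order and return how many steps completed.

    Assumption: a nonzero exit code means the step failed, so the
    remaining steps are skipped.
    """
    completed = 0
    for cmd in steps:
        if runner(cmd) != 0:
            break
        completed += 1
    return completed
```

Passing the runner as a parameter keeps the sequencing logic separate from process execution, so the order and stop-on-failure behavior can be checked without the real tools installed.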
Run projects from the command line
You can run the Data Management command-line utility using either a project stored on a local computer or a project obtained from the repository. You can monitor projects that run non-interactively in the Management dashboard.
Syntax
rpdm_cmd connect project login options adv_options test_options
Arguments
connect
Optional. Identifies the execution server. If missing, rpdm_cmd will attempt to connect to "localhost" using a default port of 20421. If present, must be one of:
| Connect to address (IP address or computer name) using default port of |
| Connect to address (IP address or computer name) using port. |
project
Required. The project or automation to run.
| The URI path identifying a project or automation. |
Path examples:
file:///path/.../file | Project on execution server file system. |
local:///path/.../file | Project on client computer file system. |
file://host/path/.../file | Project on network share. |
| Project in site repository. |
If any part of the project name contains a space, surround the project name with single or double quotes.
For load-balancing, you must use the -project option.
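The quoting rule for project names with spaces matters mostly when the command is assembled as a single string. The sketch below, which assumes a POSIX shell (quoting rules in the Windows command prompt differ), builds the arguments as a list and lets Python's `shlex` module add quotes only where needed. The project path is a hypothetical example; `-project` is the option named on this page.

```python
import shlex
from typing import List


def project_command(project_uri: str) -> List[str]:
    """Return an argument list that runs one project with rpdm_cmd."""
    return ["rpdm_cmd", "-project", project_uri]


def as_shell_line(args: List[str]) -> str:
    """Render the argument list as one POSIX shell line, quoting as needed."""
    return shlex.join(args)


# Hypothetical project path containing a space.
args = project_command("file:///data/My Projects/dedupe")
print(as_shell_line(args))
# prints: rpdm_cmd -project 'file:///data/My Projects/dedupe'
```

When the command is launched with `subprocess.run(args)` instead of through a shell, the list form can be passed directly and no manual quoting is required.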
login
Required in Advanced Security mode; otherwise optional. If missing, rpdm_cmd will use an internal login to run with Administrator rights. If present, it specifies the login and password. It must be of the form:
| Login. |
| Password. |
| The Keyfile to use instead of password. |
options
Optional. Can contain any of a list of optional parameters that control the execution of rpdm_cmd:
| Set parameter to value. |
| Set connection type. Defaults to |
adv_options
Optional. Can contain any of a list of optional parameters that control the execution of rpdm_cmd:
| Log execution to specified local file name. |
| Resume an automation from its saved state. |
| The path and name of a delimited file containing variable values for the project (or automation). This file must be located on the execution server that will run the project. Each line should be of the format parameter_name,value. To pass a parameter to projects embedded within an automation, go to the Variables tab of the embedded project's Properties and map the automation variable to the project variable. |
| The path and name of an XML file containing configuration directives that dynamically modify the project file settings. Any tool-configuration parameter specified in the project file can be changed at run-time. Examples: changing the input or output file names of a project; altering the selection of fields extracted to a file; specifying sort keys. See programming_Data Management.doc for detailed information about the project file format and specifying dynamic configurations. |
| Run 32-bit (needed for some RDBMS drivers). Not available on Linux. |
| Set repository view to this version label. |
| Set repository view to this timestamp; |
| Run project in the background; return immediately. |
| Load balance on the cluster. |
| Relative priority, when run on a cluster. |
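The variable-values file described in the table above uses one `parameter_name,value` pair per line. A minimal sketch of generating such a file follows. The function name, the parameter names, and the UTF-8 encoding are assumptions for illustration, and because this page does not say how commas or line breaks inside a value would be escaped, the sketch simply rejects them.

```python
from pathlib import Path
from typing import Mapping


def write_variables_file(path: Path, variables: Mapping[str, str]) -> None:
    """Write one parameter_name,value pair per line.

    Assumption: names and values contain no commas or line breaks, since
    the escaping rules for such characters are not documented here.
    """
    lines = []
    for name, value in variables.items():
        for text in (name, value):
            if "," in text or "\n" in text or "\r" in text:
                raise ValueError(f"unsupported character in {text!r}")
        lines.append(f"{name},{value}")
    # UTF-8 is an assumed encoding; adjust to what your server expects.
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Remember that the resulting file must be located on the execution server that runs the project, not only on the computer that generates it.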
test_options
Optional. Can contain any of a list of optional parameters that control the execution of rpdm_cmd:
| Set level greater than |
| Set trace to specified file name instead of console. |
| Run the project count number of times. |
| Do not detach from project when finished. |
Keyfiles
A keyfile is an encrypted alternative to a login and password. To make decryption difficult, the keyfile is encrypted using both a Data Management private key and an OS-specific computer-locked cryptography mechanism.
When running Data Management in Advanced Security mode, you can use keyfiles to authenticate users to the command-line processing utility rpdm_cmd.exe without entering a password. This avoids the insecure use of plaintext passwords.
To create a keyfile, run Data Management's command-line processing utility rpdm_shell.exe, and then type the command:
GetKeyFile -login login -password password -filepath filepath
where:
| Login. |
| Password associated with the login. |
| Path to the location where the keyfile should be stored. |
Once the keyfile is created, it may be used with rpdm_cmd to authenticate logins without using a password. Examples:
rpdm_cmd -login login -keyfile filepath -project project_file_path
rpdm_cmd -login login -keyfile filepath
Although the keyfile is encrypted, anyone able to read the file could impersonate the associated login to Data Management. Keyfiles should be readable only by those with legitimate access to the login.
Use of rpdm_shell is not officially supported and functionality may change between releases.
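Because a readable keyfile is enough to impersonate its login, a wrapper script might check the file's permissions before using it. The sketch below is one possible approach under stated assumptions: it inspects POSIX permission bits only (on Windows, access control lists govern access and this check is not meaningful), and the function names, login, and paths are hypothetical. The `-login`, `-keyfile`, and `-project` options are the ones shown in the examples above.

```python
import os
import stat
from typing import List


def keyfile_is_private(keyfile_path: str) -> bool:
    """Return True if no group or other permission bits are set.

    POSIX-only check; Windows access control lists are not inspected.
    """
    mode = os.stat(keyfile_path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def keyfile_command(login: str, keyfile_path: str, project_uri: str) -> List[str]:
    """Build the rpdm_cmd argument list for keyfile authentication."""
    if not keyfile_is_private(keyfile_path):
        raise PermissionError(f"{keyfile_path} is accessible to group or others")
    return [
        "rpdm_cmd",
        "-login", login,
        "-keyfile", keyfile_path,
        "-project", project_uri,
    ]
```

Refusing to build the command at all, rather than only logging a warning, keeps an overly permissive keyfile from going unnoticed in unattended runs.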
Scheduled execution
You may want to schedule projects for execution on the server for several reasons:
You want access to central resources or performance of the server computer.
You want the project to run regularly on an unattended basis.
Schedule project execution
To schedule a project for automatic execution:
Save the project to the repository.
In the repository, right-click the project, and then select Schedule Project.
On the Properties pane, select the Schedule tab and configure the desired execution schedule:
Check Enabled.
Specify the Time and date of execution. For recurring execution, the date and time dictate the first execution of the project.
Specify the Frequency, or execution interval:
Frequency | Description |
---|---|
Once | The project is run exactly once at the date and time indicated. This is also useful to run a project immediately. The schedule object will be deleted from the repository as soon as the project is run. |
Daily | The project is run daily at the time indicated. |
Hourly | The project is run at one-hour increments from the time indicated. |
Weekly | The project is run on the day-of-week and at the time indicated. |
Monthly | The project is run on the day-of-month and at the time indicated. If the day-of-month exceeds the number of days in a month (for example, February does not have a 30th), the last day of the month is used. |
Optionally, select Run on and specify where you want the project to run: a Specific machine or a Cluster.
If you selected Specific machine in step 4, select Machine and select the desired server.
Select Logon and specify the User and Password (which may be a Password or Key Vault reference) under which the project will run.
To edit the schedule later, open the Schedules branch of the repository tree.
To turn off a scheduled project, right-click the project in the repository tree and then select Disable.
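The Monthly rule in the frequency table above, where a day-of-month that does not exist falls back to the last day of the month, can be expressed as a small calculation. The Python sketch below only illustrates the documented behavior; it is not the scheduler's actual code, and the function name is hypothetical.

```python
import calendar
from datetime import date


def monthly_run_date(year: int, month: int, day_of_month: int) -> date:
    """Return the run date for a month, clamped to the month's last day."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day_of_month, last_day))


print(monthly_run_date(2023, 2, 30))  # 2023-02-28
print(monthly_run_date(2024, 2, 30))  # 2024-02-29 (leap year)
print(monthly_run_date(2024, 4, 31))  # 2024-04-30
```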