
Record order and performance

Overview

Data Management can optimize performance based on what it knows about the sort orders of records as they flow through the tools and connections. Tools that operate on sorted records (the Sort, Join, Unique, Rank, and Summarize tools) may skip some or all of the sorting if the records are already ordered, or partially ordered. Data Management's sort optimization rules are:

  • If the required sort order matches or is a subset of the input record order, skip the sort entirely.

  • If the required sort order is a further ordering of partially-ordered data, perform an optimized sort.
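The two rules above can be sketched as a small decision function. This is only an illustration, not Data Management's actual implementation: the function name plan_sort, the representation of a record order as a list of (field, direction) pairs, and the reading of "subset" as "leading keys of the input order" are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: a record order is a list of
# (field name, "asc" or "desc") pairs, most significant key first.
SortKey = Tuple[str, str]


def plan_sort(input_order: List[SortKey], required: List[SortKey]) -> str:
    """Return "skip", "optimized", or "full" for a required sort order."""
    # Count how many leading keys the input order and required order share.
    common = 0
    for have, want in zip(input_order, required):
        if have != want:
            break
        common += 1

    if common == len(required):
        # Every required key is already satisfied by the input order.
        return "skip"
    if common > 0:
        # Records are already grouped by the shared leading keys, so only
        # the remaining keys need sorting within each group.
        return "optimized"
    # Nothing about the input order helps; sort from scratch.
    return "full"


zip_company = [("ZIP", "asc"), ("COMPANY", "desc")]
print(plan_sort(zip_company, [("ZIP", "asc")]))    # required is a leading subset
print(plan_sort([("ZIP", "asc")], zip_company))    # required extends the input order
print(plan_sort([], zip_company))                  # input order unknown
```

Under these assumptions the three calls correspond to skipping the sort, performing an optimized sort, and performing a full sort.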

Information about how records are ordered at various points in the processing diagram is passed between tools as meta-data. Each tool describes its current record order to the tools downstream. For example, the Sort tool "tells" the downstream tools that it has ordered the records according to the tool's sort-key configuration.

Viewing record order

You can examine Data Management's record order meta-data at any point in the processing diagram in the Schema viewer.

  • To display a minimized Schema viewer, select Schema on the View menu.

When you select a connector, the Schema pane shows the fields and record orders of the data passing through the connector. For example, the output connection of a Sort tool that has sorted data by ZIP (ascending) and COMPANY (descending) shows a schema whose record order is ZIP (ascending), COMPANY (descending).

If you add a Number Records tool after the Sort to produce a RECORD_ID field and select its output connection, the Schema pane shows the added RECORD_ID field along with the record orders of the numbered data.

Note that there are now two record orders that are simultaneously valid, because the record numbering did not perturb the original sort order.
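A short sketch can show why two record orders hold at once. It does not use Data Management itself: the sample records, the helper names, and the assumption that the second record order is ascending RECORD_ID are all made up for illustration. Plain Python lists stand in for the Sort and Number Records tools.

```python
# Sample records (invented for this example).
records = [
    {"ZIP": "02134", "COMPANY": "Acme"},
    {"ZIP": "01002", "COMPANY": "Birch"},
    {"ZIP": "02134", "COMPANY": "Zenith"},
    {"ZIP": "01002", "COMPANY": "Alder"},
]

# Stand-in for the Sort tool: ZIP ascending, then COMPANY descending.
# Python's sort is stable, so sort by the minor key first.
by_company_desc = sorted(records, key=lambda r: r["COMPANY"], reverse=True)
sorted_records = sorted(by_company_desc, key=lambda r: r["ZIP"])

# Stand-in for the Number Records tool: add RECORD_ID without reordering.
numbered = [dict(r, RECORD_ID=i) for i, r in enumerate(sorted_records, start=1)]


def ordered_by_zip_company(rows):
    """True if rows are in ZIP ascending, COMPANY descending order."""
    return all(
        a["ZIP"] < b["ZIP"]
        or (a["ZIP"] == b["ZIP"] and a["COMPANY"] >= b["COMPANY"])
        for a, b in zip(rows, rows[1:])
    )


def ordered_by_record_id(rows):
    """True if rows are in RECORD_ID ascending order."""
    return all(a["RECORD_ID"] < b["RECORD_ID"] for a, b in zip(rows, rows[1:]))


# Both orders describe the same sequence of records.
print(ordered_by_zip_company(numbered), ordered_by_record_id(numbered))
```

Because numbering only appends a field and leaves the sequence untouched, a downstream tool that needs either order could, in principle, skip its sort.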

Selecting a tool while viewing the Schema pane will display similar information about the tool's input and output connections.
