Avoid split-merge bottlenecks

In Data Management, a common strategy is to split the record flow using the Filter tool, perform different processing steps on the split record streams, and then merge them back together using the Merge tool. By default, Merge uses a "greedy" merge technique, which is very fast but disturbs the sort order of the data. If you merge using the "greedy" setting and later steps depend on record order, you will need to re-sort the records after the merge.

image-20240326-233244.png
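
To illustrate why an order-agnostic merge forces a re-sort, here is a minimal Python sketch of the split-process-merge pattern. It is not Data Management code: the record layout, the `AMOUNT` threshold, and the use of plain list concatenation as a stand-in for a "greedy" merge are all assumptions made for illustration. The actual greedy merge may interleave records differently.

```python
# Hypothetical records, already sorted by ID.
amounts = [50, 200, 75, 300, 10]
records = [{"ID": i, "AMOUNT": a} for i, a in enumerate(amounts, start=1)]

# "Filter": split the flow into two streams.
high = [r for r in records if r["AMOUNT"] >= 100]
low = [r for r in records if r["AMOUNT"] < 100]

# Stand-in for an order-agnostic merge: emit one stream, then the other.
# (Assumption: the real greedy merge takes records as they become available.)
merged = high + low
merged_ids = [r["ID"] for r in merged]

# The original ID order is lost, so a final sort is needed to restore it.
resorted = sorted(merged, key=lambda r: r["ID"])
resorted_ids = [r["ID"] for r in resorted]
```

In this sketch `merged_ids` comes out as `[2, 4, 1, 3, 5]`, and only the extra sort brings it back to `[1, 2, 3, 4, 5]`.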

The Merge tool has settings that are better suited to this kind of processing. If your records are already sorted on some field (ID in this example), you can set the Merge type to Sorted.

image-20240326-233305.png

Then select the field the records are sorted on (ID) and the sort order.

image-20240326-233326.png

With a sorted merge, you can eliminate the final sort tool.

image-20240326-233353.png
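
Conceptually, a sorted merge behaves like the merge step of a merge sort: because each input stream is already ordered on the key, the output can be produced in order without a separate sort. The Python sketch below shows the idea with the standard library's `heapq.merge`. The `ID` and `TIER` fields are hypothetical, and this illustrates the general technique rather than how the Merge tool is implemented.

```python
import heapq

# Two streams produced by a filter; each is still sorted by ID.
high = [{"ID": 2, "TIER": "high"}, {"ID": 4, "TIER": "high"}]
low = [{"ID": 1, "TIER": "low"}, {"ID": 3, "TIER": "low"}, {"ID": 5, "TIER": "low"}]

# heapq.merge lazily interleaves already-sorted inputs by the given key,
# so the combined stream is in ID order with no final sort.
merged = list(heapq.merge(high, low, key=lambda r: r["ID"]))
merged_ids = [r["ID"] for r in merged]
```

This only works if every input is sorted on the merge key in the same direction. If a processing step between the filter and the merge reorders a stream, the output order is no longer guaranteed.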

Sometimes your records aren't sorted by any field, but you still want to preserve the original record order. In this case, the Filter and Merge tools can track the record order using a sequence field.

To preserve record order using a sequence field:

  1. Configure the Filter tool's Sequence tab to generate sequence values and append them to the records.

image-20240326-233420.png

  2. Next, configure the Merge tool for a sequence merge.

image-20240326-233440.png

  3. Optionally, insert a Select tool before the output to remove the SEQUENCE field.

image-20240326-233502.png
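
The three steps above can be sketched in Python as follows. This is an illustration of the sequence-field idea only, not the tools' actual behavior: the `NAME` field, the length-based filter, and the per-stream processing are hypothetical, and `SEQUENCE` is modeled as a simple zero-based counter.

```python
import heapq

# Records in an arbitrary order that is not sorted on any field.
rows = [{"NAME": n} for n in ["delta", "alpha", "charlie", "bravo"]]

# Step 1: tag each record with its original position before splitting.
tagged = [{**r, "SEQUENCE": i} for i, r in enumerate(rows)]

# Split the flow, then process each stream differently.
short = [{**r, "NAME": r["NAME"].upper()} for r in tagged if len(r["NAME"]) <= 5]
longer = [{**r, "NAME": r["NAME"].title()} for r in tagged if len(r["NAME"]) > 5]

# Step 2: merge on SEQUENCE; each stream is still in ascending SEQUENCE order.
merged = heapq.merge(short, longer, key=lambda r: r["SEQUENCE"])

# Step 3: drop the helper field before output.
output = [{k: v for k, v in r.items() if k != "SEQUENCE"} for r in merged]
names_out = [r["NAME"] for r in output]
```

Here `names_out` is `['DELTA', 'ALPHA', 'Charlie', 'BRAVO']`, matching the original record order even though the records were never sorted on a data field.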

Despite requiring an additional Select tool, the Sequence merge type is usually faster than the Sorted merge type. See the sample project filter_merge_sequence for another example of this technique.
