Filter
Overview
The Filter tool lets you extract a subset of records from the tool's input, based on filtering criteria that you specify. This function is similar to the WHERE clause in SQL, except that it does not join multiple tables (use the Join tool to join tables).
Use the Filter tool anywhere you want to select or eliminate records. For example, you can:
Remove records containing invalid field values, or send them along a different processing path for cleansing. The functions IsValid, IsNull, and IsError can all be used in a Filter expression to identify problem records.
Extract mailing-list records falling within a certain set of ZIP Codes.
Separate "variant" records from legacy files. Files with variant records have different record layouts within the same file. A "sentinel" field indicating the record type often flags these variants. For example, the U.S. Postal Service "City State" file contains several variants, with a sentinel in the first column indicating variants: "C"=Copyright, "A"=Alias, "S"=Scheme, "Z"=Zone, "D"=Detail.
Extract records from a large data set using arbitrary selection criteria. Even though this process reads and tests every record in the file sequentially, it is often much faster than a database extract because of the sequential speed advantage.
Filter accepts a single input table, and generates two or more outputs depending on the operating Mode. Each record is always sent to exactly one output, regardless of mode.
The Filter tool can also be used to generate a sequence number and attach it to each record, assisting in reassembling a record stream that has split using the Filter tool. You can configure this option on the Sequence tab of the tool's configuration dialog box. See Avoiding split-merge bottlenecks for more information on this feature.
Filter tool configuration parameters
The Filter tool has two sets of configuration parameters in addition to the standard execution options:
Configuration
Parameter | Description |
---|---|
Mode | Determines Filter tool's mode of operation: Yes/No, Multiple Expressions, or Values. |
Condition (boolean) | If Mode is Yes/No, expression used to filter the data. This expression must evaluate to a Boolean value. |
Conditions | If Mode is Multiple Expressions, list of Expressions and associated Outputs. |
Values | If Mode is Values, list of Value/Output pairs. |
Case insensitive | If Mode is Values, perform case-insensitive comparisons. This is optional and defaults to case-sensitive. |
Sequence
Optionally, the Sequence tab can be used to generate a sequence field on output. This is typically used to assist in recombining records in order when they are split by the filter.
Option | Description |
---|---|
Generate sequence | Name of the field to receive the sequence number. |
Start with | Start of sequence. Defaults to 0 (first sequence number 1). |
Configure the Filter tool
Select the Filter tool, and then go to the Configuration tab on the Properties pane.
Select a Mode from the list:
Yes/No
Multiple Expressions
Value
Depending on which Mode you selected, configure Conditions, Expressions, or Values:
Yes/No: enter a single Condition to evaluate. Use any combination of mathematical operations and functions that are appropriate for your field data. The filter expression must always evaluate to a value of
True
(Y
output) orFalse
(N
output).
As you type, any errors in expression syntax are listed in the Errors and Messages box. Selecting an error message will highlight the incorrect part of the expression. The expression editor’s autocomplete feature will show a pop-up list of fields, variables, and functions as you type. You can also select Insert and use the Functions, System Variables, and Fields lists to help you construct your expression more quickly.Multiple Expressions: enter a list of Outputs and their corresponding Expressions. The Filter tool creates an output connector for each Output, as well as a default connector (D output). At run time, expressions are applied sequentially. The first expression evaluating to True causes the record to be sent to the corresponding output. If no expression is True, the record is send to the default connector.
Values: select a Field to evaluate and then enter a list of Outputs and their corresponding Values. You may optionally select Case insensitive operation. The Filter tool creates an output connector for each Output, as well as a default connector (D output). At run time, Field value is compared to the table of values, and if a match is found, the record is sent the corresponding output. If no match is found, the record is send to the default connector.
Optionally, select the Sequence tab and choose Generate sequence. Specify a Sequence field and a Start with value. (See Avoiding split-merge bottlenecks for more information on this feature, or sample project filter_merge_sequence to see it in action.)
Optionally, go to the Execution tab, and then set Report options and Web service options.
Filter expression examples
Choose all records with dates in the current year:
DATE >= MakeDate(1, 1, Year(CurrentDate())
Choose all records with dates in the previous year:
DATE >= MakeDate(1, 1, Year(CurrentDate()) - 1) AND
DATE < MakeDate(1, 1, Year(CurrentDate())
Choose all records with dates falling on the second Tuesday of any month:
Day(DATE) >= 8 AND Day(DATE) <= 14 AND DayOfWeek(DATE) = 3
Choose all records with names beginning with "F":
IsLike(NAME, "F%")
Choose all records with name starting in "A" or ending in "Z":
IsLike(NAME, "A%") OR IsLike(Trim(NAME), "%Z")
You must use Trim
to ignore trailing blanks in the name.