Regular expression functions (deprecated Data Management variant)
Overview
These functions have been retained for backwards compatibility, but you should use the newer equivalents prefixed with Regex instead. See About regular expressions (Data Management variant) for more information on the Data Management variant regular expression language.
RegularExprMatch
This function has been deprecated. You should use RegexMatch instead.
Returns a Boolean value, which is True if text matches the given regular expression, and False otherwise.
Syntax
RegularExprMatch( text, regexp )
The RegularExprMatch function syntax has the following arguments:
Part | Description |
---|---|
text | Required. Any expression that produces a text result. |
regexp | Required. A literal string containing a regular expression. |
Remarks
The function RegularExprMatch performs a similar function to the "regexp" library found on Linux and UNIX systems. It tests whether a string matches a given regular expression or not.
Note that the regular expression syntax used by this function and related functions is slightly different from that used by UNIX systems. See Regular expression syntax for details.
Examples
Matching a Social Security Number:
RegularExprMatch("123-45-6789", "d{3} '-' d{2} '-' d{4}" )
returns True.
Ensuring that a string contains only lowercase characters and spaces:
RegularExprMatch("Now is the time", "['a'-'z' ' ']*")
returns False.RegularExprMatch("1am is the time", "['a'-'z' ' ']*")
returns False.RegularExprMatch("now is the time", "['a'-'z' ' ']*")
returns True.
Is a string enclosed in parentheses?
RegularExprMatch("(now is the time)", " '(' .* ')' " )
returns True.
Because the regular expression is surrounded by double-quotes when passed to this function, you must escape any double-quotes that are part of the regular expression by preceding them with the \ character. For example, to see if the field NAME is a quoted string, use the following expression: RegularExprMatch(NAME, " '\"' .+ '\"' ")
.
Taking this somewhat complex string apart, the outermost double-quotes enclose the regexp argument. Inside, the single-quotes designate a single character to be matched. That character is a double-quote, and thus is preceded by a \ to distinguish it from the double-quote that marks the end of the string.
See also:
RegularExprExtract
This function has been deprecated. You should use RegexExtractAll or RegexExtractFirst instead.
Returns a Text value, which contains sub-parts extracted from text if it matches the given regular expression, and an empty string otherwise.
Syntax
RegularExprExtract( text, regexp )
The RegularExprExtract function syntax has the following arguments:
Part | Description |
text | Required. Any expression that produces a text result. |
regexp | Required. A literal string containing a regular expression, including "actions" to extract text from the string. |
Remarks
The function RegularExprExtract tests whether a string matches a given regular expression or not, and if so, extracts sub-parts of the string. It is similar to functions in the "regexp" library found on Linux and UNIX systems.
Note that the regular expression syntax used by this function and related functions is slightly different from that used by UNIX systems. In particular, the syntax used by Data Management allows for much more precise specification of which parts of a string are to be extracted. Note especially the = sign used to denote the action of extracting a sub-part See Regular expression syntax for details.
Examples
Matching a Social Security Number and extracting the middle digits:
RegularExprExtract("123-45-6789","d{3} '-' =d{2} '-' d{4}")
returns "45".
Extracting the first sequence of digits from a string:
RegularExprExtract("'69 Chevy, $1000 obo","[^'0'-'9']* =d+ .*")
returns "69".
Extracting the text between parentheses:
RegularExprExtract("303-5555-1212 (ext 234)", "[^'(']* '(' =[^')']+ ')' .* ")
returns "ext 234".
Because the regular expression is surrounded by double-quotes when passed to this function, you must "escape" any double-quotes that are part of the regular expression by preceding them with the \ character. For example, to extract the contents of a NAME field that is a quoted string, use the following expression:
RegularExprExtract(NAME, " '\"' =.+ '\"' ")
Taking this somewhat complex string apart, the outermost double-quotes enclose the regexp argument to RegularExprExtract. Within that, the single-quotes designate a single character to be matched. That character is a double-quote, and so is preceded by a \ to distinguish it from the double-quote that marks the end of the string. Finally, the = designates the middle of the string to be extracted.
See also:
RegularExprFormat
This function has been deprecated. You should use RegexMatch instead.
Returns a Text value, which contains sub-parts extracted from text if it matches the given regular expression, formatted according to a specified format string.
Syntax
RegularExprFormat( text, regexp, format )
The RegularExprMatch function syntax has the following arguments:
Part | Description |
---|---|
text | Required. Any expression that produces a text result. |
regexp | Required. A literal string containing a regular expression, including "numbered actions" to extract test from the string. |
format | A format string that references extracted portions of the original text. |
Remarks
The function RegularExprFormat performs a similar function to the "regexp" library found on Linux and UNIX systems. It tests whether a string matches a given regular expression or not, and if so, extracts sub-parts of the string, and inserts those sub-parts into a format string.
Note that the regular expression syntax used by this function and related functions is slightly different from that used by UNIX systems. See Regular expression syntax for details. In particular, the syntax used by Data Management allows for much more precise specification of which parts of a string are to be extracted.
In particular, note that the = sign preceded by a number in the regexp argument is used to extract a sub-part of text and give it a number indicating its position in the format string. In the format string, that number is referenced by surrounding it with %%.
Examples
Reformatting a Social Security Number that has the dashes removed:
RegularExprFormat("123456789","1=d{3} 2=d{2} 3=d{4}","%1%-%2%-%3%")
returns "123-45-6789".
Reformatting a phone number:
RegularExprFormat( "(123)-456-7890"," '(' 1=d{3] '-' 2=d{3} '-' 3=d{4} ","%1%.%2%.%3%)
returns "123.456.7890".
Because the regular expression is surrounded by double-quotes when passed to this function, you must escape any double-quotes that are part of the regular expression by preceding them with the \ character. For example, the following expression extracts the contents of a NAME field that is a quoted string, and places it within single quotes instead:
RegularExprFormat(NAME, " '\"' 1=.+ '\"' ", "'%1%'")
Taking apart the second argument, the outermost double-quotes enclose the regexp argument. Within that, the single-quotes designate a single character to be matched. That character is a double-quote, and thus is preceded by a \ to distinguish it from the double-quote that marks the end of the string. Finally, the 1= designates the portion of the string to be extracted. In the third argument, the double-quotes surround the argument, but the single-quotes are actual characters in the format string that will appear in the output of this function.
RegularExprReplaceFirst
This function has been deprecated. You should use RegexReplaceFirst instead.
Replaces the first occurrence in text of text matching pattern with replacement.
Syntax
RegularExprReplaceFirst( text, pattern, replacement )
The RegularExprReplaceFirst function syntax has the following arguments:
Part | Description |
---|---|
text | Required. Any expression that produces a text result. |
pattern | Required. A literal string containing a regular expression. |
replacement | Required. Any expression that produces a text value. |
Remarks
The function RegularExprReplaceFirst tests whether a string matches a given regular expression or not, and if so, replaces the string with a replacement string.
The literal text contained in pattern must be quoted, and the entire pattern is a string, so you will typically need to use double-quotes around the entire pattern and single-quotes around the literal text inside the pattern. In the example below note the extra quotes around cat.
Examples
RegularExprReplaceFirst("cats and dogs", "'cat'", "dog")
will produce the output dogs and dogs
.
RegularExprReplaceAll
This function has been deprecated. You should use RegexReplaceAll instead.
Replaces all occurrences in text of text matching pattern with replacement.
Syntax
RegularExprReplaceAll( text, pattern, replacement )
The RegularExprReplaceAll function syntax has the following arguments:
Part | Description |
---|---|
text | Required. Any expression that produces a text result. |
pattern | Required. A literal string containing a regular expression. |
replacement | Required. Any expression that produces a text value. |
Remarks
The function RegularExprReplaceAll tests whether a string matches a given regular expression or not, and if so, replaces all occurrences of the string with a replacement string.
The literal text contained in pattern must be quoted, and the entire pattern is a string, so you will typically need to use double-quotes around the entire pattern and single-quotes around the literal text inside the pattern. In the example below note the extra quotes around cat.
Examples
RegularExprReplaceAll("cats and dogs and more cats", "'cat'", "dog")
will produce the outputdogs and dogs and more dogs
.RegularExprReplaceAll("some extra space", "s+", " ")
will produce the outputsome extra space
.
Note that RegularExprReplaceAll will never match replacement text again, so when trying to replace one or more items (as in the whitespace example above) always use the + operator. Compare the example below:
RegularExprReplaceAll("some extra space", "' '", " ")
will produce the outputsome extra space
, replacing each two spaces with one, so each four spaces is replaced with two spaces.