Text functions: Part 4
Overview
Text functions manipulate Text values and return other Text values.
Right
Extracts a substring from a string. Starting from the right-most character, returns the first amount characters of value.
Syntax
Right( value, amount )
The Right function syntax has the following arguments.
Part | Description |
---|---|
value | (Required) String you wish to extract from. May be of any type. |
amount | (Required) Number of characters that you wish to extract starting from the right-most character. Must be of type Integer. |
Examples
Right ("Alphabet Soup", 4)
returnsSoup
.Right ("Alphabet", 20)
returnAlphabet
.
SortWords
Splits the input text into words, discarding punctuation, and returns a string with the words sorted and capitalized. This can be useful for matching.
Syntax
SortWords( text )
The required argument text must be of type Text.
Example
SortWords("the quick brown fox")
returns BROWN FOX QUICK THE
.
Soundex
Computes the soundex of a text value.
Syntax
Soundex( value )
The required argument value may be of any type.
Remarks
This function computes the "soundex" of a text value, as defined by Donald Knuth. Soundex is useful for comparing English words (especially English surnames) in a loose fashion, because it transforms many similar names to the same index value.
Use the Soundex function in a deduplication or matching project as follows:
Identify the sort/compare steps where you currently use fuzzy matching on a text value, and want to substitute Soundex matching.
Insert a Calculate tool upstream of the sort/compare steps.
For each field value you want to compare using soundex, assign the expression Soundex(value) to value.
In each Window Compare tool, change the Comparison kind (Edit-Distance or Word-by-Word) to Positional, and set the All-or-nothing threshold to 100. Because the soundex function does a "fuzzy mapping" from words to index values, you want an exact match on the soundex value itself.
Examples
Soundex("JOHN")
returnsJ500
.Soundex("CHARTIER")
returnsC636
.Soundex("BRADY")
returnsB630
.Soundex("SPARKS")
returnsS162
.Soundex("SPANOS")
returnsS152
.
Soundex2
Computes the soundex of a text value with reference to English names.
Syntax
Soundex2( value )
The required argument value may be of any type.
Remarks
The Soundex2 function is like the Soundex function with some additional processing designed for English names. In order to compare phonetically similar words, it performs replacements like the following before computing the standard Soundex value.
PH
becomesF
.KN
becomesN
.MAC
becomesMC
.
Soundex3
Computes the soundex of a text value, with reference to numbers as well as English names.
Syntax
Soundex3( value )
The required argument value may be of any type.
Remarks
The Soundex3 function is identical to the Soundex2 function, except that it supports numbers in addition to words. If you have values with mixed words and numbers, this function will produce results that allow you to distinguish between the two. This is useful for street names, where any digits found are usually the important part. For example, it will produce the following results for these inputs:
GRANT
becomesG653
.MAIN
becomesN500
.1ST
becomes1
.64TH AVE
becomes64
.WASHINGTON
becomesW252
.
Substitute
Returns value, modified such that all occurrences of characters in find have been replaced with corresponding characters in replace_with.
Syntax
Substitute( value, find, replace_with )
The Substitute function syntax has the following arguments.
Part | Description |
---|---|
value | (Required) Text expression upon which the replacement is performed. May be of any type. |
find | (Required) String containing the set of characters be replaced. Must be of type Text. |
replace_with | (Required) String containing the set of characters to substitute for those replaced. This string may be empty, in which case the characters are removed from the string.Must be of type Text. |
Remarks
The Substitute function operates on value, looking for all occurrences of characters in the find string, and replacing them with corresponding characters in the replace_with string.
Both the find and replace_with strings may contain one or more ranges, representing larger sets of characters. Ranges take the form:
a-z (for lowercase letters)
A-Z (for uppercase letters)
0-9 (for digits)
The dash character ("-") in the find and replace_with strings is interpreted specially. If you want to include a literal dash character in either string, place it at the beginning or end of the string.
The number of characters in the find and replace_with string must be identical (unless replace_with is empty or contains a single character).
Examples
Substitute("John Smith", "a-z", "A-Z")
returnsJOHN SMITH
.Substitute("a,b.c/d;e", ",./;", " ")
returnsa b c d e
.Substitute("(123)456-7890", "()-", "")
returns1234567890
.Substitute("1234567890", "0-9", "A-J")
returnsABCDEFGHIJ
.Substitute("1234567890", "0-9", "A-J")
returnsABCDEFGHIJ
.Substitute("1234567890", "0-9", "X")
returnsXXXXXXXXXX
.
SubString
Returns a substring of value, starting at position.
Syntax
SubString( value, position, length )
The SubString function syntax has the following arguments.
Part | Description |
---|---|
value | (Required) Text string to be manipulated. May be of any type. |
position | (Required) Position of starting character (Integer). |
length | (Required) Number of characters to return (Integer). |
Remarks
The position and length arguments can be any expression, even those involving fields. Positions are one-based. Fewer characters than length may be returned if value does not contain position + length characters.
Examples
SubString( "Acme Software", 6, 4 )
returnsSoft
.Substring( "Acme Software", 6, 100 )
returnsSoftware
.Substring( NAME, POSITION, SIZE)
extracts SIZE characters from NAME starting at POSITION.
TitleCase
Returns text, with the first letter of each word capitalized, and all other letters lowercase.
Syntax
TitleCase( text )
The required argument value must be of type Text.
Example
TitleCase( "JOHN SMITH" )
returns John Smith
.
TitleCase2
Returns text, with the first letter of each word capitalized, and all other letters lowercase, taking into account some special cases of (Western) names.
Syntax
TitleCase2( text )
The required argument value must be of type Text.
Example
TitleCase2( "JOHN SMITH" )
returnsJohn Smith
.TitleCase2( "JOHN MACDOUGAL" )
returnsJohn MacDougal
.TitleCase2( "JOHN O'BRIAN" )
returnsJohn O'Brian
.
Trim
Removes leading and trailing trim characters from the value, and returns the resulting text string. When trim_chars is omitted, returns value with the leading and trailing blanks removed. When trim_chars is specified, returns value with all leading and trailing trim_chars characters removed.
Syntax
Trim( value )
Trim( value, trim_chars )
The Trim function syntax has the following arguments.
Part | Description |
---|---|
value | (Required) Text string to be trimmed. May be of any type. |
trim_chars | (Optional) Text characters to be removed. May be of any type. |
Remarks
The required value argument is any valid text expression. Note that trim_chars may contain more than one character. You can, for example, remove both tabs and spaces in one function call.
Examples
Trim( " Acme Software ")
returnsAcme Software
.Trim( " Ajax Motors**" , "* " )
returnsAjax Motors
.
TrimLeft
Removes leading characters from value, and returns the resulting text string. When trim_chars is omitted, returns value with the leading blanks removed. When trim_chars is specified, returns value with all leading trim_chars characters removed.
Syntax
TrimLeft( value )
TrimLeft( value, trim_chars )
The TrimLeft function syntax has the following arguments.
Part | Description |
---|---|
value | (Required) Text string to be trimmed. May be of any type. |
trim_chars | (Optional) Text characters to be removed. May be of any type. |
Remarks
The required value argument is any valid text expression. Note that trim_chars may contain more than one character. You can, for example, remove both tabs and spaces in one function call.
Examples
TrimLeft( " Acme Software ")
returnsAcme Software
.TrimLeft( " *Ajax Motors**" , "* " )
returnsAjax Motors**
.
TrimRight
Removes trailing characters from value, and returns the resulting text string. When trim_chars is omitted, returns value with the trailing blanks removed. When trim_chars is specified, returns value with all trailing trim_chars characters removed.
Syntax
TrimRight( value )
TrimRight( value, trim_chars )
The TrimRight function syntax has the following arguments.
Part | Description |
---|---|
value | (Required) Text string to be trimmed. May be of any type. |
trim_chars | (Optional) Text characters to be removed. May be of any type. |
Remarks
The required value argument is any valid text expression. Note that trim_chars may contain more than one character. You can, for example, remove both tabs and spaces in one function call.
Examples
TrimRight( " Acme Software ")
returnsAcme Software
.TrimRight( " *Ajax Motors** " , "* " )
returns*Ajax Motors
.
UpperCase
Returns value that has been converted to uppercase.
Syntax
UpperCase( value )
The required argument value may be of any type.
Remarks
Only lowercase letters are converted to uppercase; all uppercase letters and non-letter characters remain unchanged.
Example
UpperCase( "3482 Walnut Street" )
returns3482 WALNUT STREET
.