Query expressions
Overview
Data sets indexed using Data Management's Create Index tool can be queried using the Query Indexed Table tool and Data Management's query expression language. Queries are treated as Boolean expressions and used to identify a set of records for counts or extraction.
A query expression consists of one or more terms. Each term in a query expression yields a set of records. These records sets can be grouped with parentheses and combined using the AND and OR operators to form other record sets.
Combining query terms
You can combine query terms using the Boolean operations AND, OR, and NOT. Parentheses can be used to determine the order of evaluation of the terms.
AND operation
Placing the AND operator between two terms produces the set of records that appear in both terms. For example ZIP = 80303 AND NAME >= "A" AND NAME < "B"
finds records with a name starting with "A" in the 80303 ZIP code.
OR operation
Placing the OR operator between two terms produces the set of records that appears in either term. For example ZIP = 80303 OR Circle(40.0 --102.0 10.0)
finds records whose ZIP is 80303 or whose latitude/longitude coordinates fall within 10 miles of (40, -102).
Not operation
Placing the NOT operator before a term returns the set of all records that are not in that term. For example NOT ZIP = 80303
is the same as ZIP != 80303
.
Another example, NOT (NAME = "" OR CITY = "")
is equivalent to NAME != "" AND CITY != ""
.
Grouping operations
Normally, the AND operation "binds" more tightly than the OR operation, so the expression NAME < "B" AND ZIP = 80303 OR ZIP = 80304
may not do what you expect because it is evaluated as if you specified the grouping (NAME < "B" AND ZIP = 80303) OR ZIP = 80304
. To group the two ZIP terms together, surround them with parentheses, like NAME < "B" AND (ZIP = 80303 OR ZIP = 80304)
.
Field value query terms
Field value query terms test a field value against a specified literal value.
Term | Meaning |
---|---|
| The set of records where |
| The set of records where |
| The set of records where |
| The set of records where |
| The set of records where |
| The set of records where |
| The set of records where |
Examples:
NAME >= "F"
AGE = 30
YEAR < 1990
LASTNAME != ""
NAME StartsWith("A")
is equivalent to (but much faster than) NAME >= "A" AND NAME < "B"
.
Contains query term
Searches for text fields containing a specified string.
Syntax
FIELD Contains (string)
Example
NAME Contains "VAN"
matches "VAN DER WAALS", "JOHN VAN SMITH" and "DONOVAN".
Literal values
Literal values are used in query expressions when comparing a field against a constant. Literal values can be strings, numbers, dates, and times.
Literal value type | Format | Example |
---|---|---|
String | <letter>+ |
|
Integer | <optional sign><digit>+ |
|
Float | <optional sign><digit>+.digit+ |
|
Float | <optional sign>.digit+ |
|
Float | <optional sign><digit>+.digit+ E <sign> digit+ |
|
Date | DD/MM/YY |
|
Time | HH:MM<optional seconds> |
|
Types and validity
When comparing a field to a literal, the type of the field and literal must match:
Field type | Allowed literal type |
---|---|
BigInt | Integer or Float or Null |
Float | Integer or Float or Null |
TextFixed | String or Null |
TextVar | String or Null |
Date | Date or Null |
Time | Time or Null |
DateTime | N/A |
Boolean | Boolean (True or False) or Null |
Binary | N/A |
Startswith query term
Searches for text fields starting with a specified string.
Syntax
FIELD Startswith (string)
Example
NAME Startswith "AL"
matches "ALBERT", and "AL BARTLET", but not "AMED AL JABAR" or "BLUMENTHAL".
Endswith query term
Searches for text fields ending with specified string.
Syntax
FIELD Endswith (string)
Example
NAME Endswith SMITH "AL"
matches "BLUMENTHAL" and "RONAL" but not "ALBERT".
Range query term
Identifies the set of records with values included within a specified range
Syntax
FIELD Range (min_value max_value)
Remarks
This query term produces the same result as FIELD >= min_value AND FIELD <= max_value
. However, you should always use the Range query term because it is more efficient.
Example
NAME Range ("A" "Mz")
gets all NAME
values starting with the letters A through M.
The "Mz"
argument signifies that the "M" is inclusive—the "z" tells Data Management to return all NAMES starting with M.
Circle query term
Identifies the set of records whose spatial coordinates fall within the boundary of the specified circle.
Syntax
Circle(latitude longitude radius)
Both latitude and longitude must be expressed in degrees; radius must be expressed in miles.
Remarks
In order to use the Circle term, you must create a spatial index. See Configuring the Create Index tool for more information.
Example
Circle (40.0 --105.3 10.0)
Set query term
Identifies the set of records with values equal to any of a list of literal values.
Syntax
FIELD Set (value1 value2 value3 ... )
Remarks
The Set term produces the set of records where FIELD
is equal to any of the literal values. It is semantically equivalent to a string of OR operations, but the Set term is much more efficient.
Example
NAME SET ( "JOHN" "NATHAN" "LAURA")
is the same as NAME = "JOHN" OR NAME = "NATHAN" OR NAME = "LAURA"
.
DynSet query term
Identifies the set of records with values equal to any of a dynamically defined list of values.
Syntax
FIELD DynSet (input_number input_field)
The input_number parameter is the input position between 1 and the number of inputs connected to the Query Indexed Table tool. The input_field parameter is the name of the input field from which the set of values will be taken.
Remarks
The DynSet term is designed to support parameterization of the query from external sources so that the binding of the query values can be deferred until the query is run.
The term identifies the set of records where FIELD is equal to one of the values defined by the specified input_number and input_field. The DynSet term can only be used when executing a project where the Query Indexed Table tool is connected to one or more inputs.
Example
The DynSet term is semantically equivalent to a Set term. For example, given the query SIC DynSet(1 PRMSIC)
where the PRMSIC
field on the first input contains the values 111111 222222 333333
the dynset term produces the same result as SIC Set (111111 222222 333333)
.
Polygon query term
Identifies the set of records whose spatial coordinates fall within the boundary of the specified polygon.
Syntax
Polygon(polygon_name)
Remarks
The Polygon term is similar to the Circle term. It produces the set of records whose spatial coordinates fall within polygon_name, a set of named polygons.
In order to use the Polygon term, you must create both a spatial index and a polygon subsystem. See Configuring the Create Index tool and Creating a polygon subsystem for more information.
Example
Polygon("BOULDER_COUNTY")
searches the polygon subsystem for a polygon named "BOULDER_COUNTY", retrieves it, and uses that polygon as the basis for the query.
DynPolygon query term
Identifies the set of records whose spatial coordinates fall within the boundary of a dynamically specified polygon.
Syntax
DynPolygon(input_number input_field)
The input_number parameter is the input position between 1 and the number of inputs connected to the Query Indexed Table tool. The input_field parameter is the name of the input field from which the set of values will be taken.
Remarks
The DynPolygon term is designed to support parameterization of the query from external sources so that the binding of the query values can be deferred until the query is run.
The term identifies the set of records where the spatial coordinates fall within one of the polygon values appearing on the specified input_number and input_field. The DynPolygon term can only be used when executing a project where the Query Indexed Table tool is connected to one or more inputs.
In order to use the DynPolygon term, you must create a spatial index. See Configuring the Create Index tool for more information.
Example
SIC DynPolygon(1 POLYGON)
reads polygons from the POLYGON field of the first input, and find all records whose spatial coordinate falls within one of those polygons.
Query expression syntax
The syntax of the query language is described using EBNF notation, where the following meta-symbols are used. Note that these constructs are listed in order of precedence, from high to low.
Meta-symbol | Meaning |
---|---|
| Explicit precedence grouping to ensure that stuff is not confused with anything outside the parentheses |
| The literal character X |
| The sequence of characters composing the string. In this example, equivalent to 's' 't' 'r' 'I' 'n' 'g' |
| All characters between c1 and c2, inclusive |
| Equivalent to [ 'a' | 'b' | 'c' ] |
| Matches any character except 'a' 'b' 'c' |
| Zero or more X |
| One or more X |
| Optional X |
| X followed by Y |
| Either X or Y |
Query expression language lexical syntax
Below is a formal definition of the lexical syntax of the query expression language. You don't need this information to use the query expression language. However, if you are experienced in computer science and need technical details about the query language syntax, you may find this information useful.
The lexical syntax of the query language is a formal specification of the tokens used in the query language grammar specification. In the following lexical syntax, the capitalized non-terminals are token types, and the lowercase non-terminals are intermediate rules.
whitespace := ('\t' | '\n' | '\r' | ' ')+
digit := '0' .. '9'
letter := 'a' .. 'z' | 'A' .. 'Z'
exponent := ('e'|'E')('+'|'-')digit+
FieldSpecifier :=
letter (letter | digit | '_')* |
'[' [^\]] ']'
Integer := ('-')? digit+
Float :=
('-')? digit+ (exponent | '.' digit* exponent?) |
('-')? '.' digit+ exponent?
Number := Integer | Float
String := '"' ([^\\"] | '\\' ('\\' | '"'))* '"'
Comma := ','
LeftParen := '('
RightParen := ')'
LessThan := '<'
GreaterThan := '>'
LessThanOrEqual := "<="
GreaterThanOrEqual := ">="
NotEqual := "!=" | "<>"
And := "AND"
Or := "OR"
Not := "NOT"
True := "TRUE"
False := "FALSE"
Null := "NULL"
Circle := "CIRCLE"
Set := "SET"
DynSet := "DYNSET"
Polygon := "POLYGON"
DynPolygon := "DYNPOLYGON"
Special identifier tokens like AND are case-insensitive.
Query expression language grammar
Below is a formal definition of the grammar of the query expression language. You don't need this information to use the query expression language. However, if you are experienced in computer science and need technical details about the query language grammar, you may find this information useful.
The "Expr" production in the grammar is the top-level production. In other words, it defines a valid query expression.
Expr := OrExpr
OrExpr := AndExpr (Or AndExpr) *
AndExpr := UnaryExpr (And UnaryExpr) *
BinaryOp :=
Equal | NotEqual | LessThan | GreaterThan |
LessThanOrEqual | GreaterThanOrEqual
UnaryExpr := (UnaryOp UnaryExpr) | AtomicExpr
UnaryOp := Not
AtomicExpr :=
FieldSpecifier BinaryOp Literal |
LeftParen OrExpr RightParen
FieldSpecifier Range LeftParen Literal Literal RightParen
Circle LeftParen Number Number Number RightParen
FieldSpecifier Set LeftParen Literal+ RightParen
FieldSpecifier DynSet LeftParen Integer FieldSpecifier RightParen
Polygon LeftParen String RightParen
DynPolygon LeftParen Integer FieldSpecifier RightParen
Literal :=
Integer |
Float |
String |
True |
False |
DateLiteral |
TimeLiteral |
Null
DateLiteral := Integer '/' Integer '/' Integer
TimeLiteral := Integer ':' Integer (':' Integer)?