35.5 SQL Filter Expression Notation
A SQL filter expression is similar to the WHERE clause in SQL. The SQL expression format provided by Connext also supports the MATCH operator as an extended operator (see 35.5.5 SQL Extension: Regular Expression Matching).
The following sections provide more information:
- 35.5.1 Example SQL Filter Expressions
- 35.5.2 SQL Grammar
- 35.5.3 Token Expressions
- 35.5.4 Type Compatibility in the Predicate
- 35.5.5 SQL Extension: Regular Expression Matching
- 35.5.6 Composite Members
- 35.5.7 Strings
- 35.5.8 Enumerations
- 35.5.9 Pointers
- 35.5.10 Arrays
- 35.5.12 Sequences
35.5.1 Example SQL Filter Expressions
Assume that you have a Topic with two floats, X and Y, which are the coordinates of an object moving inside a rectangle measuring 200 x 200 units. This object moves quite a bit, generating lots of DDS samples that you are not interested in. Instead you only want to receive DDS samples outside the middle of the rectangle, as seen in Figure 35.1: Filtering Example . That is, you want to filter out data points in the gray box.
Figure 35.1: Filtering Example
The filter expression would look like this (remember the expression is written so that DDS samples that we do want will pass):
"(X < 50 or X > 150) and (Y < 50 or Y > 150)"
Suppose you would like the ability to adjust the coordinates that are considered outside the acceptable range (changing the size of the gray box). You can achieve this by changing the whole filter expression, using set_expression(), or by using filter parameters. The expression can be written using filter parameters as follows:
"(X < %0 or X > %1) and (Y < %2 or Y > %3)"
Recall that when you create a ContentFilteredTopic (see 35.2 Creating ContentFilteredTopics), you pass a expression_parameters string sequence as one of the parameters. Each element in the string sequence corresponds to one argument.
See the String and Sequence Support sections of the API Reference HTML documentation (from the Modules page, select RTI Connext API Reference, Infrastructure Module).
In C++, the filter parameters could be assigned like this:
FilterParameter[0] = "50";
FilterParameter[1] = "150";
FilterParameter[2] = "50";
FilterParameter[3] = "150";
With these parameters, the filter expression is identical to the first approach. However, it is now possible to change the parameters by calling set_expression_parameters(). For example, perhaps you decide that you only want to see data points where X < 10 or X > 190. To make this change:
FilterParameter[0] = 10
FilterParameter[1] = 190
set_expression_parameters(....)
The new filter parameters will affect all DataReaders that have been created with this ContentFilteredTopic.
35.5.2 SQL Grammar
This section describes the subset of SQL syntax, in Backus–Naur Form (BNF), that you can use to form filter expressions.
The following notational conventions are used:
NonTerminals are typeset in italics.
'Terminals' are quoted and typeset in a fixed-width font. They are written in upper case in most cases in the BNF-grammar below, but should be case insensitive.
TOKENS are typeset in bold.
The notation (element // ',') represents a non-empty, comma-separated list of elements.
FilterExpression ::= Condition
Condition ::= Predicate
| Condition 'AND' Condition
| Condition 'OR' Condition
| 'NOT' Condition
| '(' Condition ')'
Predicate ::= ComparisonPredicate
| BetweenPredicate
ComparisonPredicate ::= ComparisonTerm RelOp ComparisonTerm
ComparisonTerm ::= FieldIdentifier
| Parameter
BetweenPredicate ::= FieldIdentifier 'BETWEEN' Range
| FieldIdentifier 'NOT BETWEEN' Range
FieldIdentifier ::= FIELDNAME
| IDENTIFIER
RelOp ::= '=' | '>' | '>=' | '<' | '<=' | '<>' | 'LIKE' | 'MATCH'
Range ::= Parameter 'AND' Parameter
Parameter ::= INTEGERVALUE
| CHARVALUE
| FLOATVALUE
| STRING
| ENUMERATEDVALUE
| BOOLEANVALUE
| NULLVALUE
| PARAMETER
35.5.3 Token Expressions
The syntax and meaning of the tokens used in SQL grammar is described as follows:
IDENTIFIER—An identifier for a FIELDNAME, defined as any series of characters 'a', ..., 'z', 'A', ..., 'Z', '0', ..., '9', '_' but may not start with a digit.
IDENTIFIER: LETTER (PART_LETTER)*
where LETTER: [ "A"-"Z","_","a"-"z" ] PART_LETTER: [ "A"-"Z","_","a"-"z","0"-"9" ]
FIELDNAME—A reference to a field in the data structure. A dot '.' is used to navigate through nested structures. The number of dots that may be used in a FIELDNAME is unlimited. The FIELDNAME can refer to fields at any depth in the data structure. The names of the field are those specified in the IDL definition of the corresponding structure, which may or may not match the fieldnames that appear on the language-specific (e.g., C/C++, Java) mapping of the structure. To reference the n+1 element in an array or sequence, use the notation '[n]'
, where n
is a natural number (zero included). FIELDNAME must resolve to a primitive IDL type; that is boolean, octet, uint16, uint32, uint64, float double, char, wchar, string, wstring, or enum.
FIELDNAME: FieldNamePart ( "." FieldNamePart )*
where FieldNamePart : IDENTIFIER ( "[" Index "]" )* Index> : (["0"-"9"])+ | ["0x","0X"](["0"-"9", "A"-"F", "a"-"f"])+
Primitive IDL types referenced by FIELDNAME are treated as different types in Predicate according to the following table:
Predicate Data Type |
IDL Type |
BOOLEANVALUE |
boolean |
INTEGERVALUE |
octet, uint16, uint32, uint64 |
FLOATVALUE |
float, double |
CHARVALUE |
char, wchar |
STRING |
string, wstring |
ENUMERATEDVALUE |
enum |
TOPICNAME—An identifier for a topic, and is defined as any series of characters 'a', ..., 'z', 'A', ..., 'Z', '0', ..., '9', '_' but may not start with a digit.
TOPICNAME : IDENTIFIER
INTEGERVALUE—Any series of digits, optionally preceded by a plus or minus sign, representing a decimal integer value within the range of the system. 'L' or 'l' must be used for int64 (long long), otherwise int32 (long) is assumed. A hexadecimal number is preceded by 0x and must be a valid hexadecimal expression.
INTEGERVALUE : (["+","-"])? (["0"-"9"])+ [("L","l")]?
| (["+","-"])? ["0x","0X"](["0"-"9",
"A"-"F", "a"-"f"])+ [("L","l")]?
CHARVALUE—A single character enclosed between single quotes.
CHARVALUE : "'" (~["'"])? "'"
FLOATVALUE—Any series of digits, optionally preceded by a plus or minus sign and optionally including a floating point ('.'). 'F' or 'f' must be used for float, otherwise double is assumed. A power-of-ten expression may be postfixed, which has the syntax en or En, where n is a number, optionally preceded by a plus or minus sign.
FLOATVALUE : (["+","-"])? (["0"-"9"])* (".")? (["0"-"9"])+
(EXPONENT)?[("F",’f’)]?
where EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+
STRING—Any series of characters encapsulated in single quotes, except the single quote itself.
STRING : "'" (~["'"])* "'"
ENUMERATEDVALUE—A reference to a value declared within an enumeration. Enumerated values consist of the name of the enumeration label enclosed in single quotes. The name used for the enumeration label must correspond to the label names specified in the IDL definition of the enumeration.
ENUMERATEDVALUE : "'" ["A" - "Z", "a" - "z"]
["A" - "Z", "a" - "z", "_", "0" - "9"]* "'"
BOOLEANVALUE—Can either be TRUE or FALSE, and is case insensitive.
BOOLEANVALUE : ["TRUE","FALSE"]
NULLVALUE—Can be null, and is case insensitive.
NULLVALUE : "null"
PARAMETER—Takes the form %n, where n represents a natural number (zero included) smaller than 100. It refers to the (n + 1)th argument in the given context. This argument can only be in primitive type value format. It cannot be a FIELDNAME.
PARAMETER : "%" (["0"-"9"])+
35.5.4 Type Compatibility in the Predicate
As seen in Table 35.2 Valid Type Comparisons, only certain combinations of type comparisons are valid in the Predicate.
|
BOOLEAN |
INTEGER |
FLOAT |
CHAR |
STRING |
ENUMERATED |
BOOLEAN |
YES |
|
|
|
|
|
INTEGERVALUE |
|
YES |
YES |
|
|
|
FLOATVALUE |
|
YES |
YES |
|
|
|
CHARVALUE |
|
|
|
YES |
YES |
YES |
STRING |
|
|
|
YES |
YES 1 |
YES |
ENUMERATED |
|
YES |
|
YES2 |
YES 3 |
YES 4 |
35.5.5 SQL Extension: Regular Expression Matching
The relational operator MATCH may only be used with string fields. The right-hand operator is a string pattern. A string pattern specifies a template that the left-hand field must match.
MATCH is case-sensitive. The following characters have special meaning, unless escaped by the escape character: ,\/?*[]-^!\%
.
The pattern allows limited "wild card" matching under the rules in Table 35.3 Wild Card Matching.
The syntax is similar to the POSIX® fnmatch syntax. (See http://www.opengroup.org/onlinepubs/000095399/functions/fnmatch.html.) Some example expressions include:
This expression evaluates to TRUE if the value of symbol is equal to NASDAQ/MSFT:
symbol MATCH 'NASDAQ/MSFT'
This expression evaluates to TRUE if the value of symbol is equal to NASDAQ/IBM or NASDAQ/MSFT:
symbol MATCH 'NASDAQ/IBM,NASDAQ/MSFT'
This expression evaluates to TRUE if the value of symbol is equal to NASDAQ and starts with a letter between M and Y:
symbol MATCH 'NASDAQ/[M-Y]*'
Character |
Meaning |
, |
A , separates a list of alternate patterns. The field string is matched if it matches one or more of the patterns. |
/ |
A / in the pattern string matches a / in the field string. It separates a sequence of mandatory substrings. |
? |
A ? in the pattern string matches any single non-special characters in the field string. |
* |
A * in the pattern string matches 0 or more non-special characters in field string. |
% |
This special character is used to designate filter expression parameters. |
\ |
Escape character for special characters. |
[charlist] |
Matches any one of the characters in charlist. |
[!charlist] or [^charlist] |
(Not supported) Matches any one of the characters not in charlist. |
[s-e] |
Matches any character from s to e, inclusive. |
[!s-e] or [^s-e] |
(Not supported) Matches any character not in the interval s to e. |
Note: To use special characters as regular characters in regular expressions, you must escape them using the character '\
'. For example, 'A['
is considered a malformed expression and the result is undefined.
35.5.6 Composite Members
Any member can be used in the filter expression, with the following exceptions:
- 128-bit floating point numbers (long doubles) are not supported
- bitfields are not supported
- LIKE is not supported
Composite members are accessed using the familiar dot notation, such as "x.y.z > 5". For unions, the notation is special due to the nature of the IDL union type.
On the publishing side, you can access the union discriminator with myunion._d and the actual member with myunion._u.mymember. If you want to use a ContentFilteredTopic on the subscriber side and filter a DDS sample with a top-level union, you can access the union discriminator directly with _d and the actual member with mymember in the filter expression.
35.5.7 Strings
The filter expression and parameters can use IDL strings. String constants must appear between single quotation marks (').
For example:
" fish = 'salmon' "
Strings used as parameter values must contain the enclosing quotation marks (') within the parameter value; do not place the quotation marks within the expression statement. For example, the expression " symbol MATCH %0 " with parameter 0 set to " 'IBM' " is legal, whereas the expression " symbol MATCH '%0' " with parameter 0 set to " IBM " will not compile.
35.5.8 Enumerations
A filter expression can use enumeration values, such as GREEN, instead of the numerical value. For example, if x is an enumeration of GREEN, YELLOW and RED, the following expressions are valid:
"x = 'GREEN'"
"X < 'RED'"
35.5.9 Pointers
Pointers can be used in filter expressions and are automatically dereferenced to the correct value.
For example:
struct Point {
int32 x; int32 y; };
struct Rectangle { Point *u_l; Point *l_r; };
The following expression is valid on a Topic of type Rectangle:
"u_l.x > l_r.x"
35.5.10 Arrays
Arrays are accessed with the familiar [] notation.
For example:
struct ArrayType {
int32 value[255][5];
};
The following expression is valid on a Topic of type ArrayType:
"value[244][2] = 5"
In order to compare an array of bytes(octets in idl), instead of comparing each individual element of the array using [] notation, Connext provides a helper function, hex(). The hex() function can be used to represent an array of bytes (octets in IDL). To use the hex() function, use the notation &hex() and pass the byte array as a sequence of hexadecimal values.
For example:
&hex (07 08 09 0A 0B 0c 0D 0E 0F 10 11 12 13 14 15 16)
Here the leftmost-pair represents the byte at index 0.
Note: If the length of the octet array represented by the hex() function does not match the length of the field being compared, it will result in a compilation error.
For example:
struct ArrayType {
octet value[2]; };
The following expression is valid:
"value = &hex(12 0A)"
35.5.11 Optional Members
SQL filter expressions can refer to optional members. The syntax is the same as for any other member.
For example, given the type MyType:
struct Foo { string text; }; struct MyType { @optional int32 optional_member1; @optional Foo optional_member2; int32 non_optional_member; };
These are valid expressions:
"optional_member1 = 1 AND optional_member2.text = 'hello' AND non_optional_member = 2" "optional_member1 = null AND optional_member2.text <> null"
Any comparison involving an optional member (=, <>, <, or >) evaluates to false if the member is unset.
For example, both “optional_member1 <> 1” and “optional_member1 = 1” will evaluate to false if optional_member1 is unset; however “optional_member1 = 1 OR non_optional_member = 1” will be true if non_optional_member is equal to 1 (even if optional_member1 is unset). The expression “optional_member2.text = ‘hello’” will also be false if optional_member2 is unset.
To check if an optional member is set or unset, you can compare with the null keyword. The following expressions are supported:
"optional_member1 = null" *, *"optional_member1 <> null".
35.5.12 Sequences
Sequence elements can be accessed using the () or [] notation.
For example:
struct SequenceType {
sequence<int32> s; };
The following expressions are valid on a Topic of type SequenceType:
"s(1) = 5"
"s[1] = 5"