DNSDB Search Glob Guide
Globbing is an advanced form of wildcard search: more powerful than the left-hand or right-hand wildcards of DNSDB's Standard Search, but not as expressive as Farsight Compatible Regular Expressions (FCRE). Globs can be simpler to write than regular expressions, especially for API users who are not familiar with them.
In general, Farsight's glob implementation follows standard Unix glob(7) semantics, but it does not support what is sometimes referred to as "extended globbing."
Glob searches are evaluated against the DNS master file form of the hostnames (aka rrnames) and rdata values, which by design contains only printable ASCII characters. All non-printable characters, including octets outside the ASCII range, are converted to escape sequences in the form \DDD (backslash followed by three decimal digits) per RFC 1035. This escaping is only applicable to RData (RHS) queries.
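To make the \DDD escaping concrete, here is a minimal Python sketch of how non-printable octets could be rendered. The function name is hypothetical and this is not Farsight's code. It also leaves every printable character untouched, which is a simplification of the full master file rules.

```python
def escape_nonprintable(raw: bytes) -> str:
    """Render bytes as text, replacing non-printable octets with \\DDD."""
    out = []
    for octet in raw:
        if 0x20 <= octet <= 0x7E:          # printable ASCII, space included
            out.append(chr(octet))
        else:
            out.append(f"\\{octet:03d}")   # backslash + three decimal digits
    return "".join(out)


print(escape_nonprintable(b"a\tb"))        # a\009b
print(escape_nonprintable("é".encode()))   # \195\169
```

The second call shows that a single non-ASCII character may become several escape sequences, one per octet of its encoding.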
Glob Syntax
A glob is a string of printable characters with the following characters given special meaning:
- * -- Match any zero or more characters.
- ? -- Match exactly any one character.
- [ -- Begin a character class. Any of the contained characters or ranges will match.
- ] -- End a character class.
- \ -- Escape the next character (but not within a character class).
Any other character in a glob pattern is matched exactly as written, except that matching is not case sensitive.
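The following Python sketch approximates these rules by translating a glob into a regular expression. It is an illustration only, not Farsight's implementation: the function names are hypothetical, character classes are deliberately left out, and full-string matching stands in for the front and back anchoring described under Important notes.

```python
import re


def basic_glob_to_regex(glob: str) -> re.Pattern:
    """Translate *, ? and backslash escapes into a compiled regex."""
    parts = []
    i = 0
    while i < len(glob):
        ch = glob[i]
        if ch == "*":
            parts.append(".*")             # zero or more characters
        elif ch == "?":
            parts.append(".")              # exactly one character
        elif ch == "\\":
            if i + 1 >= len(glob):
                raise ValueError("trailing backslash")
            i += 1
            parts.append(re.escape(glob[i]))   # next character is literal
        elif ch == "[":
            raise NotImplementedError("character classes are outside this sketch")
        else:
            parts.append(re.escape(ch))    # everything else matches itself
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def glob_matches(glob: str, text: str) -> bool:
    # fullmatch() models the implicit anchoring at both ends.
    return basic_glob_to_regex(glob).fullmatch(text) is not None


print(glob_matches("*.com.", "EXAMPLE.com."))   # True
print(glob_matches("*.com", "example.com."))    # False
```

The second call returns False because the pattern does not account for the trailing dot of the hostname.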
Character Class Syntax
A character class is a set of characters enclosed between an opening [ and a closing ]. A simple example is [m-z1-3] to match characters m through z and 1 to 3.
Within the character class, the following characters are handled specially:
- ! -- If the first character after the opening [, denotes a negated character class, i.e. a class which matches any character not listed in the remainder of the class.
- ] -- If the first character after the opening [ or [!, encodes a literal ] as a member of the class. A ] after the first character after the opening [ or [! ends the character class.
- - -- If the first character after the opening [ or [! or the last character before the closing ], encodes a literal - as a member of the character class.
  - If between two characters A and B, encodes the range of characters between A and B, inclusive, as members of the character class. The character A must occur before B in ASCII encoding.
The sequences [. and [= are not allowed between the opening [ or [! and the closing ], to prevent confusion with POSIX collating symbols and equivalence classes, which are not supported.
If the sequence [: appears in a character class, it must be the beginning of one of the following POSIX character classes:
- [:alnum:] (POSIX character class) -- Alphanumeric characters 0-9, A-Z, and a-z
- [:alpha:] (POSIX character class) -- Alphabetic characters A-Z, a-z
- [:blank:] (POSIX character class) -- Blank characters (space and tab)
  - Only printable characters occur in searchable strings and space is the only printable whitespace character, thus use of [:blank:] is equivalent to a space character.
  - Tabs in data appear as the escape sequence \009 and can be matched with the glob pattern \\009.
- [:cntrl:] (POSIX character class) -- Control characters
  - Only printable characters occur in searchable strings, so [:cntrl:] will not match any characters.
  - Control characters in data will appear as escape sequences in the form \DDD (backslash followed by three digits). To match one of those, you need to escape the backslash with another backslash. For example, to match the literal string \004, use the glob pattern \\004.
- [:digit:] (POSIX character class) -- Decimal digits 0-9
- [:graph:] (POSIX character class) -- Any printable character other than space.
  - Only printable characters occur in searchable strings, thus a character class containing [:graph:] is equivalent to [! ] (negated character class containing only a space).
- [:lower:] (POSIX character class) -- Lower case alphabetic characters a-z
  - Hostnames will be folded to lower case, thus use of [:lower:] is equivalent to [:alpha:].
- [:print:] (POSIX character class) -- Any printable character
  - Only printable characters occur in searchable strings, so [:print:] will match any character.
- [:punct:] (POSIX character class) -- Punctuation characters (printable characters other than space and [:alnum:])
- [:space:] (POSIX character class) -- Any whitespace character
  - The space character is the only printable whitespace character, thus use of [:space:] is equivalent to a space character.
- [:upper:] (POSIX character class) -- Upper case alphabetic characters A-Z
  - Since all of our data is indexed as lower-case, this is not useful as it is equivalent to [:lower:].
- [:xdigit:] (POSIX character class) -- Hexadecimal digits 0-9, a-f, A-F
The above named character classes must appear inside an enclosing [ and ], e.g. [[:digit:][:punct:]] to match a digit or punctuation character. Without the enclosing brackets, [:digit:] is read as an ordinary character class and will match any one of the characters :, d, i, g, or t.
Neither the above character classes nor a character range may begin or end a character range. For example, the character class expressions [0-[:alpha:]] and [a-n-z] are invalid.
All other characters between the opening [ or [! and the closing ] are added to the character class, including the backslash \ character.
There is no way to express a character class containing a single ! character.
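The character class rules above can be sketched as a small parser. This is one reading of the rules written for illustration, not Farsight's implementation. The names parse_class and POSIX_CLASSES are hypothetical, the named classes are modeled over printable ASCII only, and case folding is not applied.

```python
import string

PRINTABLE = {chr(c) for c in range(0x20, 0x7F)}   # space through '~'
POSIX_CLASSES = {
    "alnum": set(string.ascii_letters + string.digits),
    "alpha": set(string.ascii_letters),
    "blank": {" "},                # tab never appears unescaped
    "cntrl": set(),                # control characters are always escaped
    "digit": set(string.digits),
    "graph": PRINTABLE - {" "},
    "lower": set(string.ascii_lowercase),
    "print": set(PRINTABLE),
    "punct": set(string.punctuation),
    "space": {" "},
    "upper": set(string.ascii_uppercase),
    "xdigit": set(string.hexdigits),
}


def _starts_range(pattern: str, i: int) -> bool:
    """True if pattern[i] is a '-' that is followed by a range endpoint."""
    return pattern[i:i + 1] == "-" and pattern[i + 1:i + 2] not in ("", "]")


def parse_class(pattern: str, start: int = 0):
    """Parse the class opening at pattern[start].

    Returns (members, negated, index just past the closing ']').
    """
    if pattern[start:start + 1] != "[":
        raise ValueError("expected '['")
    i = start + 1
    negated = pattern[i:i + 1] == "!"
    if negated:
        i += 1
    members = set()
    first = True
    while i < len(pattern):
        ch = pattern[i]
        if ch == "]" and not first:
            return members, negated, i + 1
        first = False                      # a leading ']' is a literal member
        nxt = pattern[i + 1:i + 2]
        if ch == "[" and nxt in (".", "="):
            raise ValueError("[. and [= are not allowed")
        if ch == "[" and nxt == ":":
            end = pattern.find(":]", i + 2)
            name = pattern[i + 2:end] if end != -1 else None
            if name not in POSIX_CLASSES:
                raise ValueError("unknown POSIX character class")
            members |= POSIX_CLASSES[name]
            i = end + 2
        elif _starts_range(pattern, i + 1):
            low, high = ch, pattern[i + 2]
            if high == "[" and pattern[i + 3:i + 4] == ":":
                raise ValueError("a named class cannot end a range")
            if low > high:
                raise ValueError("range endpoints are out of ASCII order")
            members |= {chr(c) for c in range(ord(low), ord(high) + 1)}
            i += 3
        else:
            members.add(ch)                # includes '\', '-' at the edges, '!'
            i += 1
            continue
        # A named class or a range may not begin another range.
        if _starts_range(pattern, i):
            raise ValueError("a class or range cannot begin a range")
    raise ValueError("unterminated character class")
```

With this sketch, parse_class("[m-z1-3]") yields the characters m through z plus 1 through 3, while "[a-n-z]", "[0-[:alpha:]]" and "[!]" all raise ValueError. The last case fails because the ] is taken as a literal member and the class is never closed, which is why a class containing only ! cannot be written.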
Important notes
- Glob searches are not case sensitive.
- Globbing patterns are "anchored" front and back by default. (This is a major difference from FCRE.)
- All hostnames (rrnames) in the DNS dataset end in a ., which must be accounted for in globs.
  - Therefore, a search for *.com will not match any hostnames. A glob that searches in rrnames must end in something that matches a ., so *.com. would match what was intended.
- All well-formed rdata we currently index in the DNS dataset ends in a . or a ", which should be accounted for in globs.
  - Therefore, a glob that searches in rdata should end in something that matches a . or a ".
- There must be at least two consecutive non-wildcard characters in the pattern. The implicit front and back anchor counts as a non-wildcard character.
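The last rule can be checked before a query is sent. The sketch below is one possible reading of it, not the service's actual validation logic. The function names are hypothetical, and treating a whole character class as a single wildcard is an assumption that the guide does not confirm.

```python
def _skip_class(glob: str, i: int) -> int:
    """Return the index just past the character class starting at glob[i]."""
    j = i + 1
    if glob[j:j + 1] == "!":
        j += 1
    if glob[j:j + 1] == "]":           # a leading ']' is a literal member
        j += 1
    while j < len(glob):
        if glob.startswith("[:", j):   # skip a named class such as [:digit:]
            close = glob.find(":]", j + 2)
            if close == -1:
                break
            j = close + 2
        elif glob[j] == "]":
            return j + 1
        else:
            j += 1
    raise ValueError("unterminated character class")


def meets_minimum_literals(glob: str) -> bool:
    """True if two non-wildcards (anchors included) sit side by side."""
    kinds = [True]                     # implicit front anchor
    i = 0
    while i < len(glob):
        ch = glob[i]
        if ch in "*?":
            kinds.append(False)
            i += 1
        elif ch == "[":
            kinds.append(False)        # ASSUMPTION: a class counts as a wildcard
            i = _skip_class(glob, i)
        elif ch == "\\" and i + 1 < len(glob):
            kinds.append(True)         # escaped character is a literal
            i += 2
        else:
            kinds.append(True)
            i += 1
    kinds.append(True)                 # implicit back anchor
    return any(a and b for a, b in zip(kinds, kinds[1:]))


print(meets_minimum_literals("?."))    # True: the dot sits next to the back anchor
print(meets_minimum_literals("*a*"))   # False: the only literal is surrounded by wildcards
```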
Examples
- To match hostnames with a label containing the word "smoke":
  - Glob pattern: *smoke*
  - Search type: rrnames search
  - Example results:
    - smokeping.pdf.ac.
    - smoke.tesla.ac.
- To match hostnames with a label containing the word "cider" but not containing "hard":
  - Glob pattern: *cider*
  - Search type: rrnames search with exclude filter *hard*
  - Example results:
    - ciderpress.ca.
    - colombus.citycider2018.eventbrite.ca.
- To match hostnames with a label ending in "www" followed, somewhere later, by a label starting with "com":
  - Glob pattern: *www.*.com*
  - Search type: rrnames search
  - Example results:
    - www.example.com.
    - dev-www.subdomain.example.com.
    - www.example.com.cdn.net.
    - stage-www.dev.community.org.
- To match hostnames starting with "www." and ending in ".com.":
  - Glob pattern: www.*.com.
  - Search type: rrnames search
  - Example results:
    - www.example.com.
    - www.subdomain.example.com.
- To match hostnames starting with "www." and ending with ".com" with no other dots in between:
  - This cannot be done in a general way using globs; use a regular expression instead.
- To match hostnames starting with "www", optionally preceded by a "dev-" or "stage-" prefix, in a .net or .edu domain:
  - This cannot be done in a general way using globs; use a regular expression instead.
- To match TXT records encoding an SPF policy with a ~all default:
  - Glob pattern: "v=spf1 * ~all"
  - Search type: rdata search
  - Example results:
    - "v=spf1 a mx ~all"
    - "v=spf1 a 10.2.0.0/16 ~all"
- To match single character domain names (which are really two character domain names when you add the implicit trailing '.'):
  - Glob pattern: ?.
  - Search type: rrnames or rdata search
  - Example results:
    - a.
    - 0.
- To match "bri" followed by exactly any three characters followed by "morning" followed by anything (or nothing):
  - Glob pattern: bri???morning*
  - Search type: rrnames search
  - Note: A question mark matches exactly one character
  - Example results:
    - brightmorning.com.
    - brightmorningtoday.com.
- To match "ns" followed by any single digit followed by anything (or nothing) and ending in ".net.":
  - Glob pattern: ns[0-9]*.net.
  - Search type: rrnames search
  - Example results:
    - ns0.fsi.net.
    - ns0abc.fsi.net.
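For the two cases above that globs cannot express, a regular expression is needed. The sketch below uses Python's re module purely for illustration. FCRE syntax and anchoring may differ, the pattern names are hypothetical, and reading "www" as a complete label is an assumption.

```python
import re

# "www." then exactly one more label, then ".com." (no other dots in between).
NO_EXTRA_DOTS = re.compile(r"www\.[^.]+\.com\.", re.IGNORECASE)

# Optional "dev-" or "stage-" prefix, then "www.", ending in ".net." or ".edu."
PREFIXED_WWW = re.compile(r"(?:dev-|stage-)?www\..+\.(?:net|edu)\.", re.IGNORECASE)

# fullmatch() anchors the expression at both ends of the hostname.
print(bool(NO_EXTRA_DOTS.fullmatch("www.example.com.")))       # True
print(bool(NO_EXTRA_DOTS.fullmatch("www.sub.example.com.")))   # False
print(bool(PREFIXED_WWW.fullmatch("stage-www.example.edu.")))  # True
print(bool(PREFIXED_WWW.fullmatch("test-www.example.net.")))   # False
```

The glob www.*.com. cannot exclude the second hostname because * also matches dots, whereas the negated class [^.] in the regular expression does not.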