$regex¶
- $regex¶
The $regex operator provides regular expression capabilities for pattern matching strings in queries. MongoDB uses Perl compatible regular expressions (i.e. “PCRE.”)
You can specify regular expressions using regular expression objects (i.e. /pattern/ ) or using the $regex operator. The following examples are equivalent:
db.collection.find( { field: /acme.*corp/i } ); db.collection.find( { field: { $regex: 'acme.*corp', $options: 'i' } } );
These expressions match all documents in collection where the value of field matches the case-insensitive regular expression acme.*corp.
- $options¶
$regex provides four option flags:
i toggles case insensitivity, and allows all letters in the pattern to match upper and lower cases.
m toggles multiline regular expression. Without this option, all regular expression match within one line.
If there are no newline characters (e.g. \n) or no start/end of line construct, the m option has no effect.
x toggles an “extended” capability. When set, $regex ignores all white space characters unless escaped or included in a character class.
Additionally, it ignores characters between an un-escaped # character and the next new line, so that you may include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern.
The x option does not affect the handling of the VT character (i.e. code 11.)
s allows the dot (e.g. .) character to match all characters including newline characters.
Only the i and m options are available for the native JavaScript regular expression objects (e.g. /acme.*corp/i). To use x and s options, you must use the “$regex” operator with the “$options” syntax.
To combine a regular expression match with other operators, you need to use the “$regex” operator. For example:
db.collection.find( { field: { $regex: /acme.*corp/i, $nin: [ 'acmeblahcorp' ] } } );
This expression returns all instances of field in collection that match the case insensitive regular expression acme.*corp that don’t match acmeblahcorp.
Behavior¶
PCRE Matching Engine¶
$regex uses “Perl Compatible Regular Expressions” (PCRE) as the matching engine.
$in Expressions¶
To include a regular expression in an $in query expression, you can only use JavaScript regular expression objects (i.e. /pattern/ ). You cannot use $regex operator expressions inside an $in.
Index Use¶
If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.
A regular expression is a “prefix expression” if it starts with a caret (^) or a left anchor (\A), followed by a string of simple symbols. For example, the regex /^abc.*/ will be optimized by matching only against the values from the index that start with abc.
Additionally, while /^a/, /^a.*/, and /^a.*$/ match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however, /^a.*/, and /^a.*$/ are slower. /^a/ can stop scanning after matching the prefix.