Regular Expression
What is RegEx?
Regular expressions (RegEx) can be used independently or with any of the search functionality in the basic search documentation to provide advanced capability, and can greatly enhance the power of your keyword searching. A RegEx search is placed within two slashes (“/”).
Why Use RegEx?
Regular expressions have powerful capabilities for searching complex data and custom log formats:
- Unlike Keyword searches, RegEx can perform partial and case-insensitive matching.
- RegEx search can be used to search events with special characters such as slashes (/) and double quotes (").
- RegEx can be combined with KVP and JSON searching to match fields with variable values. For example, if you were sending your Apache access logs to InsightOps in JSON format, you could use the remoteIP key to find any values that represent IPs from the country of Guinea (all IPs in Guinea are in the range 197.149.192.0 – 197.149.225.225):
remoteIP = /"197\.149\.[12][0-59][0-9]\.[0-2].*\/NX"/
Keyword Search
Keyword search will work on all logs regardless of their format. Keyword searches are case sensitive by default and will match a full string until it is delimited by a non-letter character. For example, given the log events below:
Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26263]: starting 0anacron
Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26272]: finished 0anacron
InsightOps will match the events by searching for “etc” or “run” because the text is delimited by whitespace and non-letter characters. InsightOps would not match “hour” but will match “hourly”.
Keyword search can be combined with logical operators, e.g. “starting OR finished” would return both log events.
Note: When you list a series of keywords, InsightOps automatically assumes an AND between each keyword. If you want to match an exact string, place double quotes around the search.
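The delimiting rule described above can be illustrated outside InsightOps. The following is a minimal Python sketch of one reading of that rule (runs of letters are tokens, everything else is a delimiter). The helper names `keyword_tokens` and `keyword_match` are hypothetical, and this is not InsightOps's actual tokenizer.

```python
import re

def keyword_tokens(event):
    """Split an event into runs of letters; every other character is a delimiter."""
    return set(re.findall(r"[A-Za-z]+", event))

def keyword_match(event, keyword):
    # Case-sensitive, whole-token comparison, as described for keyword search.
    return keyword in keyword_tokens(event)

event = "Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26263]: starting 0anacron"

# Try a few keywords against the first sample event.
matches = {kw: keyword_match(event, kw)
           for kw in ("etc", "run", "hour", "hourly", "Starting")}
print(matches)
```

Under this reading, “etc”, “run”, and “hourly” match, while “hour” (only part of a token) and “Starting” (wrong case) do not.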
Using RegEx Keyword Matching
Partial Matching
By default, regex search will return partial matches. The search below will match “complete,” “completely,” and “completed”:
where(/complete/)
Case Insensitive Search
Using a case-insensitive keyword search means that the search below will match Error, ERROR, error, and any other form of capitalization:
where(/error/i)
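To experiment with partial and case-insensitive matching away from the search bar, the same two behaviors can be reproduced with Python's `re` module. This is only an approximation: Python's engine is not the one InsightOps uses, and the word lists below are made up for illustration.

```python
import re

# Partial matching: re.search succeeds if the pattern occurs anywhere in the text.
words = ["complete", "completely", "completed", "compete"]
partial = [w for w in words if re.search(r"complete", w)]

# Case-insensitive matching: re.IGNORECASE plays the role of the /i flag.
levels = ["Error", "ERROR", "error", "warn"]
insensitive = [s for s in levels if re.search(r"error", s, re.IGNORECASE)]
sensitive = [s for s in levels if re.search(r"error", s)]

print(partial)      # words containing "complete"
print(insensitive)  # any capitalization of "error"
print(sensitive)    # lowercase "error" only
```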
Building With Regular Expression Operations
Regular expressions use special characters to enable searching for more advanced patterns. These characters are *, +, ., \, [, ], (, ), {, }, ^, and $.
If you need to use special characters as ordinary characters, you have to escape them with a backslash (\).
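The effect of escaping can be seen in a short Python sketch. It uses Python's `re` module as a stand-in for the InsightOps engine, which is an assumption, and `re.escape` is a Python convenience rather than an InsightOps feature.

```python
import re

# An unescaped dot is a wildcard; an escaped dot matches only a literal ".".
dot_is_wildcard = re.search(r"cron.hourly", "cronXhourly") is not None
dot_is_literal = re.search(r"cron\.hourly", "cronXhourly") is not None

# Text full of special characters: ( ) [ ] . and -
literal = "run-parts(/etc/cron.hourly)[26263]"
event = "Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26263]: starting 0anacron"

# Used as-is, the parentheses and brackets are read as operators, so it fails.
unescaped_hit = re.search(literal, event)

# re.escape adds the backslashes needed to treat every character literally.
escaped_hit = re.search(re.escape(literal), event)

print(dot_is_wildcard, dot_is_literal)
print(unescaped_hit is None, escaped_hit.group())
```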
Matching something a number of times
Name | Example | Description |
---|---|---|
Any | a* | Star will match zero or more of the previous character |
At least one | a+ | Plus matches at least one repetition of the previous character |
Exactly | a{x} | Matches exactly x repetitions of the previous character |
From, to | a{x,y} | Matches between x and y repetitions of the previous character |
Up to | a{,y} | Matches up to y repetitions of the previous character |
At least | a{x,} | Matches at least x repetitions of the previous character |
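The repetition operators in the table can be tried against a handful of strings. The sketch below uses Python's `re` module, which is assumed here to behave like the InsightOps engine for these operators. Engines differ on the `{,y}` form in particular, so treat that row as the one most worth verifying in the product itself. The helper `matching` is hypothetical.

```python
import re

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

candidates = ["ac", "abc", "abbc", "abbbc", "abbbbc"]
patterns = ["ab*c", "ab+c", "ab{2}c", "ab{1,3}c", "ab{,2}c", "ab{2,}c"]

# Map each pattern to the candidate strings it accepts.
results = {p: matching(p, candidates) for p in patterns}

for pattern, hits in results.items():
    print(f"{pattern:10} {hits}")
```

`re.fullmatch` is used instead of a partial search so that each result reflects the repetition count alone.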
Match character set
Name | Example | Description |
---|---|---|
Any character | . | Dot matches any single character |
Any digit | \d | Matches a digit character that is 0-9 |
Any whitespace | \s | Matches any whitespace character |
Anything but a digit | \D | Matches any character that is not a digit |
Anything but a whitespace | \S | Matches any character except for whitespace |
Given set | [abc] | Matches any of the characters specified |
Anything but the given set | [^abc] | Matches any character except for those specified |
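The character classes in the table can be compared side by side in the same way. This is a small Python sketch, again assuming Python's `re` module is a fair stand-in for these basic classes. The candidate strings and the helper `matching` are illustrative only.

```python
import re

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

candidates = ["abc", "a0c", "a c", "ac", "a_c"]
patterns = [r"a.c", r"a\dc", r"a\sc", r"a\Dc", r"a\Sc", r"a[b0]c", r"a[^b0]c"]

# Map each pattern to the candidate strings it accepts.
results = {p: matching(p, candidates) for p in patterns}

for pattern, hits in results.items():
    print(f"{pattern:10} {hits}")
```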
Regular Expression Flags
Flags can change the default behavior of a RegEx search. They are specified at the end of a RegEx search, after the closing slash (“/”).
- /i case-insensitive: disables case sensitivity; the default is to be case sensitive
- /m multiline: enables the special characters for start (^) and end ($) to match individual lines of a multiline log event; the default is to match only the start and end of the log event
- /s new lines: enables the special character (.) to match new lines
- /U ungreedy: by default RegEx is greedy, i.e. the search tries to match the maximum number of characters. This flag makes the search ungreedy, i.e. it tries to match the fewest characters that satisfy the search parameters
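The flags above can be explored with a rough Python equivalent. Python's `re` module has `IGNORECASE`, `MULTILINE`, and `DOTALL` counterparts for /i, /m, and /s, but it has no ungreedy flag, so the last part of this sketch substitutes the lazy quantifier `*?` to show the same idea. The sample event text is invented.

```python
import re

event = "ERROR first line\nerror second line"

# /i: case-insensitive matching.
case_sensitive = re.findall(r"error", event)
case_insensitive = re.findall(r"error", event, re.IGNORECASE)

# /m: ^ and $ also match at the start and end of each line.
start_of_event = re.findall(r"^error", event, re.IGNORECASE)
start_of_lines = re.findall(r"^error", event, re.IGNORECASE | re.MULTILINE)

# /s: "." also matches a newline.
dot_default = re.search(r"first.*second", event)
dot_newline = re.search(r"first.*second", event, re.DOTALL)

# Greedy versus ungreedy, shown with a lazy quantifier (no U flag in Python).
text = '"a" and "b"'
greedy = re.search(r'".*"', text).group()
ungreedy = re.search(r'".*?"', text).group()

print(case_sensitive, case_insensitive)
print(start_of_event, start_of_lines)
print(dot_default is None, dot_newline is not None)
print(greedy, ungreedy)
```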
Regular Expression Examples
InsightOps RegEx can be used independently or with any of the search functionality in the basic search documentation to provide advanced capability. A RegEx search is placed within two slashes (“/”) and can include optional flags such as “i”.
Expression | Description |
---|---|
/Null/ | Events that contain Null, such as NullPointerException |
/error/i | Events that contain error, case insensitive, such as Error, ERROR |
/Exception ".*" at/ | Events that contain an exception trace with a name |
/20[01]/ | Events that contain 200 or 201 |
ab*c | Matches strings ac, abc, abbc |
ab+c | Matches strings abc, abbc, but not ac |
ab{2}c | Matches abbc, but not abc or abbbc. |
ab{1,3}c | Matches abc, abbc, and abbbc |
ab{,2}c | Matches ac, abc, and abbc |
ab{2,}c | Matches abbc, abbbc, but not abc |
a.c | Matches strings abc, acc, adc, but not ac. |
a\d | Matches a0, a1, a2 |
a\sb | Matches a b. |
a\D | Matches strings ab, ac, but not a0 |
a\Sc | Matches strings abc, a0c, but not a c |
/completed/i | Matches completed in any capitalization, such as Completed and compLeted |
field=/regexp/ | Field’s value matches the regular expression |
field!=/regexp/ | Field’s value does not match the regular expression |
Regular Expression Field Extraction
Regex grouping and naming allow you to identify values in your log events and give these values a name, similar to having a key-value pair in your log events. You can then use this named capture group to perform more complex search functions.
Benefits
This gives you the ability to identify key pieces of information in your logs which are not in a Key Value Format such that search functions can be applied to the values in your logs. By assigning a name to the identified value(s), these values can be used with our advanced search functions such as GroupBy() or for calculating values such as counts, sums, averages or unique instance counts. They can also be used for comparisons when creating alerts. This means you can create a type of Key Value Pairing out of non-Key Value Pair log formats.
- Uses standard RE2 regex syntax for named capture groups
- Is not dependent on any log type or structure
- Removes requirement for data to be in KVP formats
- Can be used in queries and saved for creating dashboard items.
Named Capture Groups
A RegEx named capture group is declared by using the following syntax in your expression:
(?P<name>regExp)
The text matched by the expression to the right of the ‘>’ is assigned the name enclosed in the ‘< >’.
Consider the sample log events below that contain a specific value such as ‘total sale’:
12:12:14 new sales event – customer Tim – total sale 24.45 – item blanket
12:12:15 new sales event – customer Tim – total sale 100.45 – item jacket
12:12:16 new sales event – customer Tim – total sale 1000.33 – item computer
The Regular Expression to find the value following ‘total sale’ and assign this value to a named variable called ‘saleValue’ is:
where(/total sale (?P<saleValue>\d*)/)
Once you have captured the key you can then perform the full range of LEQL functions against it, for example:
where(/total sale (?P<saleValue>\d*)/) calculate(average:saleValue)
In this example, the regular expression assigns the name ‘saleValue’ to the digits that follow ‘total sale’. Once the digits are assigned to saleValue, an average calculation can be applied to them.
Regular expression field extraction is extremely useful in this scenario because the value is not in a Key Value Format (KVP), making it hard to tell most systems what value to use. By using Regex named capture group syntax, it is now easy to identify the value and assign it a name. This name is then used as part of the search query. It is also possible to save the query and then use it for creating a dashboard item.
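To see what the named capture group picks up from the sample events, the query can be approximated in Python, whose `re` module accepts the same `(?P<name>...)` syntax. This is a sketch, not how InsightOps evaluates the query. The helper `extract` and the second, wider pattern are assumptions added for illustration. Note that `\d*` matches digits only, so in Python it captures just the part of each amount before the decimal point.

```python
import re
from statistics import mean

events = [
    "12:12:14 new sales event – customer Tim – total sale 24.45 – item blanket",
    "12:12:15 new sales event – customer Tim – total sale 100.45 – item jacket",
    "12:12:16 new sales event – customer Tim – total sale 1000.33 – item computer",
]

def extract(pattern, lines):
    """Collect the non-empty saleValue capture from every line that matches."""
    values = []
    for line in lines:
        m = re.search(pattern, line)
        if m and m.group("saleValue"):
            values.append(float(m.group("saleValue")))
    return values

# The pattern from the query above: digits only.
whole_units = extract(r"total sale (?P<saleValue>\d*)", events)

# Hypothetical variant that also captures an optional decimal part.
full_amounts = extract(r"total sale (?P<saleValue>\d+(?:\.\d+)?)", events)

print(whole_units, mean(whole_units))
print(full_amounts, mean(full_amounts))
```

If the averages need to include the decimal part, a wider pattern along the lines of the second one may be worth testing in your own queries.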