Use a Search Language

InsightIDR allows you to use several different languages while searching through your logs:

Operators

InsightIDR supports both logical and comparison operators, which allow you to create more complex searches. The guide below will introduce both sets of operators available to use while constructing a query.

Logical Operators

InsightIDR supports the following logical operators to create comprehensive search criteria. Please note that when constructing a Search Query all operators should be typed in UPPERCASE.

Logical Operator    Example            Description
AND                 expr1 AND expr2    Returns log events that match both criteria
OR                  expr1 OR expr2     Returns log events that match one or both criteria
NOT                 expr1 NOT expr2    Returns log events that match expr1 but not expr2

Comparison Operators

Comparison operators can be used for Key Value Pairs (KVP) search and Regular Expression search.

Comparison Operator    Example        Description
=                      field=value    Returns log events that match the search value; matches numeric and text values
!=                     field!=value   Returns log events that do not match the search value; matches numeric and text values
>                      field>num      Returns log events with field values higher than the search value
>=                     field>=num     Returns log events with field values higher than or equal to the search value
<                      field<num      Returns log events with field values lower than the search value
<=                     field<=num     Returns log events with field values lower than or equal to the search value

NOTE: Numerical values must be formatted as an integer, as a floating-point value, or in scientific notation to be properly recognized by InsightIDR. Units are not calculated as part of the comparison. For example, searching for value<100bytes would not return an event with value=200bits, even though 200 bits is a smaller quantity than 100 bytes.
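The note above can be illustrated with a minimal Python sketch. This is not InsightIDR's parser: the exact number grammar in NUMBER and the helper names parse_number and less_than are assumptions made for illustration. The sketch only shows why a value with a unit suffix cannot take part in a numeric comparison.

```python
import re

# Assumed grammar: optional sign, integer or decimal digits, optional exponent.
NUMBER = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")


def parse_number(text):
    """Return text as a float if the whole string is a plain number, else None."""
    return float(text) if NUMBER.fullmatch(text) else None


def less_than(field_value, threshold):
    """Numeric '<' that is False whenever either side is not a plain number."""
    left, right = parse_number(field_value), parse_number(threshold)
    return left is not None and right is not None and left < right
```

Under these assumptions, less_than("25", "100") is True, while less_than("200bits", "100bytes") is False because neither string is a plain number.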

Keyword Search

Keyword search will work on all logs regardless of their format. Keyword searches are case sensitive by default and will match a full string until it is delimited by a non-letter character. For example:

Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26263]: starting 0anacron
Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26272]: finished 0anacron

InsightIDR will match the events by searching for “etc” or “run” because the text is delimited by whitespace and non-letter characters. InsightIDR would not match “hour” but will match “hourly.”

Keyword search can be combined with logical operators. For example, “starting OR finished” would return both log events, because each event contains one of the two keywords.

Note: When you list a series of keywords, InsightIDR automatically assumes an "AND" between each keyword. If you want to match an exact string, place double quotes around the search string.
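The delimiting rule can be approximated in Python. This is a rough sketch rather than InsightIDR's actual tokenizer: it assumes a keyword is a run of ASCII letters and that every other character is a delimiter, and the helper names keywords and matches are hypothetical.

```python
import re

EVENTS = [
    "Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26263]: starting 0anacron",
    "Apr 13 20:01:01 hostname run-parts(/etc/cron.hourly)[26272]: finished 0anacron",
]


def keywords(event):
    """Split an event into runs of letters; any non-letter acts as a delimiter."""
    return set(re.findall(r"[A-Za-z]+", event))


def matches(event, keyword):
    """Case-sensitive match of a keyword against the full delimited strings."""
    return keyword in keywords(event)


# Events containing either keyword, and events containing both keywords.
either = [e for e in EVENTS if matches(e, "starting") or matches(e, "finished")]
both = [e for e in EVENTS if matches(e, "starting") and matches(e, "finished")]
```

In this sketch "etc", "run", and "hourly" match the first event, "hour" does not, and the capitalized "Starting" does not match because the comparison is case sensitive.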

Regular Expression Keyword Search

Regular expressions can greatly enhance the power of your keyword searching. Regular expressions must be wrapped with two forward slashes (“/”). The two most common cases for regex search in InsightIDR are below.

Partial Matching

By default, regex search will return partial matches. The search below will match “complete,” “completely,” and “completed.”

where(/complete/)

Appending i after the closing slash makes the search case-insensitive, so the search below will match Error, ERROR, error, and any other form of capitalization.

where(/error/i)

Regular Expression Operators

Regular expressions use special characters to enable searching for more advanced patterns. These characters are *, +, ., \, [, ], (, ), {, and }. If you need to use a special character as an ordinary character, escape it with a backslash (\).

Regular Expression Field Extraction

Regex grouping and naming allows you to identify values in your log events and give these values a name, similar to having a key value pair in your log events. You can then use this named capture group to perform more complex search functions.

For example, what if you wanted to extract the IP address from the following raw log data?

<11>Mar 14 09:24:58 _hostname_ SSH: No User. Possible reasons: Invalid username, invalid license, error while accessing user database <SessionID=33711845, Listener=10.224.9.243:22, Client=13.91.103.73:1984, User=elasticsearch>

You can use this query to find the IP as a source_address:

where(/Client=(?P<source_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) groupby(source_address) calculate(count)
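To see what the named capture group extracts, the same pattern can be tried outside InsightIDR. The sketch below uses Python's re module, which accepts the same (?P<name>...) syntax. The helper count_by_source is hypothetical and only imitates the groupby and count steps; it is not how InsightIDR executes the query.

```python
import re
from collections import Counter

# Same pattern as in the query, without the surrounding slashes.
PATTERN = re.compile(r"Client=(?P<source_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")

EVENT = (
    "<11>Mar 14 09:24:58 _hostname_ SSH: No User. Possible reasons: Invalid username, "
    "invalid license, error while accessing user database <SessionID=33711845, "
    "Listener=10.224.9.243:22, Client=13.91.103.73:1984, User=elasticsearch>"
)


def count_by_source(events):
    """Count events per captured source_address, skipping events without a match."""
    counts = Counter()
    for event in events:
        match = PATTERN.search(event)
        if match:
            counts[match.group("source_address")] += 1
    return counts
```

The Listener address is not captured because the pattern is anchored on the literal text Client=.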

Benefits

Regular Expression Field Extraction lets you identify key pieces of information in logs that are not in a key value format, so that search functions can be applied to those values. Once a value has a name, you can use it with advanced search functions such as GroupBy(), use it to calculate counts, sums, averages, or unique instance counts, and use it in comparisons when creating alerts. In effect, you create key value pairs out of log formats that do not contain them. Regular Expression Field Extraction:

  • Uses standard RE2 regex syntax for named capture groups.
  • Is not dependent on any log type or structure.
  • Removes requirement for data to be in KVP formats.
  • Can be used in queries and saved for creating dashboard items.

A regex named capture group is declared by using the following syntax in your expression:

(?P<name>regExp)

The text matched by the expression to the right of the ‘>’ is assigned the name enclosed in the ‘< >’.

Consider the sample log events below that contain a specific value such as ‘total sale’:

12:12:14 new sales event – customer Tim – total sale 24.45 – item blanket
12:12:15 new sales event – customer Tim – total sale 100.45 – item jacket
12:12:16 new sales event – customer Tim – total sale 1000.33 – item computer

The regular expression to find the value following ‘total sale’ and assign this value to a named variable called ‘saleValue’ is:

where(/total sale (?P<saleValue>\d*)/)

Once you have captured the key you can then perform the full range of LEQL functions against it. For example:

where(/total sale (?P<saleValue>\d*)/) calculate(average:saleValue)

In the example, ‘saleValue’ is the name assigned to the digits that follow ‘total sale’. Once the digits are assigned to saleValue, an Average calculation can be applied to these numbers. Note that \d* matches digits only, so a value such as 24.45 is captured as 24.
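The capture and the average can be reproduced with Python's re module, which shares the named-group syntax. This is a sketch under assumptions: the helper average_capture is hypothetical, and the second pattern, \d+\.\d+, is an illustrative variant that also captures the decimal part; it does not come from the InsightIDR documentation.

```python
import re
from statistics import mean

EVENTS = [
    "12:12:14 new sales event – customer Tim – total sale 24.45 – item blanket",
    "12:12:15 new sales event – customer Tim – total sale 100.45 – item jacket",
    "12:12:16 new sales event – customer Tim – total sale 1000.33 – item computer",
]

DIGITS_ONLY = r"total sale (?P<saleValue>\d*)"        # pattern used in the text
WITH_DECIMALS = r"total sale (?P<saleValue>\d+\.\d+)"  # illustrative variant


def average_capture(events, pattern, group="saleValue"):
    """Average the named capture over all events where it matched a non-empty value."""
    values = []
    for event in events:
        match = re.search(pattern, event)
        if match and match.group(group):  # \d* may match an empty string
            values.append(float(match.group(group)))
    return mean(values)
```

With DIGITS_ONLY the captured values are 24, 100, and 1000; with WITH_DECIMALS they are 24.45, 100.45, and 1000.33, so the two averages differ slightly.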

Regular expression field extraction is extremely useful in this scenario because the value is not in a Key Value Format (KVP), making it hard to tell most systems what value to use. By using regex named capture group syntax, it is now easy to identify the value and assign it a name. This name is then used as part of the search query. It is also possible to save the query and use it for creating a dashboard item.

Advanced Capabilities

To learn more about our advanced regular expression search capabilities, please read the Regular Expression Log Search Documentation.

IP Search

InsightIDR supports classless inter-domain routing (CIDR) notation, which allows you to search for a range of IP addresses on your network without using complicated regular expressions. This means you can easily view the most active servers, users, and applications on your network.

Things to know about CIDR notation in InsightIDR

  • You can use this capability to search flow data generated by the Insight Network Sensor and any log data that contains IPv4 Addresses.
  • This requires a key=value search. IP() on its own does not work.
  • Allowed subnet values are /1 to /32.

In Log Search, enter a query with the following format:

  • Simple Search: destination_address = IP(192.168.0.0/24)
  • Advanced Search: where(destination_address = IP(192.168.0.0/24))

In these queries:

  • destination_address is the field in the log data you want to filter by
  • 192.168.0.0 is the IP address to use as the comparison
  • /24 is the block of addresses you want to search

The previous query would return any addresses in the range 192.168.0.1 to 192.168.0.254.

You can adjust the network range of your query by updating the subnet value. For example, replacing /24 with /16 would return any addresses in the range 192.168.0.1 to 192.168.255.254.
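The ranges quoted above can be checked with Python's standard ipaddress module. The sketch is independent of InsightIDR; the helper names in_block and host_range are hypothetical, and host_range assumes a prefix of /30 or shorter so that the block has distinct network and broadcast addresses.

```python
import ipaddress


def in_block(address, cidr):
    """True if the IPv4 address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


def host_range(cidr):
    """First and last host address of the block (network and broadcast excluded)."""
    network = ipaddress.ip_network(cidr)
    return str(network[1]), str(network[-2])
```

For example, host_range("192.168.0.0/24") gives 192.168.0.1 and 192.168.0.254, while host_range("192.168.0.0/16") gives 192.168.0.1 and 192.168.255.254. An address such as 192.168.1.77 is outside the /24 block but inside the /16 block.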

Key Value Pair and JSON Search

InsightIDR automatically parses log events that are in KVP or JSON format so that you can use advanced analytics on them. The KVP and JSON parsing documentation details the specific formats the system will parse. If your logs are not in a standard KVP or JSON format, you can use regular expression field extraction to gain access to the same advanced analytics.

Given the log events below

Searches can be easily written to return important log events

Search for all log events with a response time over 25 to return the first two log events, such as:

where(response_time>25)

You can then add the logical operator OR to include events from containerID 14 to return all three log events:

where(response_time>25 OR containerID=14)

Regular Expression Field Extraction

If your logs do not contain any KVPs, you can designate a KVP relationship for a given string using Regular Expression Field Extraction. This will give you access to all of the advanced search, analytic, and visualization capabilities available for KVP and JSON log format.

Analytic Functions and Visualizations

With our powerful LEQL functions, you are able to produce queries that will easily visualize your data without any preprocessing required.

Count

Log Search also supports returning a count of matched search results. Append calculate(COUNT) to your search query or use the calculate option to get the number of search results. An example can be seen below.

where(status=500) calculate(COUNT)

Sum

You can use the Sum function to total the values of your key value pairs. If you had a KVP for sale_total and wanted to know the total sales for a specified time period, you would use the following query:

where(sale_total>0) calculate(SUM:sale_total)

Average

The Average modifier is similar to the sum modifier, but it computes the mean of the values matching the search criteria. For instance, to get the average value of your sales, you might invoke a search like:

where(sale_total>0) calculate(AVERAGE:sale_total)

Count Unique

The Count Unique keyword returns an approximation of the number of unique values for a given key. It takes one parameter: the name of the key. For example, if you have the KVP userID in your log file and want to find the number of unique users, you should use the following query:

where(userID) calculate(UNIQUE:userID)

Min

The Min function will return the minimum value of the key for each time period. For example, the query below will return the shortest response time for each time period:

where(status=200) calculate(MIN:responseTime)

Max

The Max function will return the maximum value of the key for each time period. For example, the query below will return the longest response time for each time period:

where(status=200) calculate(MAX:responseTime)

Groupby

Read more about the groupby function and how to use it.

Timeslice

InsightIDR will automatically calculate ten equal time intervals when performing a count, min, max or average query. You can manually set the number of time intervals by using the timeslice function. The valid input for timeslice is a number between 1 and 200 (inclusive). The query below used against a one hour search period would return the count of 500 errors per minute.

where(status=500) calculate(count) timeslice(60)

You can also specify a unit of time: seconds, minutes, hours, or days. The query below would return the count of 500 errors per half hour, regardless of the time frame entered.

where(status=500) calculate(count) timeslice(30m)
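The relationship between the search period, the number of intervals, and the interval width is simple arithmetic, sketched below in Python. The helpers slice_width and slice_count are hypothetical names, and rounding a partial final interval up is an assumption, not documented InsightIDR behavior.

```python
import math
from datetime import timedelta


def slice_width(search_period, slices=10):
    """Width of each interval when the period is split into equal slices."""
    if not 1 <= slices <= 200:
        raise ValueError("timeslice must be between 1 and 200 (inclusive)")
    return search_period / slices


def slice_count(search_period, width):
    """Number of fixed-width intervals needed to cover the search period."""
    return math.ceil(search_period / width)
```

A one hour period with timeslice(60) gives intervals of one minute, and the default of ten intervals gives six minutes each. With a fixed width of 30 minutes, a 24 hour period is covered by 48 intervals.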

Percentile

The Percentile option allows you to exclude outliers from your search functions. In simple mode, you can select either a 95th or 99th percentile search function based on a key with a numeric value. In advanced mode, you can specify your own percentile value by using percentile(80):key_value_pair in your calculate function.
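To show how a percentile excludes outliers, here is a small Python sketch using the nearest-rank definition. This is one common definition and is an assumption: the source does not say how InsightIDR computes percentiles. The response-time values are hypothetical sample data.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p percent of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical response times with a single large outlier.
RESPONSE_TIMES = [12, 15, 11, 14, 13, 16, 12, 14, 13, 950]
```

For this sample, percentile(RESPONSE_TIMES, 80) is 15, whereas the maximum is 950, so the 80th percentile describes typical behavior without being dominated by the outlier.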

Bytes

The Bytes option lets you calculate the size of your logs in bytes. This is useful if you want to verify the size of the logs that you have sent to your account. A simple query that calculates the size of a given log is where(/.*/) calculate(bytes).

Standard Deviation

The Standard Deviation option lets you calculate the standard deviation of a given series of values. This is useful when trying to establish which values fall within normal variance of a given mean. An example use case for Standard Deviation is response times.

calculate(standarddeviation:service)

You can also use the keyword sd as a shortcut. For example: calculate(sd:service)