Search Your Logs
- Once you configure Foundational Event Sources, go to the "Log Search" page from the InsightIDR homepage.
- Select one or more logs or log sets to search.
- Build a Query to search for specific information, or Use a Search Language to search for different pieces of information.
- If you're having trouble creating a query, you can recreate an Example Query.
- View your data in Visual mode and Add Visual Cards to help.
- You can Create Dashboards and Reports from your queries, or you can Export log data.
- Create an Alert from specific log indicators, such as server error codes.
Search Your Logs Using the Groupby Function
Groupby is a function that allows you to perform search functions based on grouping identical datasets. For example, the following function groups logs with the key “status” from the HTTP status code in a web server request: where(status) groupby(status) calculate(count)
This function has two modes of operation:
- Literal count, which counts unique events in log lines
- Statistical approximation, an approximation for more than 10,000 unique events
This function also allows you to change results in two ways: by increasing the limit on the number of groups returned, and by sorting the returned results.
Depending on the underlying data and how you build your Groupby query, some queries may not return any data, even if they are valid queries in Log Search.
If the number of unique events in the data set is fewer than 10,000, Groupby uses Literal Count. Literal Count fetches every log line in the data set and counts each occurrence of unique elements, while also putting identical events into a group together.
Use the following query for a count of unique elements: calculate(unique:source_address)
In this query, source_address is the value you plan to group by.
When you are grouping by values such as unique identifiers, port numbers, or IP addresses, the number of unique values typically increases if you increase the time window.
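As a rough illustration of what Literal Count does conceptually, the following Python sketch counts every occurrence of each distinct value of a key and reports how many unique values there are. It is not how InsightIDR implements Groupby; the event records, the field names, and the literal_count helper are all hypothetical.

```python
from collections import Counter

# Hypothetical, already-parsed log events (illustrative data only).
events = [
    {"source_address": "10.0.0.1", "status": 200},
    {"source_address": "10.0.0.2", "status": 404},
    {"source_address": "10.0.0.1", "status": 200},
    {"source_address": "10.0.0.3", "status": 500},
    {"source_address": "10.0.0.1", "status": 404},
]


def literal_count(events, key):
    """Count each occurrence of every distinct value of `key`.

    Events that do not contain the key are skipped.
    """
    return Counter(event[key] for event in events if key in event)


groups = literal_count(events, "source_address")
unique_values = len(groups)  # number of distinct source addresses

print(groups.most_common())
print(unique_values)
```

Widening the time window in this sketch corresponds to adding more events to the list, which is why the number of distinct keys in the counter tends to grow.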
Statistical approximation allows you to define a limit of how many groups are returned during Log Search, which is useful to prevent long-running queries on massive amounts of data.
For statistical approximation, the Groupby function groups identical log lines and events into data sets. Use the following formula to determine the minimum group size: event_count / (limit*2)
The resulting value is the minimum number of times an event has to occur in order for it to appear in the Groupby result.
event_count refers to the number of unique events, such as the number of ports in firewall logs, or the number of IP addresses. limit refers to the limit() parameter from the query, with a maximum value of 10000 and a default value of 40 if you do not include the limit parameter.
A typical use case for the Groupby function is grouping firewall logs by source address in order to evaluate where most traffic is coming from.
For example, if you have 200,000 source IPs after running calculate(unique:source_address) with a limit of 10000, the formula would be: 200000 / (10000*2), which produces a statistical approximation of 10.
Therefore, the smallest bucket that will appear in your Groupby result will contain 10 items or more. Any event that falls below this threshold will not appear. If a particular source_address only shows up five times in the selected log data, it will not be included in your Groupby result.
Additionally, if the number of unique events is too high, the Groupby function will not return any results.
For example, if there are 35,000 unique elements, and the limit is set to the default value of 40, the formula would be: 35000 / (40*2), which produces a statistical approximation of 438 (437.5, rounded up).
Therefore, any event that occurs fewer than 438 times would not be included in the Groupby result. You can lower this threshold by increasing the limit parameter.
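The minimum group size formula can be worked through with a small Python sketch. The function name min_group_size is hypothetical, and rounding up to the next whole number is an assumption inferred from the 35,000 example, where 437.5 is reported as 438.

```python
DEFAULT_LIMIT = 40    # default when limit() is omitted, per the text
MAX_LIMIT = 10000     # maximum value of limit(), per the text


def min_group_size(event_count, limit=DEFAULT_LIMIT):
    """Return event_count / (limit * 2), rounded up (rounding is an assumption)."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError("limit must be between 1 and 10000")
    divisor = limit * 2
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-event_count // divisor)


print(min_group_size(35000))       # 438 with the default limit of 40
print(min_group_size(35000, 350))  # 50 once the limit is raised to 350
```

Raising the limit from 40 to 350 drops the threshold from 438 to 50 occurrences, which shows how a larger limit lets less frequent values appear in the result.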
If you are grouping by UUIDs or IP addresses, your query might return an empty chart. This can happen when you group by a category in which too few events occur to form a group, so no visualization is created.
Increase Groupby Limit
You can increase the number of groups returned by your Groupby query with the limit keyword by adding limit(n) at the end of your query, where n is a number between 1 and 10000.
Note that for a very large number of groups, the results are a non-deterministic approximation.
The following query sets a limit of 350:
where(status) groupby(status) calculate(count) sort(desc) limit(350).
In the advanced mode, you can sort returned results in ascending or descending order using a query similar to the following:
where(status>=300) groupby(status) calculate(count) sort(desc).
You can use desc or descending as keywords to sort in descending order, or ascending to sort in ascending order.
By default, Log Entry Query Language (LEQL) sorts results in descending order if you do not use a sort keyword in your query.
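To picture how sorting and the limit interact, here is a minimal Python sketch that orders groups by their count and keeps only the first few. It assumes that sorting is applied to the calculated count and that the limit is applied after sorting; neither detail is confirmed by this page. The sort_and_limit function and the sample counts are hypothetical.

```python
def sort_and_limit(group_counts, order="desc", limit=40):
    """Order groups by count and keep at most `limit` of them.

    Descending order is the default, mirroring the default described above.
    """
    if order in ("desc", "descending"):
        reverse = True
    elif order == "ascending":
        reverse = False
    else:
        raise ValueError(f"unsupported sort keyword: {order}")
    ranked = sorted(group_counts.items(), key=lambda item: item[1], reverse=reverse)
    return ranked[:limit]


# Hypothetical counts per HTTP status code (illustrative data only).
status_counts = {"200": 1200, "404": 35, "500": 7, "301": 90}

print(sort_and_limit(status_counts, limit=2))
print(sort_and_limit(status_counts, order="ascending", limit=2))
```

With these sample counts, the descending call keeps the two most frequent status codes and the ascending call keeps the two least frequent.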