Collector Overview
The Collector is the on-premises component of InsightIDR: a machine on your network running Rapid7 software that either polls Event Sources for data or receives data from them, and makes that data available for InsightIDR analysis. An Event Source represents a single device that sends logs to the Collector.
For example, if you have three firewalls, you will have one Event Source for each firewall in the Collector.
It is usually more efficient to deploy multiple Collectors throughout an environment than to make exceptions to firewall rules or overload a single Collector.
You may also need to spread log traffic across several Collectors if you have very high logging levels or if your network is geographically dispersed.
Advantages of the Collector
The Collector workflow has two main advantages over sending logs to InsightIDR directly: normalization and user attribution.
Normalization
Normalization transforms log data from multiple diverse sources into a common JSON format and extracts standard information such as hostnames, timestamps, and error levels. Normalization allows you to run more advanced queries on your endpoint logs and enhance your data visualization.
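To illustrate the idea, the following is a minimal sketch of normalization, not InsightIDR's actual parsers. It assumes two hypothetical raw log formats (a syslog-style line and a key=value line) and maps both onto the same JSON shape with timestamp, hostname, level, and message fields. The function names, field names, and sample log lines are all invented for this example.

```python
import json
import re
import shlex
from datetime import datetime, timezone

# Hypothetical raw formats for illustration; these are not InsightIDR parsers.
SYSLOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"(?P<host>\S+) (?P<level>[A-Za-z]+): (?P<msg>.*)$"
)


def normalize_syslog_style(line):
    """Parse '<ISO timestamp> <host> <LEVEL>: <message>' into the common shape."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    return {
        "timestamp": match["ts"],
        "hostname": match["host"].lower(),
        "level": match["level"].lower(),
        "message": match["msg"],
    }


def normalize_key_value(line):
    """Parse 'time=<epoch> device=<host> severity=<level> msg="..."' pairs."""
    fields = dict(
        token.split("=", 1) for token in shlex.split(line) if "=" in token
    )
    try:
        ts = datetime.fromtimestamp(int(fields["time"]), tz=timezone.utc)
    except (KeyError, ValueError):
        return None
    return {
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "hostname": fields.get("device", "").lower(),
        "level": fields.get("severity", "").lower(),
        "message": fields.get("msg", ""),
    }


# Two devices that log in different formats (sample data, invented).
raw_a = "2023-11-14T22:13:20Z FW-01 ERROR: connection refused"
raw_b = 'time=1700000000 device=FW-02 severity=warn msg="port scan detected"'

events = [normalize_syslog_style(raw_a), normalize_key_value(raw_b)]
print(json.dumps(events, indent=2))
```

Because both events end up with the same keys, a single query such as "all events where level is error" can run across every source without knowing each device's native format.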
User Attribution
User attribution correlates endpoint activity with the individual users who were logged into applications on that endpoint. Attribution provides a fuller picture of your security posture because user accounts are the most common targets for sophisticated attacks.
If you decide to use the Collector, endpoint information can take up to 5 minutes to appear in InsightIDR. Consider Custom Logs if real-time visibility of logs is a critical priority.
For InsightIDR to apply user attribution, the event source must be supported. InsightIDR must also have reliable data to recognize the asset by IP address and the user by the user field in the log data. This is often achieved with the Insight Agent and a DHCP event source.
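The sketch below shows, under stated assumptions, why attribution needs both pieces of data: lease records to resolve an IP address to an asset at a given time, and a user field in the log event. It is not InsightIDR's implementation. The Lease class, the event field names (source_ip, user, timestamp), and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lease:
    """Hypothetical DHCP lease record: which host held an IP, and when."""
    ip: str
    hostname: str
    start: int  # epoch seconds, inclusive
    end: int    # epoch seconds, exclusive


def asset_for_ip(leases, ip: str, ts: int) -> Optional[str]:
    """Return the hostname that held `ip` at time `ts`, if any lease covers it."""
    for lease in leases:
        if lease.ip == ip and lease.start <= ts < lease.end:
            return lease.hostname
    return None


def attribute(event: dict, leases) -> Optional[dict]:
    """Tie an event to a user and an asset; return None if either is unknown."""
    asset = asset_for_ip(leases, event["source_ip"], event["timestamp"])
    user = event.get("user")
    if asset is None or not user:
        return None
    return {
        "user": user.lower(),
        "asset": asset,
        "timestamp": event["timestamp"],
    }


# The same IP address is reassigned from one laptop to another (invented data).
leases = [
    Lease("10.0.0.5", "laptop-01", 1000, 2000),
    Lease("10.0.0.5", "laptop-02", 2000, 3000),
]

event = {"timestamp": 2500, "source_ip": "10.0.0.5", "user": "JSmith"}
print(attribute(event, leases))
```

The time-bounded lookup matters because an IP address alone does not identify a machine once addresses are reassigned. Without lease data or a user field, the sketch returns None rather than guessing.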
Account Requirements
When setting up the Collector, you should be aware that:
- InsightIDR ingests data from existing sources in your environment. To pull data from these sources, or to push data to log aggregators, InsightIDR needs administrator access, from a Domain Admin account if possible.
- You should treat your Collector(s) as you would any other valuable asset, because each Collector stores credentials for your event sources.
- InsightIDR normalizes and attributes data on AWS but does not store credentials. The Collector strips raw, unnecessary logs in your environment to prevent storage of sensitive data, such as personally identifiable information, medical records, and employee, organization, or asset names.