The Collector

The Collector is a machine on your network that gathers security data from your firewall, Active Directory server, and other network sources. While the Insight Agent is designed to work with your individual assets, the Collector takes log information and normalizes it via configured Data Sources.

Normalization transforms log data from multiple diverse sources into a common JSON format and extracts standard information such as hostnames, timestamps, and error levels. Normalization allows you to run more advanced queries on your endpoint logs and enhance your data visualization.
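
As a rough illustration of what normalization means, the sketch below parses a syslog-like line into a common JSON-style record. It is not the Collector's actual implementation: the line layout, the `normalize` function, and the output field names are all assumptions made for this example, and real Data Sources each have their own formats.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical line layout: "<ISO timestamp> <hostname> <LEVEL> <message>".
# Real data sources use their own formats; this pattern is only for illustration.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<hostname>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def normalize(line: str, source: str) -> dict:
    """Turn one raw log line into a common record (field names are assumed)."""
    raw = line.strip()
    unparsed = {"source": source, "raw": raw, "parsed": False}

    match = LINE_PATTERN.match(raw)
    if match is None:
        return unparsed

    fields = match.groupdict()
    try:
        ts = datetime.fromisoformat(fields["timestamp"])
    except ValueError:
        return unparsed
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        ts = ts.replace(tzinfo=timezone.utc)

    return {
        "source": source,
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "hostname": fields["hostname"].lower(),
        "level": fields["level"],
        "message": fields["message"],
        "parsed": True,
    }


if __name__ == "__main__":
    record = normalize(
        "2024-01-15T10:30:00+02:00 FW01 ERROR Connection dropped", "firewall"
    )
    print(json.dumps(record, indent=2))
```

Because every record shares the same keys regardless of where the line came from, a query such as "all ERROR events for a given hostname" can run across sources without per-source parsing.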

Note that with the Collector, there can be a delay of up to 5 minutes before endpoint information appears in InsightOps. Consider the ‘Add Log’ workflow if real-time visibility of logs is a critical priority.


Technical Requirements

Before you install the Rapid7 Collector service, verify that your system meets the following requirements.

System requirements

You can install the Collector service on a network server or virtual machine that meets the following requirements:

  • Operating system - Linux 64-bit or Windows 64-bit
  • Minimum hardware - 4 GB RAM and 60 GB disk space
  • CPU - 2 cores at 2 GHz or faster per core

Only one Collector can be installed per machine on your network. Rapid7 strongly recommends that the machine (physical or virtual) is dedicated to running the Collector.
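
The hardware minimums above can be checked with a short pre-install script. The following is a minimal sketch, not a Rapid7 tool: the function names are hypothetical, the RAM and clock speed values must be supplied by you (the Python standard library does not expose them portably), and it assumes the 60 GB figure refers to free disk space.

```python
import os
import shutil

# Minimums taken from the system requirements list above.
MIN_RAM_GB = 4
MIN_DISK_GB = 60
MIN_CORES = 2
MIN_CLOCK_GHZ = 2.0


def check_requirements(ram_gb, disk_gb, cores, clock_ghz):
    """Return a list of unmet requirements; an empty list means all are met."""
    failures = []
    if ram_gb < MIN_RAM_GB:
        failures.append(f"RAM: {ram_gb} GB < {MIN_RAM_GB} GB")
    if disk_gb < MIN_DISK_GB:
        failures.append(f"Disk: {disk_gb:.1f} GB < {MIN_DISK_GB} GB")
    if cores < MIN_CORES:
        failures.append(f"CPU cores: {cores} < {MIN_CORES}")
    if clock_ghz < MIN_CLOCK_GHZ:
        failures.append(f"CPU clock: {clock_ghz} GHz < {MIN_CLOCK_GHZ} GHz")
    return failures


def local_disk_and_cores(path="."):
    """Read free disk space (in GB) and logical core count for this machine."""
    # Assumption: the disk requirement is interpreted as free space on `path`.
    free_gb = shutil.disk_usage(path).free / 1024**3
    cores = os.cpu_count() or 0
    return free_gb, cores


if __name__ == "__main__":
    disk_gb, cores = local_disk_and_cores()
    # RAM and clock speed are placeholders; replace them with your machine's values.
    problems = check_requirements(ram_gb=4, disk_gb=disk_gb, cores=cores, clock_ghz=2.0)
    print("All minimums met" if not problems else "\n".join(problems))
```

Note that `os.cpu_count()` reports logical cores, which can exceed the number of physical cores on machines with simultaneous multithreading.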

Account requirements

When setting up the Collector machine, you should be aware that:

  • InsightOps ingests data from existing sources in your environment. InsightOps needs administrator access to these sources.
  • While privileged accounts can be difficult to obtain due to internal controls, it is strongly recommended to obtain a Domain Admin service account for easier configuration and more accurate results in InsightOps.
  • Treat your Collector(s) as you would any other highly valuable asset. Credentials for data sources will be stored on this device.
  • Credentials are not stored in AWS. The Collector strips unnecessary raw log data in your environment so that sensitive data, such as personally identifiable information and medical records, is not stored by Rapid7. Employee, organization, and asset names are also obfuscated in AWS.

Bandwidth requirements

  • Network bandwidth - 100 Mbps (recommended minimum), 1000 Mbps (strongly recommended)