Collector Requirements

⚠️

The machine with the collector software installed acts as a server

Before you install a collector, ensure that the machine that will run the collector software is a server. The machine's intended use is collecting data for the Command Platform (Insight Platform), and it should not be used for any other purpose.

To set up a collector, there are requirements that must be met. If you do not meet these requirements before attempting to set up a collector, it might not operate properly. Read all the sections and understand their importance to determine if deploying a collector is right for your organization.

If you already have Nexpose or Vulnerability Management (InsightVM) installed on your system, do not install the Insight Collector on an existing Nexpose Console or Nexpose Scan Engine, as this will cause issues with your Nexpose systems.

System and host requirements
Networking requirements
Data collection requirements
Endpoint data requirements
Collector placement and sizing
Important collector limitations

System and host requirements

You can install a Collector on a network server or virtual machine that meets the following requirements.

Minimum hardware:

4 CPU cores with 2GHz+ on each core
8 GB RAM recommended
60 GB+ available disk space
Configured with a Fully Qualified Domain Name (FQDN) such as idrcollector23.myorg.com

⚠️

Deploying the collector on ARM architecture, such as AWS Graviton, is not currently supported.

Read more about Collector Placement and Sizing.

Disk space

In some situations, a collector cannot establish a connection with the cloud and becomes unable to send data to the Command Platform (Insight Platform). Collector disk space allows it to temporarily store the data by writing logs to the disk until a connection is reestablished. If more disk space is available, your collector can store data for longer without a connection.

Because the Command Platform (Insight Platform) compresses the data it receives, Rapid7 recommends 1GB of disk space for every 10GB of data in the collector. Additionally, plan for at least 24 hours of spillover disk space for each collector when data cannot reach the cloud.
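The disk space guidance above can be turned into rough arithmetic. The following is a minimal sketch, assuming you can estimate your own daily log volume. The function name and the example figure of 200 GB per day are hypothetical; the 1:10 ratio and the 24-hour window come from the recommendation above.

```python
def spillover_disk_gb(daily_data_gb: float, outage_hours: float = 24.0) -> float:
    """Estimate spillover disk space (GB) needed to ride out an outage.

    Applies the guideline of 1 GB of disk for every 10 GB of data and
    scales the daily volume to the length of the outage.
    """
    if daily_data_gb < 0 or outage_hours < 0:
        raise ValueError("inputs must be non-negative")
    data_during_outage_gb = daily_data_gb * (outage_hours / 24.0)
    return data_during_outage_gb / 10.0


# Hypothetical example: 200 GB of logs per day and a 24-hour outage.
print(spillover_disk_gb(200))  # 20.0
```

Treat the result as a lower bound and add headroom for the operating system and the collector software itself.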

Supported operating systems

Refer to these tables to view the OS versions that the Collector currently supports and the End-Of-Life (EOL) schedule for each.

Microsoft Windows Server

Name	EOL for Collector support
Windows Server 2022	Oct 14, 2031
Windows Server 2019	Jan 9, 2029
Windows Server 2016	Jan 11, 2027

Linux

Distribution	Architecture	EOL for Collector support
Amazon Linux 2023	x86-64	Jun 30, 2029
Debian 11	x86-64	Jun 1, 2026
Debian 10	x86-64	Jun 1, 2024
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 9.1	x86-64	May 31, 2034
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 9.0	x86-64	May 31, 2034
Fedora 37	x86-64	Dec 12, 2023
Fedora 36	x86-64	May 16, 2023
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux / Alma Linux 9.1	x86-64	May 31, 2034
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux / Alma Linux 9.0	x86-64	May 31, 2034
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.7	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.6	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.5	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.4	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.3	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.2	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.1	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.0	x86-64	May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 7.0-7.9	x86-64	Jun 30, 2025
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 6.0-6.10	x86-64	Jun 30, 2024
SUSE Linux Enterprise Server 15	x86-64	Jul 31, 2028
SUSE Linux Enterprise Server 12	x86-64	Oct 31, 2024
Ubuntu 22.04 (Standard Support)	x86-64	Apr 2027
Ubuntu 22.04 (Pro Support)	x86-64	Apr 2032
Ubuntu 20.04	x86-64	Apr 2, 2030
Ubuntu 18.04 (Pro Support)	x86-64	Apr 2, 2028
Ubuntu 16.04	x86-64	Apr 2, 2026
Ubuntu 14.04	x86-64	Apr 2, 2024

End of life operating systems

Refer to these tables to view the OS versions that have reached end of life (EOL).

Microsoft Windows Server

Name	EOL for Collector support
Windows Server 2012	Oct 10, 2023

Linux

Distribution	Architecture	EOL for Collector support
Amazon Linux 2	x86-64	Jun 30, 2023
Amazon Linux 1	x86-64	Jun 30, 2023
Fedora 37	x86-64	Dec 12, 2023
Fedora 36	x86-64	May 16, 2023
SUSE Linux Enterprise Desktop 15 SP3	x86-64	Jan 31, 2023
openSUSE LEAP 15.4	x86-64	Nov 30, 2023
Ubuntu 22.10	x86-64	May 2, 2023

⚠️

PowerShell process for Windows OS

On Windows systems, the Collector must be capable of launching the PowerShell process locally in order to auto-configure event sources.

Supported browsers:

Mozilla Firefox (latest stable release)
Google Chrome (latest stable release)

Minimum network bandwidth:

100Mbps network (required)
1000Mbps (preferred)

Other recommendations:

Only one Collector can be installed on each machine. Rapid7 strongly recommends that the machine (physical or virtual) is dedicated to running the Collector.

❌

Conflict with Insight Collector and pre-existing Nexpose software

If you already have Nexpose installed in your organization, do not install the Insight Collector software on an existing Nexpose Console or Nexpose Scan Engine, as this will cause issues with your Nexpose systems.

Networking requirements

As you prepare your network for the Collector, consider these areas:

Internal routing rules
Ports
IP ranges
Credentials
Firewall rules

❌

SSL Decryption Exclusion

The Collector, as well as agents that use Collectors as a proxy to the Command Platform (Insight Platform), will not work if your organization decrypts SSL traffic using Deep Packet Inspection technologies such as transparent proxies.

Internal routing rules

The Collector polls and receives data from event sources, so you must provide the directory or file location where the Collector can access the server logs. You can specify a local folder path or a Windows Universal Naming Convention (UNC) path to a hosted network drive.

Ports

A Collector requires each stream of syslog logs to be sent to it on a unique TCP or UDP port.

Configure each device that sends logs over syslog to use a TCP or UDP port that is unique on that Collector. It is common to start at port 10000, although you can use any open, unique port.

All Collectors must be able to reach out on port 443. Rapid7 Agents (Insight Agents) and Endpoint Monitors must also be able to communicate back to the Collector over the following ports:

Port Number	Data Gathered
TCP 5508	Communication back to the Collector from the Rapid7 Agent (Insight Agent) or Endpoint Monitor.
TCP 6608	Upgrade agent data path for the Rapid7 Agent (Insight Agent).
TCP and UDP 8037	File upload for Rapid7 Agent (Insight Agent).
TCP 20,000 - 30,000	Communication back to the Collector from the Endpoint Monitor.

⚠️

Linux collector requirements

For Linux collectors, you must use ports higher than 1024.

Read our product documentation on ports used by SIEM (InsightIDR) for more information.
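Before you configure devices, it can help to check a planned port assignment against the rules in this section. The following is a minimal sketch, not a Rapid7 tool: the function name and the device names are hypothetical, and it makes the simplifying assumption that a port number should be unique regardless of whether it is used for TCP or UDP.

```python
from collections import Counter


def check_syslog_ports(assignments: dict[str, int], linux: bool = True) -> list[str]:
    """Return a list of problems found in a device-to-port plan.

    assignments maps a device name to the syslog port it will send to.
    """
    problems = []
    counts = Counter(assignments.values())
    for device, port in sorted(assignments.items()):
        if not 1 <= port <= 65535:
            problems.append(f"{device}: {port} is not a valid port number")
        elif linux and port <= 1024:
            problems.append(f"{device}: port {port} must be higher than 1024 on Linux")
        if counts[port] > 1:
            problems.append(f"{device}: port {port} is shared with another event source")
    return problems


# Hypothetical plan with one duplicated port and one low port.
plan = {"firewall-a": 10000, "switch-b": 10001, "proxy-c": 10000, "router-d": 514}
for problem in check_syslog_ports(plan):
    print(problem)
```

An empty list means the plan satisfies the uniqueness rule and, for Linux collectors, the port floor. The check does not confirm that a port is actually open on the collector host.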

IP ranges

Overlapping endpoint monitoring ranges are not allowed. IP addresses or IP ranges defined on Collector A should not be duplicated on Collector B. If any ranges overlap, update them before the migration. Otherwise, you will have to update those ranges manually after the migration.

See IP Addresses for more information.
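To spot ranges that are defined on more than one Collector, you can compare them with Python's standard `ipaddress` module. The following is a minimal sketch that assumes your ranges are written as CIDR blocks or single addresses; the function name and the collector names are hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges_by_collector: dict[str, list[str]]) -> list[tuple[str, str, str, str]]:
    """Return (collector_a, range_a, collector_b, range_b) for each overlap
    between ranges that belong to different collectors."""
    parsed = [
        (name, ipaddress.ip_network(cidr, strict=False))
        for name, cidrs in ranges_by_collector.items()
        for cidr in cidrs
    ]
    overlaps = []
    for (name_a, net_a), (name_b, net_b) in combinations(parsed, 2):
        # overlaps() raises TypeError for mixed IPv4/IPv6, so compare versions first.
        if name_a != name_b and net_a.version == net_b.version and net_a.overlaps(net_b):
            overlaps.append((name_a, str(net_a), name_b, str(net_b)))
    return overlaps


# Hypothetical ranges: 10.0.5.0/24 sits inside 10.0.0.0/16.
ranges = {
    "collector-a": ["10.0.0.0/16"],
    "collector-b": ["10.0.5.0/24", "192.168.1.0/24"],
}
print(find_overlaps(ranges))
```

Ranges written in start-end form (for example, a dash between two addresses) would need to be converted to CIDR blocks first; this sketch does not handle them.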

Credentials

Each collector can support only one set of endpoint monitoring credentials. Ensure you configure credentials for each collector instance on your network.

Firewall rules

Disable the local firewall on the collector host, if possible. For specific instructions, read our product documentation on firewall rules.

If you cannot disable the local firewall, follow these configurations.

All Collectors must be able to establish outbound connectivity on port 443 to *.endpoint.ingress.rapid7.com and communicate with the domains shown in the Data and Storage (S3) columns of the following table according to your geographic region. For example, for SIEM subscribers that elect to store their data in Australia, Collectors must be able to communicate with the following endpoints using port 443:

*.endpoint.ingress.rapid7.com
au.data.insight.rapid7.com
s3-ap-southeast-2.amazonaws.com

Region	Data endpoint	Storage (S3) endpoint	Storage (S3) upgrade location
United States - 1	`data.insight.rapid7.com`	`s3.amazonaws.com`	`storage.endpoint.ingress.rapid7.com`
United States - 2	`us2.data.insight.rapid7.com`	`s3.us-east-2.amazonaws.com`	`us2.storage.endpoint.ingress.rapid7.com`
United States - 3	`us3.data.insight.rapid7.com`	`s3.us-west-2.amazonaws.com`	`us3.storage.endpoint.ingress.rapid7.com`
Canada	`ca.data.insight.rapid7.com`	`s3.ca-central-1.amazonaws.com`	`ca.storage.endpoint.ingress.rapid7.com`
Europe	`eu.data.insight.rapid7.com`	`s3.eu-central-1.amazonaws.com`	`eu.storage.endpoint.ingress.rapid7.com`
Japan	`ap.data.insight.rapid7.com`	`s3-ap-northeast-1.amazonaws.com`	`ap.storage.endpoint.ingress.rapid7.com`
Australia	`au.data.insight.rapid7.com`	`s3-ap-southeast-2.amazonaws.com`	`au.storage.endpoint.ingress.rapid7.com`
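After you update your firewall rules, you may want a quick reachability check from the collector host. The following is a minimal sketch using only the Python standard library. The hostnames are copied from the Data and Storage (S3) columns of the table above; the function names are hypothetical. A plain TCP connection shows only that port 443 is reachable. It does not test the wildcard `*.endpoint.ingress.rapid7.com` entry, and it does not detect SSL decryption by an intermediate proxy.

```python
import socket

# Data and Storage (S3) endpoints by region, copied from the table above.
REGION_ENDPOINTS = {
    "United States - 1": ("data.insight.rapid7.com", "s3.amazonaws.com"),
    "United States - 2": ("us2.data.insight.rapid7.com", "s3.us-east-2.amazonaws.com"),
    "United States - 3": ("us3.data.insight.rapid7.com", "s3.us-west-2.amazonaws.com"),
    "Canada": ("ca.data.insight.rapid7.com", "s3.ca-central-1.amazonaws.com"),
    "Europe": ("eu.data.insight.rapid7.com", "s3.eu-central-1.amazonaws.com"),
    "Japan": ("ap.data.insight.rapid7.com", "s3-ap-northeast-1.amazonaws.com"),
    "Australia": ("au.data.insight.rapid7.com", "s3-ap-southeast-2.amazonaws.com"),
}


def required_hosts(region: str) -> list[str]:
    """Return the Data and Storage (S3) hostnames for a region."""
    data_host, storage_host = REGION_ENDPOINTS[region]
    return [data_host, storage_host]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(region: str) -> None:
    """Print the reachability of each required host for a region."""
    for host in required_hosts(region):
        status = "reachable" if can_connect(host) else "BLOCKED or unreachable"
        print(f"{host}:443 {status}")
```

For example, calling `report("Australia")` on the collector host checks the two Australian endpoints listed above.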

If you intend to deploy token-based Rapid7 Agents (Insight Agents) through your Collectors, you also need to allow outbound connectivity from each Collector on port 443 to the endpoint that provides the agent’s configuration files. Just like the Data and Storage endpoints in the previous table, you can configure your firewall rules to allow your Collectors to connect to a region-specific version of the Deployment endpoint to meet this requirement:

Region	Deployment endpoint
United States - 1	`us.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`
United States - 2	`us2.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`
United States - 3	`us3.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`
Canada	`ca.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`
Europe	`eu.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`
Japan	`ap.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`
Australia	`au.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files`

Data collection requirements

To plan your Collector deployment, have the following information available for each server or virtual machine where you will install the Collector:

Display name
Network location
Server hostname and IP address
Administrator rights to install a service on the server

Endpoint data requirements

The collection of endpoint data also uses resources on the Collector. Endpoint data can be collected either by using the Collector to scan a range of endpoints or by installing a Rapid7 Agent (Insight Agent) on the endpoints. Both methods use resources on the Collector.

The more endpoints the Collector needs to collect data from, the more resources it needs. If CPU utilization on the Collector is already consistently at 40% or higher, consider standing up another Collector at that location or adding more CPUs before you add more endpoint ranges to scan or more agents.

A Collector cannot support more than 600 endpoints or agents per CPU core. For example, a Collector with 4 CPU cores can handle up to 2,400 endpoints or agents, provided that the CPU is not already heavily utilized by event sources that have been added.
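The per-core limit lends itself to a quick calculation. The following is a minimal sketch with hypothetical function names. It applies only the limit of 600 endpoints or agents per CPU core and does not account for CPU load from event sources.

```python
ENDPOINTS_PER_CPU_CORE = 600  # limit stated in this section


def max_endpoints(cpu_cores: int) -> int:
    """Upper bound on endpoints or agents for a collector with cpu_cores cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return cpu_cores * ENDPOINTS_PER_CPU_CORE


def cores_needed(endpoints: int) -> int:
    """Smallest core count whose limit covers the given number of endpoints."""
    if endpoints < 0:
        raise ValueError("endpoints must be non-negative")
    # Ceiling division without floating point.
    return max(1, -(-endpoints // ENDPOINTS_PER_CPU_CORE))


print(max_endpoints(4))    # 2400
print(cores_needed(2401))  # 5
```

Note that `cores_needed` reports only the per-core limit; the minimum hardware requirement of 4 CPU cores still applies.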

The number of event sources and the number of endpoints from which you are collecting data determine how much RAM and how many CPUs the Collector needs. The more event sources and endpoints used for data collection, the more RAM and CPU the Collector needs to operate. The free disk space on the Collector is used only for the spillover of collected data. Under normal circumstances, the Collector sends all collected data to the cloud immediately for processing.

However, if the collector loses connectivity to the cloud or is operating under other abnormal conditions, it stores collected data in a spillover folder on its hard drive. The more free disk space you give the collector, the more spillover space it has available. It is often more efficient to deploy multiple collectors throughout the environment than to break firewall rules or overload a single collector.

Also, when scanning endpoints with a collector, each collector can be configured with only one set of credentials for the endpoint scanning. If different credentials are required for scanning endpoints, then you will need to use a separate collector for each credential that will be used.

Collector placement and sizing

When considering where to place your Collectors, keep in mind that your bandwidth and network architecture will influence the number of Collectors that you need in your organization and where you should place them. Generally, you should deploy the Collectors close to the logs that will be pulled or sent and close to the endpoints that they will be scanning.

Rapid7 recommends a maximum of 80 event sources for each Collector, depending on the following:

The size of the event sources being added
The amount of CPU and memory available to the Collector
The amount of VM resources available to the Collector
The amount of disk space available to the Collector

ℹ️

Event source distribution for collectors

The capacity of a collector depends on multiple factors. While the maximum recommended is 80 event sources for each collector, it can be more convenient to keep up to 50-60 event sources per collector to prevent data collection issues.

Distributing event sources over multiple collectors is always a good practice.

Collector Location Size	Number of Endpoints/Agents	Number of Event Sources on the Collector	Recommended Minimum CPU	Recommended Minimum RAM	Recommended Minimum Disk Space
Small	Up to 500	1 - 10	4	8 GB	60 GB
Medium	Up to 2,400	10 - 50	4	8 GB	80 GB
Large	Up to 600 per CPU core	50 - 80*	4+	16 GB	100 GB

*If you have more than 80 event sources, you should split your event sources across multiple collectors.

High-volume event sources place a higher RAM and CPU load on the collector and will result in the collector handling a lower number of event sources overall. Before adding a chatty event source like a firewall to the collector, check its current resource utilization (under Data Collection > Collectors).

If the CPU utilization is consistently more than 40%, consider adding another collector to the location to handle some of the event sources.
If the CPU utilization is consistently above 90%, then you need an additional collector to handle the load.
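The event source ceiling and the CPU thresholds above can be combined into a simple planning aid. The following is a minimal sketch with hypothetical function names and return strings. The 80 event source maximum and the 40% and 90% thresholds come from this section; how you measure "consistent" utilization is left to you.

```python
import math

MAX_EVENT_SOURCES_PER_COLLECTOR = 80  # recommended maximum from this section


def collectors_for_event_sources(
    event_sources: int, per_collector: int = MAX_EVENT_SOURCES_PER_COLLECTOR
) -> int:
    """Minimum number of collectors needed to stay at or below per_collector."""
    if event_sources < 0 or per_collector < 1:
        raise ValueError("invalid input")
    return max(1, math.ceil(event_sources / per_collector))


def cpu_advice(sustained_cpu_percent: float) -> str:
    """Map sustained CPU utilization to the guidance in this section."""
    if sustained_cpu_percent > 90:
        return "add a collector"
    if sustained_cpu_percent > 40:
        return "consider adding a collector"
    return "ok"


print(collectors_for_event_sources(200))      # 3
print(collectors_for_event_sources(200, 60))  # 4
print(cpu_advice(55))                         # consider adding a collector
```

Passing a lower `per_collector` value, such as 60, models a more conservative distribution of event sources.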

Important collector limitations

All collectors must be configured with a fully qualified domain name, for example, idrcollector1.myorg.com.
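A quick syntactic check can catch a host that reports only a short name. The following is a minimal sketch with a hypothetical function name. It checks only that a name has the shape of a fully qualified domain name; it does not confirm that the name resolves in DNS or that the Collector will accept it.

```python
import re

# One DNS label: letters, digits, and hyphens, with no leading or trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def looks_like_fqdn(name: str) -> bool:
    """Return True if name has at least two valid labels, such as host.domain."""
    name = name.removesuffix(".")  # allow a single trailing root dot
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2 or labels[-1].isdigit():
        return False  # short name, or something that looks like an IP address
    return all(_LABEL.fullmatch(label) is not None for label in labels)


print(looks_like_fqdn("idrcollector1.myorg.com"))  # True
print(looks_like_fqdn("idrcollector1"))            # False
```

On the collector host, you could pass the result of `socket.getfqdn()` from the standard library to this function to check the name that the operating system reports.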

For endpoint scanning, a collector can be configured with only one endpoint scanning credential. Therefore, if you have multiple domains or other requirements for separate credentials that need to be used for scanning different endpoint ranges, you should plan on a separate collector for each domain or set of credentials.

A collector installed on Linux is limited in the number of agents it can support because of default file descriptor settings. For most Linux systems, the default limit is 2000 agents. To increase the number of agents that can connect to a Linux Collector, set the number of file descriptors to twice the number of agents that you want the collector to handle. More information on the file descriptor settings can be found here: https://www.tecmint.com/increase-set-open-file-limits-in-linux
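The file descriptor rule is simple arithmetic. The following is a minimal sketch with hypothetical function names, using Python's Unix-only `resource` module. It reads the open-file limit of the process that runs the script, which is an assumption: the limit that applies to the collector service may be configured separately.

```python
import resource  # standard library, available on Unix-like systems only


def required_file_descriptors(agents: int) -> int:
    """File descriptors needed for the given agent count (twice the agents)."""
    if agents < 0:
        raise ValueError("agents must be non-negative")
    return 2 * agents


def current_limit_supports(agents: int) -> bool:
    """Check the current process's soft open-file limit against the requirement."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required_file_descriptors(agents)


# Hypothetical target of 3,000 agents.
print(required_file_descriptors(3000))  # 6000
```

If `current_limit_supports` returns False for your target agent count, raise the open-file limit as described in the linked article.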

If you already have Nexpose or Vulnerability Management (InsightVM) installed in your organization, do not install the Insight Collector software on an existing Nexpose Console or Nexpose Scan Engine as this will cause issues with your Nexpose systems.