Collector Requirements

The machine with the collector software installed acts as a server

Before you install a collector, ensure that the machine that will run the collector software is a server. It should be dedicated to collecting data for the Insight Platform and should not be used for any other purpose.

To set up a collector, there are requirements that must be met. If you do not meet these requirements before attempting to set up a collector, it might not operate properly. Read all the sections and understand their importance to determine if deploying a collector is right for your organization.

If you already have Nexpose or InsightVM installed on your system, do not install the Insight Collector on an existing Nexpose Console or Nexpose Scan Engine, as this will cause issues with your Nexpose systems.

System and host requirements

You can install a Collector on a network server or virtual machine that meets the following requirements.

Minimum hardware:

  • 4 CPU cores with 2GHz+ on each core
  • 8 GB RAM recommended
  • 60 GB+ available disk space
  • Configured with a Fully Qualified Domain Name (FQDN) such as idrcollector23.myorg.com

Deploying the collector on ARM architecture, such as AWS Graviton, is not currently supported.

Read more about Collector Placement and Sizing.

Disk space

In some situations, a collector cannot establish a connection with the cloud and becomes unable to send data to the Insight Platform. Collector disk space allows it to temporarily store the data by writing logs to the disk until a connection is reestablished. If more disk space is available, your collector can store data for longer without a connection.

Because the Insight Platform compresses the data it receives, Rapid7 recommends 1GB of disk space for every 10GB of data in the collector. Additionally, plan for at least 24 hours of spillover disk space for each collector when data cannot reach the cloud.
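
As a rough sketch of this sizing rule, the following Python function estimates how much spillover disk space to plan for from an expected daily data volume. The function name, its parameters, and the assumption that data arrives at a constant rate are illustrative and not part of the product.

```python
def spillover_disk_gb(daily_data_gb: float, outage_hours: float = 24.0) -> float:
    """Estimate disk space (GB) needed to buffer data during an outage.

    Assumes the documented ratio of 1 GB of disk per 10 GB of data and a
    constant ingest rate across the day (a simplifying assumption).
    """
    if daily_data_gb < 0 or outage_hours < 0:
        raise ValueError("inputs must be non-negative")
    data_during_outage_gb = daily_data_gb * (outage_hours / 24.0)
    return data_during_outage_gb / 10.0


# A collector handling 200 GB of data per day, buffered for 24 and 48 hours.
print(spillover_disk_gb(200))
print(spillover_disk_gb(200, 48))
```

Treat the result as a lower bound and add headroom on top of the minimum disk space listed above.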

Supported operating systems

Refer to these tables to view the OS versions that the Collector currently supports and the End-Of-Life (EOL) schedule for each.

Microsoft Windows Server

Name | EOL for Collector support
Windows Server 2022 | Oct 14, 2031
Windows Server 2019 | Jan 9, 2029
Windows Server 2016 | Jan 11, 2027

Linux

Distribution | Architecture | EOL for Collector support
Debian 11 | x86-64 | Jun 1, 2026
Debian 10 | x86-64 | Jun 1, 2024
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 9.1 | x86-64 | May 31, 2034
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 9.0 | x86-64 | May 31, 2034
Fedora 37 | x86-64 | Dec 12, 2023
Fedora 36 | x86-64 | May 16, 2023
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux / Alma Linux 9.1 | x86-64 | May 31, 2034
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux / Alma Linux 9.0 | x86-64 | May 31, 2034
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.7 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.6 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.5 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.4 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.3 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.2 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.1 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.0 | x86-64 | May 31, 2031
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 7.0-7.9 | x86-64 | Jun 30, 2025
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 6.0-6.10 | x86-64 | Jun 30, 2024
SUSE Linux Enterprise Server 15 | x86-64 | Jul 31, 2028
SUSE Linux Enterprise Server 12 | x86-64 | Oct 31, 2024
Ubuntu 22.04 (Standard Support) | x86-64 | Apr 2027
Ubuntu 22.04 (Pro Support) | x86-64 | Apr 2032
Ubuntu 20.04 | x86-64 | Apr 2, 2030
Ubuntu 18.04 (Pro Support) | x86-64 | Apr 2, 2028
Ubuntu 16.04 | x86-64 | Apr 2, 2026
Ubuntu 14.04 | x86-64 | Apr 2, 2024

End of life operating systems

Refer to these tables to view the OS versions that have reached end of life (EOL).

Microsoft Windows Server

Name | EOL for Collector support
Windows Server 2012 | Oct 10, 2023

Linux

Distribution | Architecture | EOL for Collector support
Amazon Linux 2 | x86-64 | Jun 30, 2023
Amazon Linux 1 | x86-64 | Jun 30, 2023
Fedora 37 | x86-64 | Dec 12, 2023
Fedora 36 | x86-64 | May 16, 2023
SUSE Linux Enterprise Desktop 15 SP3 | x86-64 | Jan 31, 2023
openSUSE LEAP 15.4 | x86-64 | Nov 30, 2023
Ubuntu 22.10 | x86-64 | May 2, 2023

PowerShell process for Windows OS

On Windows systems, the Collector must be capable of launching the PowerShell process locally in order to auto-configure event sources.

Supported browsers:

  • Mozilla Firefox (latest stable release)
  • Google Chrome (latest stable release)

Minimum network bandwidth:

  • 100Mbps network (required)
  • 1000Mbps (preferred)

Other recommendations:

  • Only one Collector can be installed for each machine on your network. Rapid7 strongly recommends that the machine (physical or virtual) is dedicated to running the Collector.

Conflict with Insight Collector and pre-existing Nexpose software

If you already have Nexpose installed in your organization, do not install the Insight Collector software on an existing Nexpose Console or Nexpose Scan Engine, as this will cause issues with your Nexpose systems.

Networking requirements

As you prepare your network for the Collector, consider these areas:

SSL Decryption Exclusion

The Collector, as well as agents that use Collectors as a proxy to the Insight Platform, will not work if your organization decrypts SSL traffic using Deep Packet Inspection technologies such as transparent proxies.

Internal routing rules

The Collector polls and receives data from event sources, so you should provide the directory or file location where the collector can access the server logs for collecting log data. You can specify a local folder path or a Windows Universal Naming Convention (UNC) path to a hosted network drive.

Ports

A Collector requires each stream of syslog logs to be sent to it on a unique TCP or UDP port.

You will need to configure each device that will send logs using syslog to send the logs over a TCP or UDP port that is unique on that collector. It is common to start sending the logs using port 10000, although you may use any open unique port.
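
One way to keep syslog ports unique is to plan the assignments before configuring any devices. The sketch below assigns consecutive ports starting at 10000 and skips any ports already in use. The function name, the `reserved` parameter, and the event source names are hypothetical; the check that ports are above 1024 reflects the requirement for Linux collectors.

```python
def assign_syslog_ports(sources, start_port=10000, reserved=frozenset()):
    """Give each syslog source its own port, counting up from start_port.

    `reserved` holds ports already in use on the Collector host.
    """
    if not 1024 < start_port <= 65535:
        raise ValueError("start_port must be above 1024 and at most 65535")
    assignments = {}
    port = start_port
    for name in sources:
        if name in assignments:
            raise ValueError(f"duplicate source name: {name}")
        while port in reserved:
            port += 1  # skip ports that something else already uses
        if port > 65535:
            raise ValueError("ran out of ports")
        assignments[name] = port
        port += 1
    return assignments


# Hypothetical event sources; port 10001 is assumed to be taken already.
print(assign_syslog_ports(["firewall", "dns", "vpn"], reserved={10001}))
```

The resulting mapping can then be used when configuring each device and its matching event source.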

All Collectors must be able to reach out on port 443. In addition, Insight Agents and Endpoint Monitors must be able to communicate back to the Collector on the following ports:

Port Number | Data Gathered
TCP 5508 | Communication back to the Collector from the Insight Agent or Endpoint Monitor.
TCP 6608 | Upgrade agent data path for the Insight Agent.
TCP and UDP 8037 | File upload for Insight Agent.
TCP 20,000 - 30,000 | Communication back to the Collector from the Endpoint Monitor.

Linux collector requirements

For Linux collectors, you must use ports higher than 1024.

Read our product documentation on ports used by InsightIDR for more information.

IP ranges

Overlapping endpoint monitoring ranges are not allowed: IP addresses or IP ranges defined on Collector A should not be duplicated on Collector B. If an overlap exists, update it before the migration. Otherwise, you will have to update those ranges manually after the migration.

See IP Addresses for more information.
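
To check that ranges defined on one Collector are not duplicated on another, you could compare them with Python's standard `ipaddress` module. This is a minimal sketch: the function name and the input format (a mapping from collector name to a list of CIDR ranges or single addresses) are assumptions, not an export format of the product.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges_by_collector):
    """Return (collector_a, range_a, collector_b, range_b) for every pair of
    ranges that overlap across two different collectors."""
    parsed = {
        name: [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]
        for name, cidrs in ranges_by_collector.items()
    }
    overlaps = []
    for (a, nets_a), (b, nets_b) in combinations(parsed.items(), 2):
        for net_a in nets_a:
            for net_b in nets_b:
                # Only compare ranges of the same IP version.
                if net_a.version == net_b.version and net_a.overlaps(net_b):
                    overlaps.append((a, str(net_a), b, str(net_b)))
    return overlaps


# Hypothetical ranges for two collectors.
print(find_overlaps({
    "Collector A": ["10.0.0.0/24"],
    "Collector B": ["10.0.0.128/25", "192.168.1.0/24"],
}))
```

An empty list means no range on one collector overlaps a range on another.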

Credentials

Each collector can support only one set of endpoint monitoring credentials. Ensure you configure credentials for each collector instance on your network.

Firewall rules

Disable the local firewall on the collector host, if possible. For specific instructions, read our product documentation on firewall rules.

If you cannot disable the local firewall, follow these configurations.

All Collectors must be able to establish outbound connectivity on port 443 to *.endpoint.ingress.rapid7.com and communicate with the domains shown in the Data and Storage (S3) columns of the following table according to your geographic region. For example, for InsightIDR subscribers that elect to store their data in Australia, Collectors must be able to communicate with the following endpoints using port 443:

  • *.endpoint.ingress.rapid7.com
  • au.data.insight.rapid7.com
  • s3-ap-southeast-2.amazonaws.com

Region | Data endpoint | Storage (S3 endpoint)
United States - 1 | data.insight.rapid7.com | s3.amazonaws.com
United States - 2 | us2.data.insight.rapid7.com | s3.us-east-2.amazonaws.com
United States - 3 | us3.data.insight.rapid7.com | s3.us-west-2.amazonaws.com
Canada | ca.data.insight.rapid7.com | s3.ca-central-1.amazonaws.com
Europe | eu.data.insight.rapid7.com | s3.eu-central-1.amazonaws.com
Japan | ap.data.insight.rapid7.com | s3-ap-northeast-1.amazonaws.com
Australia | au.data.insight.rapid7.com | s3-ap-southeast-2.amazonaws.com
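
When validating firewall rules, it can help to test outbound connectivity from the Collector host itself. The sketch below looks up the Data and Storage endpoints for a region and attempts a plain TCP connection on port 443. The function names are hypothetical, and a successful TCP connection only shows that the port is reachable; it does not prove that traffic is passing without SSL decryption. The wildcard domain *.endpoint.ingress.rapid7.com cannot be probed directly and is left out.

```python
import socket

# Data and Storage (S3) endpoints per region, copied from the table above.
REGION_ENDPOINTS = {
    "United States - 1": ("data.insight.rapid7.com", "s3.amazonaws.com"),
    "United States - 2": ("us2.data.insight.rapid7.com", "s3.us-east-2.amazonaws.com"),
    "United States - 3": ("us3.data.insight.rapid7.com", "s3.us-west-2.amazonaws.com"),
    "Canada": ("ca.data.insight.rapid7.com", "s3.ca-central-1.amazonaws.com"),
    "Europe": ("eu.data.insight.rapid7.com", "s3.eu-central-1.amazonaws.com"),
    "Japan": ("ap.data.insight.rapid7.com", "s3-ap-northeast-1.amazonaws.com"),
    "Australia": ("au.data.insight.rapid7.com", "s3-ap-southeast-2.amazonaws.com"),
}


def required_hosts(region):
    """Return the Data and Storage hostnames for a region."""
    data_host, s3_host = REGION_ENDPOINTS[region]
    return [data_host, s3_host]


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `[h for h in required_hosts("Australia") if not can_connect(h)]` lists the Australian endpoints that the host cannot currently reach.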

If you intend to deploy token-based Insight Agents through your Collectors, you also need to allow outbound connectivity from each Collector on port 443 to the endpoint that provides the agent's configuration files. Just like the Data and Storage endpoints in the previous table, you can configure your firewall rules to allow your Collectors to connect to a region-specific version of the Deployment endpoint to meet this requirement:

Region | Deployment endpoint
United States - 1 | us.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files
United States - 2 | us2.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files
United States - 3 | us3.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files
Canada | ca.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files
Europe | eu.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files
Japan | ap.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files
Australia | au.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files

Data collection requirements

To plan your Collector deployment, have the following information available for each server or virtual machine where you will install the Collector:

  • Display name
  • Network location
  • Server hostname and IP address
  • Administrator rights to install a service on the server

Endpoint data requirements

Collecting endpoint data also uses resources on the Collector. Endpoint data can be collected either by using the Collector to scan a range of endpoints or by installing a Rapid7 Insight Agent on the endpoints, and both methods consume Collector resources.

The greater the number of endpoints that the Collector needs to collect data from, the more resources it will need. If the CPU utilization is already consistently hovering at 40% or higher on the Collector, you should consider standing up another Collector at that location or adding more CPUs before adding additional endpoint ranges to scan or agents.

A Collector cannot support more than 600 endpoints or agents per CPU core. For example, a Collector with 4 CPU cores can handle up to 2,400 endpoints or agents, provided its CPU is not already heavily used by the event sources that have been added.
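
The per-CPU limit translates into simple arithmetic. The sketch below computes the endpoint ceiling for a given core count, and the smallest core count that covers a given number of endpoints or agents. The function names are illustrative, and the result ignores load from event sources, which lowers the practical capacity.

```python
ENDPOINTS_PER_CPU = 600  # limit stated in the text above


def max_endpoints(cpu_cores):
    """Upper bound on endpoints or agents for a Collector with cpu_cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return ENDPOINTS_PER_CPU * cpu_cores


def cores_needed(endpoints):
    """Smallest core count whose limit covers `endpoints` (ceiling division)."""
    if endpoints < 0:
        raise ValueError("endpoints must be non-negative")
    return max(1, -(-endpoints // ENDPOINTS_PER_CPU))


print(max_endpoints(4))    # 2400
print(cores_needed(2401))  # 5
```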

The number of event sources and the number of endpoints from which you are collecting data determine how much RAM and the number of CPUs the Collector needs. The more event sources and endpoints used for data collection, the more RAM and CPU the Collector will need to operate. The free disk space that the Collector has is used for the spillover of data collection only. Under normal circumstances, the Collector sends all data collected immediately to the cloud for processing.

However, if the collector loses connectivity to the cloud or is operating under other abnormal conditions, it will store collected data in a spillover folder on its hard drive. The more free disk space you give the collector, the more spillover space it will have available. It is often more efficient to deploy multiple collectors throughout the environment rather than break firewall rules or overload a single collector.

Also, when scanning endpoints with a collector, each collector can be configured with only one set of credentials for the endpoint scanning. If different credentials are required for scanning endpoints, then you will need to use a separate collector for each credential that will be used.

Collector placement and sizing

When considering where to place your Collectors, keep in mind that your bandwidth and network architecture will influence the number of Collectors that you need in your organization and where you should place them. Generally, you should deploy the Collectors close to the logs that will be pulled or sent and close to the endpoints that they will be scanning.

Rapid7 recommends a maximum of 80 event sources for each Collector, depending on the following:

  • The size of the event sources being added
  • The amount of CPU and memory available to the Collector
  • The amount of VM resources available to the Collector
  • The amount of disk space available to the Collector

Event source distribution for collectors

The capacity of a collector depends on multiple factors. While the recommended maximum is 80 event sources for each collector, in practice it is often better to keep each collector to 50-60 event sources to prevent data collection issues.

Distributing event sources over multiple collectors is always a good practice.

Collector Location Size | Number of Endpoints/Agents | Number of Event Sources on the Collector | Recommended Minimum CPU | Recommended Minimum RAM | Recommended Minimum Disk Space
Small | Up to 500 | 1 - 10 | 4 | 8 GB | 60 GB
Medium | Up to 2,400 | 10 - 50 | 4 | 8 GB | 80 GB
Large | Up to 600 per CPU core | 50 - 80* | 4+ | 16 GB | 100 GB

*If you have more than 80 event sources, you should split your event sources across multiple collectors.
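
The sizing guidance can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and because the event source ranges in the guidance share their boundary values (10 and 50), the code assumes that a boundary value belongs to the smaller size.

```python
def recommend_size(endpoints, event_sources, cpu_cores=4):
    """Map endpoint and event source counts to a collector size."""
    if endpoints <= 500 and event_sources <= 10:
        return "Small"
    if endpoints <= 2400 and event_sources <= 50:
        return "Medium"
    # Large collectors are limited to 600 endpoints per CPU core.
    if endpoints <= 600 * cpu_cores and event_sources <= 80:
        return "Large"
    return "Split across multiple collectors"


print(recommend_size(300, 5))
print(recommend_size(2000, 70))
print(recommend_size(1000, 90))
```

A result of "Large" implies at least 16 GB of RAM and 100 GB of disk space, per the guidance above.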

High-volume event sources place a higher RAM and CPU load on the collector and will result in the collector handling a lower number of event sources overall. Before adding a chatty event source like a firewall to the collector, check its current resource utilization (under Data Collection > Collectors).

  • If the CPU utilization is consistently more than 40%, consider adding another collector to the location to handle some of the event sources.
  • If the CPU utilization is consistently above 90%, then you need an additional collector to handle the load.
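
These thresholds can be checked against a series of CPU readings. The sketch below assumes that "consistently" means every sample in the window exceeds the threshold, which is one possible interpretation; the function name and the sample format (percentages taken at regular intervals) are also assumptions.

```python
def cpu_advice(samples):
    """Classify a window of CPU utilization samples (in percent)."""
    if not samples:
        raise ValueError("at least one sample is required")
    lowest = min(samples)  # every sample is at or above this value
    if lowest > 90:
        return "add a collector"
    if lowest > 40:
        return "consider adding a collector"
    return "ok"


print(cpu_advice([45, 60, 52]))
print(cpu_advice([95, 92, 99]))
print(cpu_advice([10, 80, 50]))
```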

Important collector limitations

All collectors must be configured with a fully qualified domain name, for example, idrcollector1.myorg.com.

For endpoint scanning, a collector can be configured with only one endpoint scanning credential. Therefore, if you have multiple domains or other requirements for separate credentials that need to be used for scanning different endpoint ranges, you should plan on a separate collector for each domain or set of credentials.

If you want to collect logs from a Check Point firewall, you must use a collector running on Windows. You cannot use a Linux Collector to collect Check Point firewall logs.

A collector installed on Linux is limited in the number of agents it can support by the default file descriptor settings. For most Linux systems, the default limit is 2000 agents. To increase the number of agents that can connect to a Linux Collector, set the number of file descriptors to twice the number of agents that you want the collector to handle. More information on the file descriptor settings can be found here: https://www.tecmint.com/increase-set-open-file-limits-in-linux
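
The file descriptor rule is a simple doubling. The sketch below computes the target value and reads the soft limit of the current process using Python's Unix-only `resource` module. The function names are illustrative, and the limit reported is that of the Python process running the check, which may differ from the limit applied to the Collector service.

```python
import resource  # standard library, available on Unix-like systems only


def required_file_descriptors(agents):
    """File descriptors to configure: twice the desired number of agents."""
    if agents < 0:
        raise ValueError("agents must be non-negative")
    return 2 * agents


def current_soft_limit():
    """Soft limit on open files for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


# Target for a Linux Collector expected to serve 3,000 agents.
print(required_file_descriptors(3000))
```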

If you already have Nexpose or InsightVM installed in your organization, do not install the Insight Collector software on an existing Nexpose Console or Nexpose Scan Engine as this will cause issues with your Nexpose systems.