Collector Requirements
The machine with the collector software installed acts as a server
Before you install a Collector, ensure that the host machine is a server. The machine should be dedicated to collecting data for the Insight Platform and should not be used for any other purpose.
To set up a collector, there are requirements that must be met. If you do not meet these requirements before attempting to set up a collector, it might not operate properly. Read all the sections and understand their importance to determine if deploying a collector is right for your organization.
If you already have Nexpose or InsightVM installed on your system, do not install the Insight Collector on an existing Nexpose Console or Nexpose Scan Engine, as this will cause issues with your Nexpose systems.
- System and host requirements
- Networking requirements
- Data collection requirements
- Endpoint data requirements
- Collector placement and sizing
- Important collector limitations
System and host requirements
You can install a Collector on a network server or virtual machine that meets the following requirements.
Minimum hardware:
- 4 CPU cores with 2GHz+ on each core
- 8 GB RAM recommended
- 60 GB+ available disk space
- Configured with a Fully Qualified Domain Name (FQDN) such as idrcollector23.myorg.com
Deploying the collector on ARM architecture, such as AWS Graviton, is not currently supported.
Read more about Collector Placement and Sizing.
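The hardware minimums above can be sketched as a small pre-install check. This is an illustrative script, not a Rapid7 tool: the function names are hypothetical, the FQDN test is a simple heuristic (it only looks for a dot in the hostname), and it does not check RAM or CPU clock speed because the Python standard library does not expose them portably.

```python
import os
import platform
import shutil
import socket

# Thresholds taken from the minimum hardware list above.
MIN_CPU_CORES = 4
MIN_FREE_DISK_GB = 60
# Machine names commonly reported on ARM hosts such as AWS Graviton (assumption).
UNSUPPORTED_ARCHES = {"aarch64", "arm64"}


def preflight(cpu_cores, free_disk_gb, fqdn, arch):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"{cpu_cores} CPU cores found, {MIN_CPU_CORES} required")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"{free_disk_gb:.0f} GB free disk, {MIN_FREE_DISK_GB} GB required")
    if "." not in fqdn:
        # Heuristic only: a name without a dot is unlikely to be fully qualified.
        problems.append(f"hostname {fqdn!r} does not look like an FQDN")
    if arch.lower() in UNSUPPORTED_ARCHES:
        problems.append(f"architecture {arch!r} is ARM, which is not supported")
    return problems


def gather_local_facts(path="/"):
    """Collect the values preflight() needs from the machine running this script."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return os.cpu_count() or 0, free_gb, socket.getfqdn(), platform.machine()
```

Running `preflight(*gather_local_facts())` on the intended host returns the list of problems to resolve before installation.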
Disk space
In some situations, a collector cannot establish a connection with the cloud and becomes unable to send data to the Insight Platform. Collector disk space allows it to temporarily store the data by writing logs to the disk until a connection is reestablished. If more disk space is available, your collector can store data for longer without a connection.
Because the Insight Platform compresses the data it receives, Rapid7 recommends 1GB of disk space for every 10GB of data in the collector. Additionally, plan for at least 24 hours of spillover disk space for each collector when data cannot reach the cloud.
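As a rough worked example of the sizing guidance above, the following sketch estimates spillover disk space from a daily data volume. The function name and parameters are hypothetical, and it assumes the 1:10 ratio applies uniformly to data buffered during an outage.

```python
def spillover_disk_gb(daily_data_gb, outage_hours=24.0, data_per_disk_gb=10.0):
    """Estimate the disk space (GB) needed to buffer data during an outage.

    daily_data_gb:    data the collector handles per day, in GB
    outage_hours:     how long the collector should survive without the cloud
    data_per_disk_gb: GB of data covered by 1 GB of disk (10 per the guidance above)
    """
    if daily_data_gb < 0 or outage_hours < 0 or data_per_disk_gb <= 0:
        raise ValueError("inputs must be non-negative and the ratio must be positive")
    buffered_gb = daily_data_gb * (outage_hours / 24.0)
    return buffered_gb / data_per_disk_gb
```

For a collector that handles 100 GB of data per day, `spillover_disk_gb(100)` gives 10 GB for a 24-hour outage, and `spillover_disk_gb(100, outage_hours=48)` gives 20 GB.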
Supported operating systems
Refer to these tables to view the OS versions that the Collector currently supports and the End-Of-Life (EOL) schedule for each.
Microsoft Windows Server
Name | EOL for Collector support |
---|---|
Windows Server 2022 | Oct 14, 2031 |
Windows Server 2019 | Jan 9, 2029 |
Windows Server 2016 | Jan 11, 2027 |
Linux
Distribution | Architecture | EOL for Collector support |
---|---|---|
Debian 11 | x86-64 | Jun 1, 2026 |
Debian 10 | x86-64 | Jun 1, 2024 |
Fedora 37 | x86-64 | Dec 12, 2023 |
Fedora 36 | x86-64 | May 16, 2023 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux / Alma Linux 9.1 | x86-64 | May 31, 2034 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux / Alma Linux 9.0 | x86-64 | May 31, 2034 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.7 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.6 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.5 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / Rocky Linux 8.4 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.3 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.2 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.1 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 8.0 | x86-64 | May 31, 2031 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 7.0-7.9 | x86-64 | Jun 30, 2025 |
Red Hat Enterprise Linux / Oracle Enterprise Linux / CentOS 6.0-6.10 | x86-64 | Jun 30, 2024 |
SUSE Linux Enterprise Server 15 | x86-64 | Jul 31, 2028 |
SUSE Linux Enterprise Server 12 | x86-64 | Oct 31, 2024 |
Ubuntu 22.04 (Standard Support) | x86-64 | Apr 2027 |
Ubuntu 22.04 (Pro Support) | x86-64 | Apr 2032 |
Ubuntu 20.04 | x86-64 | Apr 2, 2030 |
Ubuntu 18.04 (Pro Support) | x86-64 | Apr 2, 2028 |
Ubuntu 16.04 | x86-64 | Apr 2, 2026 |
Ubuntu 14.04 | x86-64 | Apr 2, 2024 |
End of life operating systems
Refer to these tables to view the OS versions that have reached end of life (EOL).
Microsoft Windows Server
Name | EOL for Collector support |
---|---|
Windows Server 2012 | Oct 10, 2023 |
Linux
Distribution | Architecture | EOL for Collector support |
---|---|---|
Amazon Linux 2 | x86-64 | Jun 30, 2023 |
Amazon Linux 1 | x86-64 | Jun 30, 2023 |
Fedora 37 | x86-64 | Dec 12, 2023 |
Fedora 36 | x86-64 | May 16, 2023 |
SUSE Linux Enterprise Desktop 15 SP3 | x86-64 | Jan 31, 2023 |
openSUSE LEAP 15.4 | x86-64 | Nov 30, 2023 |
Ubuntu 22.10 | x86-64 | May 2, 2023 |
PowerShell process for Windows OS
On Windows systems, the Collector must be capable of launching the PowerShell process locally in order to auto-configure event sources.
Supported browsers:
- Mozilla Firefox (latest stable release)
- Google Chrome (latest stable release)
Minimum network bandwidth:
- 100Mbps network (required)
- 1000Mbps (preferred)
Other recommendations:
- Only one Collector can be installed for each machine on your network. Rapid7 strongly recommends that the machine (physical or virtual) is dedicated to running the Collector.
Conflict with Insight Collector and pre-existing Nexpose software
If you already have Nexpose installed in your organization, do not install the Insight Collector software on an existing Nexpose Console or Nexpose Scan Engine, as this will cause issues with your Nexpose systems.
Networking requirements
As you prepare your network for the Collector, consider these areas:
SSL Decryption Exclusion
The Collector, as well as agents that use Collectors as a proxy to the Insight Platform, will not work if your organization decrypts SSL traffic using Deep Packet Inspection technologies such as transparent proxies.
Internal routing rules
The Collector polls and receives data from event sources, so you should provide the directory or file location where the collector can access the server logs for collecting log data. You can specify a local folder path or a Windows Universal Naming Convention (UNC) path to a hosted network drive.
Ports
A Collector requires each stream of syslog logs to be sent to it on a unique TCP or UDP port.
You will need to configure each device that will send logs using syslog to send the logs over a TCP or UDP port that is unique on that collector. It is common to start sending the logs using port 10000, although you may use any open unique port.
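One way to keep syslog ports unique on a collector is to assign them from a plan rather than by hand. The sketch below is illustrative only: the function names and device names are hypothetical, and the `reserved` set is an assumed way to skip ports that are already in use on the host.

```python
def assign_syslog_ports(devices, start_port=10000, reserved=frozenset()):
    """Assign each device its own port, counting up from start_port."""
    assignments = {}
    port = start_port
    for device in devices:
        while port in reserved:
            port += 1  # skip ports that are already taken
        if port > 65535:
            raise ValueError("ran out of valid port numbers")
        assignments[device] = port
        port += 1
    return assignments


def find_port_conflicts(assignments):
    """Return {port: [devices]} for every port used by more than one device."""
    by_port = {}
    for device, port in assignments.items():
        by_port.setdefault(port, []).append(device)
    return {port: names for port, names in by_port.items() if len(names) > 1}
```

`find_port_conflicts` can also be run against an existing, manually maintained mapping to spot two devices that were configured to send to the same port.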
All Collectors must be able to reach out on port 443. Insight Agents and Endpoint Monitors must also be able to communicate back to the Collector on the following ports:
Port Number | Data Gathered |
---|---|
TCP 5508 | Communication back to the Collector from the Insight Agent or Endpoint Monitor. |
TCP 6608 | Upgrade agent data path for the Insight Agent. |
TCP and UDP 8037 | File upload for Insight Agent. |
TCP 20,000 - 30,000 | Communication back to the Collector from the Endpoint Monitor. |
Linux collector requirements
For Linux collectors, you must use ports higher than 1024.
Read our product documentation on ports used by InsightIDR for more information.
IP ranges
Overlapping endpoint monitoring ranges are not allowed. IP addresses or IP ranges defined on Collector A should not be duplicated on Collector B. If duplicates exist, update them before the migration. Otherwise, those ranges will have to be manually updated after the migration.
See IP Addresses for more information.
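Because IP addresses or ranges defined on one Collector should not be duplicated on another, it can help to check a planned layout for overlaps first. The sketch below uses Python's standard `ipaddress` module. The function name and the input format (a mapping of collector name to CIDR strings) are assumptions made for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges_by_collector):
    """Return (collector_a, range_a, collector_b, range_b) for each overlapping pair.

    ranges_by_collector maps a collector name to a list of addresses or CIDR ranges.
    Only ranges that belong to different collectors are compared.
    """
    flat = [
        (name, ipaddress.ip_network(cidr, strict=False))
        for name, cidrs in ranges_by_collector.items()
        for cidr in cidrs
    ]
    overlaps = []
    for (name_a, net_a), (name_b, net_b) in combinations(flat, 2):
        same_family = net_a.version == net_b.version
        if name_a != name_b and same_family and net_a.overlaps(net_b):
            overlaps.append((name_a, str(net_a), name_b, str(net_b)))
    return overlaps
```

A single address such as `10.0.0.5` is treated as a one-address range, so it is flagged if it falls inside a range assigned to another collector.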
Credentials
Each collector can support only one set of endpoint monitoring credentials. Ensure you configure credentials for each collector instance on your network.
Firewall rules
Disable the local firewall on the collector host, if possible. For specific instructions, read our product documentation on firewall rules.
If you cannot disable the local firewall, follow these configurations.
All Collectors must be able to establish outbound connectivity on port 443 to *.endpoint.ingress.rapid7.com and communicate with the domains shown in the Data and Storage (S3) columns of the following table according to your geographic region. For example, for InsightIDR subscribers that elect to store their data in Australia, Collectors must be able to communicate with the following endpoints using port 443:
- *.endpoint.ingress.rapid7.com
- au.data.insight.rapid7.com
- s3-ap-southeast-2.amazonaws.com
Region | Data endpoint | Storage (S3 endpoint) |
---|---|---|
United States - 1 | data.insight.rapid7.com | s3.amazonaws.com |
United States - 2 | us2.data.insight.rapid7.com | s3.us-east-2.amazonaws.com |
United States - 3 | us3.data.insight.rapid7.com | s3.us-west-2.amazonaws.com |
Canada | ca.data.insight.rapid7.com | s3.ca-central-1.amazonaws.com |
Europe | eu.data.insight.rapid7.com | s3.eu-central-1.amazonaws.com |
Japan | ap.data.insight.rapid7.com | s3-ap-northeast-1.amazonaws.com |
Australia | au.data.insight.rapid7.com | s3-ap-southeast-2.amazonaws.com |
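The table above can be turned into a simple lookup for building firewall allow lists, with an optional reachability test. This is a sketch under stated assumptions: the function names are hypothetical, and a successful TCP connection on port 443 only shows that the port is open. It does not prove that TLS traffic is passing without inspection.

```python
import socket

# Data and Storage (S3) endpoints copied from the table above.
REGION_ENDPOINTS = {
    "United States - 1": ("data.insight.rapid7.com", "s3.amazonaws.com"),
    "United States - 2": ("us2.data.insight.rapid7.com", "s3.us-east-2.amazonaws.com"),
    "United States - 3": ("us3.data.insight.rapid7.com", "s3.us-west-2.amazonaws.com"),
    "Canada": ("ca.data.insight.rapid7.com", "s3.ca-central-1.amazonaws.com"),
    "Europe": ("eu.data.insight.rapid7.com", "s3.eu-central-1.amazonaws.com"),
    "Japan": ("ap.data.insight.rapid7.com", "s3-ap-northeast-1.amazonaws.com"),
    "Australia": ("au.data.insight.rapid7.com", "s3-ap-southeast-2.amazonaws.com"),
}


def required_hosts(region):
    """Return the hosts a Collector in this region must reach on port 443."""
    data_host, s3_host = REGION_ENDPOINTS[region]
    return ["*.endpoint.ingress.rapid7.com", data_host, s3_host]


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Wildcard entries cannot be tested directly and should be skipped by the caller.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

From the collector host, calling `can_connect(h)` for each entry of `required_hosts("Australia")` that does not start with `*` gives a quick indication of whether the firewall rules are in place.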
If you intend to deploy token-based Insight Agents through your Collectors, you also need to allow outbound connectivity from each Collector on port 443 to the endpoint that provides the agent's configuration files. Just like the Data and Storage endpoints in the previous table, you can configure your firewall rules to allow your Collectors to connect to a region-specific version of the Deployment endpoint to meet this requirement:
Region | Deployment endpoint |
---|---|
United States - 1 | us.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
United States - 2 | us2.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
United States - 3 | us3.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
Canada | ca.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
Europe | eu.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
Japan | ap.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
Australia | au.deployment.endpoint.ingress.rapid7.com/api/v1/get_agent_files |
Data collection requirements
To plan your Collector deployment, have the following information available for each server or virtual machine where you will install the Collector:
- Display name
- Network location
- Server hostname and IP address
- Administrator rights to install a service on the server
Endpoint data requirements
The collection of endpoint data also uses resources on the Collector. Endpoint data can be collected either by using the Collector to scan a range of endpoints or by installing a Rapid7 Insight Agent on the endpoints. Both methods will use resources on the Collector.
The greater the number of endpoints that the Collector needs to collect data from, the more resources it will need. If the CPU utilization is already consistently hovering at 40% or higher on the Collector, you should consider standing up another Collector at that location or adding more CPUs before adding additional endpoint ranges to scan or agents.
A Collector cannot support more than 600 endpoints or agents per CPU core. For example, a Collector with 4 CPU cores can handle up to 2,400 endpoints or agents, provided that its CPU is not already heavily utilized by the event sources that have been added.
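The per-core limit and the 40% CPU guideline can be combined into a quick capacity check. The sketch below is illustrative. The function names are hypothetical, and treating 40% as a hard cutoff is a simplification of the guidance, which asks you to consider adding capacity at that point.

```python
MAX_ENDPOINTS_PER_CORE = 600  # limit stated above
CPU_WARN_PERCENT = 40         # utilization at which to consider another Collector


def max_endpoints(cpu_cores):
    """Upper bound on endpoints or agents for a Collector with this many cores."""
    return cpu_cores * MAX_ENDPOINTS_PER_CORE


def can_add_endpoints(cpu_cores, current_endpoints, new_endpoints, cpu_percent):
    """Return True if the new endpoints fit under the limit and CPU has headroom."""
    within_limit = current_endpoints + new_endpoints <= max_endpoints(cpu_cores)
    return within_limit and cpu_percent < CPU_WARN_PERCENT
```

With 4 cores, `max_endpoints(4)` is 2,400, so a Collector that already serves 2,000 endpoints at 25% CPU can take 400 more but not 401.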
The number of event sources and the number of endpoints from which you are collecting data determine how much RAM and the number of CPUs the Collector needs. The more event sources and endpoints used for data collection, the more RAM and CPU the Collector will need to operate. The free disk space that the Collector has is used for the spillover of data collection only. Under normal circumstances, the Collector sends all data collected immediately to the cloud for processing.
However, if the collector loses connectivity to the cloud or runs under other abnormal operating conditions, it stores collected data in a spillover folder on its hard drive. The more free disk space you give the collector, the more spillover space it has available. It is often more efficient to deploy multiple collectors throughout the environment rather than break firewall rules or overload a single collector.
Also, when scanning endpoints with a collector, each collector can be configured with only one set of credentials for the endpoint scanning. If different credentials are required for scanning endpoints, then you will need to use a separate collector for each credential that will be used.
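Since each collector supports one set of endpoint scanning credentials, a deployment plan can be derived by grouping endpoint ranges by the credential they need. The sketch below is illustrative. The function name, the range values, and the credential labels are hypothetical.

```python
def plan_collectors_by_credential(range_credentials):
    """Group endpoint ranges by credential.

    range_credentials maps an endpoint range to the label of the credential
    needed to scan it. Each key in the result needs its own collector.
    """
    plan = {}
    for ip_range, credential in range_credentials.items():
        plan.setdefault(credential, []).append(ip_range)
    return {credential: sorted(ranges) for credential, ranges in plan.items()}
```

The number of keys in the returned mapping is the minimum number of collectors needed for endpoint scanning, before any sizing limits are applied.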
Collector placement and sizing
When considering where to place your Collectors, keep in mind that your bandwidth and network architecture will influence the number of Collectors that you need in your organization and where you should place them. Generally, you should deploy the Collectors close to the logs that will be pulled or sent and close to the endpoints that they will be scanning.
Rapid7 recommends a maximum of 80 event sources for each Collector, depending on the following:
- The size of the event sources being added
- The amount of CPU and memory available to the Collector
- The amount of VM resources available to the Collector
- The amount of disk space available to the Collector
Event source distribution for collectors
The capacity of a collector depends on multiple factors. While the recommended maximum is 80 event sources for each collector, keeping each collector to 50-60 event sources helps prevent data collection issues.
Distributing event sources over multiple collectors is always a good practice.
Collector Location Size | Number of Endpoints/Agents | Number of Event Sources on the Collector | Recommended Minimum CPU | Recommended Minimum RAM | Recommended Minimum Disk Space |
---|---|---|---|---|---|
Small | Up to 500 | 1 - 10 | 4 | 8 GB | 60 GB |
Medium | Up to 2,400 | 10 - 50 | 4 | 8 GB | 80 GB |
Large | Up to 600 per CPU core | 50 - 80* | 4+ | 16 GB | 100 GB |
*If you have more than 80 event sources, you should split your event sources across multiple collectors.
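The sizing table can be expressed as a lookup function. This is a sketch, not an official sizing tool: the function name is hypothetical, boundary values that appear in two rows (10 and 50 event sources) are assigned to the smaller tier, and the Large tier's core count is derived from the 600-per-core rule with a floor of 4.

```python
import math


def recommend_size(endpoints, event_sources):
    """Return the recommended minimum resources for one collector."""
    if event_sources > 80:
        raise ValueError("more than 80 event sources: split across multiple collectors")
    if endpoints <= 500 and event_sources <= 10:
        return {"size": "Small", "cpu_cores": 4, "ram_gb": 8, "disk_gb": 60}
    if endpoints <= 2400 and event_sources <= 50:
        return {"size": "Medium", "cpu_cores": 4, "ram_gb": 8, "disk_gb": 80}
    # Large: up to 600 endpoints or agents per CPU core, never fewer than 4 cores.
    cores = max(4, math.ceil(endpoints / 600))
    return {"size": "Large", "cpu_cores": cores, "ram_gb": 16, "disk_gb": 100}
```

For example, 3,000 endpoints with 60 event sources falls in the Large tier and needs at least 5 cores under these assumptions.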
High-volume event sources place a higher RAM and CPU load on the collector and will result in the collector handling a lower number of event sources overall. Before adding a chatty event source like a firewall to the collector, check its current resource utilization (under Data Collection > Collectors).
- If the CPU utilization is consistently more than 40%, consider adding another collector to the location to handle some of the event sources.
- If the CPU utilization is consistently above 90%, then you need an additional collector to handle the load.
Important collector limitations
All collectors must be configured with a fully qualified domain name, for example, idrcollector1.myorg.com.
For endpoint scanning, a collector can be configured with only one endpoint scanning credential. Therefore, if you have multiple domains or other requirements for separate credentials that need to be used for scanning different endpoint ranges, you should plan on a separate collector for each domain or set of credentials.
If you want to collect logs from a Checkpoint firewall, you must use a collector running on Windows. You cannot use a Linux Collector to collect Checkpoint firewall logs.
A collector installed on Linux is limited in the number of agents it can support because of default file descriptor settings. On most Linux systems, the default limit is 2000 agents. To increase the number of agents that can connect to a Linux Collector, set the number of file descriptors to twice the number of agents that you want the collector to handle. For more information on file descriptor settings, see https://www.tecmint.com/increase-set-open-file-limits-in-linux
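The file descriptor rule above is simple arithmetic, sketched here with Python's standard `resource` module (available on Linux and other Unix systems only). The function names are hypothetical. Note that `current_soft_limit()` reports the limit of the Python process that runs it, which may differ from the limit applied to the collector service.

```python
import resource  # Unix-only standard library module


def required_file_descriptors(agent_count):
    """File descriptors needed for this many agents: twice the agent count."""
    return 2 * agent_count


def current_soft_limit():
    """Soft open-file limit of the current process (not the collector service)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def limit_is_sufficient(agent_count, soft_limit):
    """Return True if soft_limit covers the descriptors needed for agent_count."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # no limit is set
    return soft_limit >= required_file_descriptors(agent_count)
```

To support 3,000 agents, the limit would need to be at least 6,000, so a soft limit of 4,096 would not be sufficient.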
If you already have Nexpose or InsightVM installed in your organization, do not install the Insight Collector software on an existing Nexpose Console or Nexpose Scan Engine as this will cause issues with your Nexpose systems.