Custom Logs

⚠️

Custom Logs is an advanced feature

The Generic API Custom Logs event source is a powerful capability designed for users with advanced technical knowledge.

Improper use of custom event sources can cause data ingestion failures or exceed storage and processing limits. To avoid these issues, thoroughly validate your configuration before saving.

Like other raw data, custom logs contextualize information throughout SIEM (InsightIDR) and are helpful during log search. Any text-based log can be ingested through SIEM (InsightIDR). However, Rapid7 recommends using JSON or key-value pair (KVP) format to match the way data is presented in Log Search. Unstructured strings create unstructured log entries. If you don’t use a structured format, you can still search using plain text, but you won’t be able to filter or query by specific fields unless you apply custom parsing rules.
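For example, here is the same event rendered as JSON, as key-value pairs, and as unstructured text (the field names are illustrative):

```python
import json

# One event, three formats (hypothetical field names).
event = {"user": "jsmith", "action": "login", "result": "success"}

# JSON: each field becomes queryable in Log Search.
json_line = json.dumps(event)

# Key-value pairs: also parsed into fields.
kvp_line = " ".join(f"{k}={v}" for k, v in event.items())

# Unstructured text: searchable as plain text only.
unstructured = "jsmith logged in successfully"

print(json_line)   # {"user": "jsmith", "action": "login", "result": "success"}
print(kvp_line)    # user=jsmith action=login result=success
```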

ℹ️

Collecting logs prior to event source setup

After you have turned on an event source, Rapid7 can collect some logs that were created prior to setup:

  • Cloud event source - custom logs created five minutes prior to setup
  • Collector event source - custom logs created 24 hours prior to setup

If an event source is paused, logs will only be collected for a maximum of 72 hours. For example, if the event source was paused for 24 hours and then turned on again, Rapid7 will collect the logs created 24 hours prior to unpausing the event source.

Configure SIEM (InsightIDR) to collect data from the event source

After you complete the prerequisite steps and configure the event source to send data, you must add the event source in SIEM (InsightIDR).

To configure the new event source in SIEM (InsightIDR):

  1. From the left menu, go to Data Collection and click Setup Event Source > Add Event Source.
  2. Click Add Raw Data > Custom Logs.
  • Alternatively, you can search for Custom Logs or filter by the Rapid7 Product Type, and then select the Rapid7 Custom Logs event source tile.
  3. Follow the instructions below for the collection method of your choice:

Cloud

Amazon S3

You can configure SIEM (InsightIDR) to read logs that are stored in an Amazon S3 bucket. The Amazon S3 collection method is best for low-volume logs; avoid using it with high-throughput sources such as AWS CloudTrail or VPC Flow Logs.

For information about setting up an Amazon S3 bucket, read the AWS documentation on creating S3 buckets.

Maximum file size

This collection method supports gzip-compressed files up to 10MB. Files larger than 10MB will not be processed. Individual log lines must not exceed 1.4MB; if a single log line exceeds this limit, the integration stops and returns an error message.
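Before uploading, you may want to pre-check files against these limits. A minimal sketch (the helper name is ours, not part of the product):

```python
import gzip
import os

MAX_FILE_BYTES = 10 * 1024 * 1024        # 10MB file limit
MAX_LINE_BYTES = int(1.4 * 1024 * 1024)  # 1.4MB per-log-line limit

def check_log_file(path: str) -> list:
    """Return a list of problems that would prevent ingestion (hypothetical helper)."""
    problems = []
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file exceeds 10MB and will not be processed")
    # Read gzip-compressed files transparently; others as plain binary.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        for lineno, line in enumerate(f, start=1):
            if len(line) > MAX_LINE_BYTES:
                problems.append(f"line {lineno} exceeds 1.4MB")
    return problems
```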

Amazon S3 requirements

To allow SIEM (InsightIDR) to receive data, you must configure the relevant Amazon account to provide access to its data:

  • The account must have an Access Key ID and Secret Access Key. View the third-party documentation for instructions.
  • The account must have the following IAM permissions:
    • s3:ListBucket
    • s3:GetObject
ℹ️

Access Keys in AWS

In AWS, access keys consist of two parts: an access key ID (for example, AKIAIOSFODNN7EXAMPLE) and a secret access key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). You must use both the access key ID and secret access key together to authenticate your requests.

Amazon S3 log formatting

The Amazon S3 option supports reading files stored in an S3 bucket that contain newline-delimited plain text (.txt, .log), CSV (.csv), and JSON (.json). This option also supports reading supported file types when they have been compressed using gzip (for example: filename.json.gz or filename.csv.gz). In either case, the files should have new line characters separating each log entry or event.
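For example, a sketch of producing a newline-delimited, gzip-compressed JSON file in the supported format (the filename and event contents are illustrative):

```python
import gzip
import json

events = [{"log": "event 1"}, {"log": "event 2"}]

# Write newline-delimited JSON, gzip-compressed. Using a .json.gz extension
# makes the compression method detectable from the filename.
with gzip.open("events.json.gz", "wt", encoding="utf-8") as f:
    for event in events:
        f.write(json.dumps(event) + "\n")
```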

⚠️

CSV formatting

If your logs are contained in a .csv file, it is recommended to remove the header row prior to sending the logs to SIEM (InsightIDR).

If the files contain log events spread across multiple lines, each line will be interpreted as a separate event. Amazon S3 does not support reading files with encrypted contents, binary data, or files that have been compressed using a method other than gzip.

⚠️

gzip formatting

If you are using the gzip format, the file metadata must contain the term gzip, or alternatively the file extension must be set to .gz.

Before configuring Amazon S3, ensure the data you’re ingesting is not from a high-volume source, such as AWS CloudTrail or VPC Flow Logs. Ingesting high-volume data may cause you to exceed your log ingestion limits.

To configure Amazon S3:

  1. Name your event source.
  2. Amazon S3 will be selected as the collection method.
  3. Enter the name of the Amazon S3 bucket that you created, without the s3:// prefix. For example, if your bucket is s3://your.bucket.url, enter only your.bucket.url.
  4. Optionally, enter an Amazon S3 Key Prefix. A key prefix allows you to specify the folder the logs are stored in. For example, if your logs are stored in a folder named abcd1234, enter abcd1234/. If the logs are not stored in a folder, leave this field empty.
  5. Select an Amazon S3 Bucket from the dropdown, or click Add a New Connection to create one.
  6. In the Create a Cloud Connection screen, enter a name for the new connection.
  7. In the Bucket field, enter the name of the bucket you want to check for modified files.
  8. In the Region field, enter the region of the Amazon S3 bucket. For the precise format of this value, review the table of supported regions.
  9. In the AWS Access Key ID field, add a new credential:
    a. Name your credential.
    b. Describe your credential.
    c. Select the credential type.
    d. Enter the Access Key you obtained previously.
    e. Specify the product access for this credential.
  10. In the AWS Secret Access Key field, add a new credential:
    a. Name your credential.
    b. Describe your credential.
    c. Select the credential type.
    d. Enter the Secret Key you obtained previously.
    e. Specify the product access for this credential.
  11. Click Save Connection.
  12. Click Save.

Generic API

⚠️

Generic API log formatting

The Generic API event source supports only JSON (.json) responses. Each response must not exceed 7,500 log lines. To avoid errors, ensure the supplied query parameters keep the results within this limit. If a response exceeds the size limit, the integration will stop and return an error message.

You can configure SIEM (InsightIDR) to collect data from any third party using the Generic API event source. This option allows you to ingest custom or unsupported data sources, making it easier to extract and operationalize the data that matters most to your organization.

Generic API requirements

To allow SIEM (InsightIDR) to receive data from a third-party API, you must:

  • Provide a {start_time} in your query parameters using a gte (greater than or equal) filter to define the time range.
  • Select the corresponding time format for your event logs.
  • Specify the API field that contains the timestamp used in your query parameters.
⚠️

Generic API log collection limits

If a Generic API event source is paused, log collection resumes from 45 minutes before it was reactivated. Any logs generated before that 45-minute window will not be collected.

To configure Generic API:

  1. In the Add Event Source panel, choose to collect using Cloud.

  2. Name your event source. This will become the name of the log that contains the event data in Log Search.

  3. In the Collection Method field, select Generic API.

  4. Click +Create Connection.

    a. In the Create Cloud Connection panel, enter a name for the new connection.

    b. In the Data URL field, enter the full URL of the third party responsible for sending data to SIEM (InsightIDR). For example: https://api.thirdpartyvendor.com/api/v1/datafeed/.

    c. In the Authentication Method field, select from Basic Authentication, Authorization Header, and OAuth 2.0.

    d. Fill in the remaining fields as necessary to allow data collection for your Generic API event source. Refer to third-party documentation as needed.

ℹ️

Authentication method requirements

The required credential fields depend on the authentication method selected in Step 4c and the API you’re using to collect data. When you reference keys such as {credential_1} or {client_id} in applicable configuration fields, Rapid7 replaces them at runtime with the corresponding credential values.

Each method requires a specific set of fields:

• Authorization Header
Requires: Authorization Format, Credential
Example: {'Authorization': 'SSWS {credential_1}'}

• Basic Authentication
Requires: Identifier, Secret
Example: jdoe and myp@ssw0rd

• OAuth 2.0
Requires: OAuth 2.0 Token URL, Client ID, Client Secret, OAuth 2.0 Payload, Token Key
Microsoft Security Example: {'grant_type': 'client_credentials', 'client_id': '{client_id}', 'client_secret': '{client_secret}', 'scope': 'https://graph.microsoft.com/.default'}
Workday Example: {'refresh_token': '{client_id}', 'grant_type': 'refresh_token', 'client_id': '123-456', 'client_secret': '{client_secret}'}

Ensure that the values you enter align with your API’s authentication requirements.
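As an illustration, the placeholder substitution described above can be sketched in Python. The credential values and template here are hypothetical, and the real substitution happens inside SIEM (InsightIDR):

```python
# Illustrative sketch of placeholder substitution; SIEM (InsightIDR)
# performs the real substitution at runtime. Values are not real credentials.
credentials = {
    "client_id": "123-456",
    "client_secret": "s3cr3t",
}

payload_template = (
    "{'grant_type': 'client_credentials', "
    "'client_id': '{client_id}', 'client_secret': '{client_secret}', "
    "'scope': 'https://graph.microsoft.com/.default'}"
)

def substitute(template: str, values: dict) -> str:
    """Replace each {key} placeholder with its credential value."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template

resolved = substitute(payload_template, credentials)
```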

    e. Click Save.

  5. In the Time Format field, select the format that matches the timestamp returned in the API response. Valid formats include ISO (for example, 2024-07-08T14:23:00Z) and EPOCH (for example, 1748874600).

  6. In the Time Field field, enter the name of the timestamp key in the API response.

    For example: if your API response includes "eventTime": "2024-07-08T14:23:00Z", enter eventTime.

  7. In the Query Parameters field, enter parameters for how the API should be queried. Use standard query string syntax supported by your API.

    SIEM (InsightIDR) dynamically updates the values provided in the request. For example, the query string $filter=createdDateTime+gt+{start_time}+and+createdDateTime+le+{end_time}&$top={limit}&$skip={offset} will be transformed into $filter=createdDateTime+gt+2025-06-01T00:00:00Z+and+createdDateTime+le+2025-06-01T00:05:00Z&$top=100&$skip=200.

  8. In the Max Pagination Offset field, set the upper limit for the offset value SIEM (InsightIDR) can use in paginated API requests. This value must not exceed the maximum supported by the API’s pagination limits.

    Note: Set this value according to your API’s maximum supported offset (for example, 9999) used with the Query Parameters field (for example, {'since': '{start_time}', 'until': '{end_time}', 'limit': '{limit}', 'offset': '{offset}'}). After the limit is reached, pagination stops, even if more data is available.

  9. In the Query Parameters Location field, define where query parameters should be added in the request. Most APIs expect query parameters in the request URL (for example, ?limit=100). Only change this if your API requires parameters in a different location, such as the request body.

  10. Optionally, in the Response Keys field, specify each key you want SIEM (InsightIDR) to parse from the response data.

  11. Click Save.
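The query substitution and pagination behavior described above can be sketched as follows. The template, page size, and offset limit are illustrative, and the real substitution is performed by SIEM (InsightIDR):

```python
# Sketch of query-parameter substitution and pagination (illustrative values).
QUERY_TEMPLATE = "$filter=createdDateTime+gt+{start_time}&$top={limit}&$skip={offset}"
MAX_PAGINATION_OFFSET = 9999   # the Max Pagination Offset field
PAGE_SIZE = 100

def build_queries(start_time: str):
    """Yield one resolved query string per page until the offset limit is hit."""
    offset = 0
    while offset <= MAX_PAGINATION_OFFSET:
        yield QUERY_TEMPLATE.format(start_time=start_time, limit=PAGE_SIZE, offset=offset)
        offset += PAGE_SIZE   # pagination stops here even if more data exists

queries = list(build_queries("2025-06-01T00:00:00Z"))
# First page: $filter=createdDateTime+gt+2025-06-01T00:00:00Z&$top=100&$skip=0
```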

Webhook

SIEM (InsightIDR) allows data collection from products and systems that can send events through webhook requests (HTTP POST body method). A webhook event source can have one or more URLs associated with it. These unique URLs are used by third-party products as the destination of the webhook requests.

⚠️

Do not share unique URLs

The URLs associated with webhook event sources are unique and should be protected to prevent unauthorized users from sending data to the event source.

Webhook requirements

SIEM (InsightIDR) supports the following data formats for webhook event sources:

ℹ️

Webhook data formatting

The type of data sent should be expressed by the Content-Type header used by the requests. If no header is included in the request, the system will attempt to determine whether the contents are JSON or plain text.

  • Plain text - Webhook requests containing plain text are processed with new line characters separating individual event lines.
  • JSON - Webhook requests containing JSON data are supported natively. If the contents are an array of events, the elements of the array are treated as individual events. Otherwise, the entire JSON request body will be treated as a single event.
  • For example, this JSON data will be interpreted as two events automatically:
[ { "log": "event 1"}, { "log": "event 2"} ]
  • If the events are contained within a field in the JSON object, the JSON Events Key field can be configured to indicate which field contains the events. In this example, the JSON Events Key is set to resources to identify that the contents should be read from that field within the JSON. Periods can be used to identify a nested field, for example path.to.events.
{ "type": "AuditLogEntry", "size": 2, "resources": [ { "log": "event 1" }, { "log": "event 2" } ] }
  • NDJSON - Webhook requests containing newline-delimited JSON have multiple JSON objects separated by new line characters. Each line is treated as an individual event. The JSON Events Key may be specified if needed, as described in the JSON example.
  • URL encoded form values - Webhook requests containing form values translate the form data’s key-value pairs into a JSON representation.
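The JSON splitting and JSON Events Key lookup described above, including dotted paths for nested fields, can be sketched as follows (a sketch of the behavior, not the product's implementation):

```python
import json

def extract_events(body: str, events_key=None):
    """Split a webhook JSON body into individual events.

    If events_key is given (e.g. "resources" or "path.to.events"), events are
    read from that field; otherwise an array is split into elements and any
    other JSON value is treated as a single event.
    """
    data = json.loads(body)
    if events_key:
        for part in events_key.split("."):   # periods identify nested fields
            data = data[part]
    return data if isinstance(data, list) else [data]

body = '{"type": "AuditLogEntry", "size": 2, "resources": [{"log": "event 1"}, {"log": "event 2"}]}'
events = extract_events(body, events_key="resources")   # two events
```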

To configure webhook collection:

  1. Name your event source.
  2. Click Copy to copy the Webhook URL for use in the product you wish to configure to send events to SIEM (InsightIDR).
  • Click Generate a new Webhook URL if you want to add another URL. This can also be used if the existing URL needs to be replaced.
  3. Optionally, configure the JSON Events Key if needed.
  4. Click Save.

Test the configuration

You can test whether logs can be sent to the event source using a curl command. The following examples work on Windows, Mac, and Linux operating systems. Ensure you replace the placeholder with your new Webhook URL:

curl --verbose <your-webhook-URL> --header "Content-type:application/json" --data "{\"message\":\"Something else happened\",\"user\":\"jsmith\",\"hostname\":\"server1\"}"

curl --verbose <your-webhook-URL> --header "Content-type:application/text" --data "Raw text message"

Collector

Listen on Network Port

You can configure your application to forward log events to a syslog server, and then configure the SIEM (InsightIDR) Collector to listen for that syslog data on a unique network port.

To configure Listen on Network Port:

  1. Name your event source.
  2. Choose your collector from the dropdown list.
  3. Choose the timezone that matches the location of your event source logs.
  4. Optionally, select Parse RFC 3164 syslog headers to parse logs with syslog headers and format them in JSON. Do not select this option if you want to process those logs as unstructured raw data.
  5. Under Collection Method, select Listen on Network Port.
  6. Follow the instructions to configure the Listen on Network Port collection method for your event source.
  • Optionally, if you are using the TCP protocol, you can encrypt the event source by downloading the Rapid7 Certificate.
  7. Click Save.

Log Aggregator

If you want to collect logs that have already been collected by a SIEM or a Log Aggregator, you can send raw logs to the Collector using a unique port.

To configure Log Aggregator:

  1. Name your event source.
  2. Choose your collector from the dropdown list.
  3. Choose the timezone that matches the location of your event source logs.
  4. Optionally, select Parse RFC 3164 syslog headers to parse logs with syslog headers and format them in JSON. Do not select this option if you want to process those logs as unstructured raw data.
  5. Under Collection Method, select Log Aggregator.
  6. Follow the instructions to configure the Log Aggregator collection method for your event source.
  • Optionally, if you are using the TCP protocol, you can encrypt the event source by downloading the Rapid7 Certificate.
  7. Click Save.

SQS Messages

Amazon Simple Queue Service (AWS SQS) is a managed queuing service that works with SIEM (InsightIDR) by sending messages as events.

To configure SQS Messages:

  1. Name your event source.
  2. Choose your collector from the dropdown list.
  3. Choose the timezone that matches the location of your event source logs.
  4. Optionally, select Parse RFC 3164 syslog headers to parse logs with syslog headers and format them in JSON. Do not select this option if you want to process those logs as unstructured raw data.
  5. Under Collection Method, select SQS Messages.
  6. Follow the instructions to configure the SQS Messages collection method for your event source.
  7. Click Save.

Watch Directory

You can use Watch Directory to monitor a network location that hosts log files copied from a specified directory on a local or remote host.

To configure Watch Directory:

  1. Name your event source.
  2. Choose your collector from the dropdown list.
  3. Choose the timezone that matches the location of your event source logs.
  4. Optionally, select Parse RFC 3164 syslog headers to parse logs with syslog headers and format them in JSON. Do not select this option if you want to process those logs as unstructured raw data.
  5. Under Collection Method, select Watch Directory.
  • Optionally, if you are using the TCP protocol, you can encrypt the event source by downloading the Rapid7 Certificate.
  6. Follow the instructions to configure the Watch Directory collection method for your event source.
  7. Click Save.

Tail File

You can configure SIEM (InsightIDR) to watch the network location where a host stores log data, and ingest any new data added to the log file on a local or remote host. Using the equivalent of the Unix tail command, SIEM (InsightIDR) will collect data written to the host disk every 20 seconds.
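The tail-style collection described above can be sketched as tracking a byte offset and reading whatever was appended since the last poll. This is an illustrative sketch, not the Collector's actual implementation:

```python
def tail_new_lines(path, offset):
    """Read lines appended to the file since the last poll.

    Returns (new_lines, new_offset); a sketch of tail-style collection.
    """
    with open(path, "rb") as f:
        f.seek(offset)          # skip data already collected
        data = f.read()
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(data)

# Poll loop sketch (the Collector polls every 20 seconds):
# offset = 0
# while True:
#     new_lines, offset = tail_new_lines("/var/log/app.log", offset)
#     ... forward new_lines ...
#     time.sleep(20)
```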

To configure Tail File:

  1. Name your event source.
  2. Choose your collector from the dropdown list.
  3. Choose the timezone that matches the location of your event source logs.
  4. Optionally, select Parse RFC 3164 syslog headers to parse logs with syslog headers and format them in JSON. Do not select this option if you want to process those logs as unstructured raw data.
  5. Under Collection Method, select Tail File.
  6. Follow the instructions to configure the Tail File collection method for your event source.
  7. Click Save.

Amazon S3

You can configure SIEM (InsightIDR) to read logs that are stored in an Amazon S3 bucket. For information about setting up an Amazon S3 bucket, visit the third-party vendor’s documentation.

To configure Amazon S3:

  1. Name your event source.
  2. Choose your collector from the dropdown list.
  3. Choose the timezone that matches the location of your event source logs.
  4. Optionally, select Parse RFC 3164 syslog headers to parse logs with syslog headers and format them in JSON. Do not select this option if you want to process those logs as unstructured raw data.
  5. Under Collection Method, select Amazon S3.
  6. Follow the instructions to configure the Amazon S3 collection method for your event source.
  7. Click Save.