Data Archiving

In SIEM (InsightIDR), the standard subscription package stores your log data for a retention period of 13 months. If you need to retain data for longer than that period, such as for security investigation or compliance purposes, we recommend that you set up daily archiving.

Archiving allows you to retain a copy of your log data using the storage capabilities of Amazon S3.

After you configure this feature, all new data in SIEM (InsightIDR) is sent to the Amazon S3 bucket that you have specified.

Amazon S3 archiving is the only data archiving method that SIEM (InsightIDR) supports. There are two types of archiving available:

Daily Archiving
Historical Archiving

To learn more about data collection, storage, and retention, read our Data Storage and Retention FAQs.

⚠️

Once data has been archived, it cannot be re-ingested.

After log data has been sent to an Amazon S3 bucket, it cannot be re-ingested into Log Search. Make sure you are ready to archive your log data before you start.

Daily Archiving

Once a day, SIEM (InsightIDR) sends all of the data from the previous day to an Amazon S3 bucket in a compressed file format: one BZIP or GZIP file is sent for each of the logs in SIEM (InsightIDR).

The data in the file has the same format as the data in the Log Search interface, which means that it’s parsed and attributed where applicable.

Because the Ingestion Time field is based on when log entries are ingested into Log Search, rather than when the events occurred in your environment, this field isn’t archived. Instead, the field that contains the event time of each log event, usually the timestamp field, is archived and can be referenced as the time when the event occurred.
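As a rough illustration of working with an archived file after you download it from your bucket, the following Python sketch opens a compressed file and iterates over its lines. The `.gz` and `.bz2` suffixes, the UTF-8 encoding, and the one-entry-per-line layout are assumptions made for this example; check a real archive file before relying on them.

```python
import bz2
import gzip
from pathlib import Path
from typing import Iterator


def read_archived_entries(path: str) -> Iterator[str]:
    """Yield non-empty lines from a downloaded archive file.

    Assumptions (not confirmed by this guide): files end in .gz or .bz2,
    are UTF-8 text, and hold one log entry per line.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        opener = gzip.open
    elif suffix == ".bz2":
        opener = bz2.open
    else:
        raise ValueError(f"Unsupported archive suffix: {suffix!r}")

    # "rt" mode decompresses and decodes to text while streaming,
    # so large archives are not loaded into memory at once.
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line
```

Because the function is a generator, you can loop over a large archive and stop early, for example after you find the first entry whose timestamp field falls in the period you are investigating.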

To set up daily archiving:

Configure an Amazon S3 bucket
Enable Encryption on the Amazon S3 bucket (Optional)
Configure the Archive in SIEM (InsightIDR)

After you complete the configuration steps in this section, you can also perform historical archiving.

Configure an Amazon S3 Bucket

In your AWS account, create or identify the S3 bucket you want to use to store SIEM (InsightIDR) data.

To create an S3 bucket, follow the instructions here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html

ℹ️

Selecting a recommended region

If you are creating a new Amazon S3 bucket, we strongly recommend that you create it in the same region as your SIEM (InsightIDR) account so that data is archived faster. This applies to both daily and historical archiving. For the list of recommended and supported regions, see Supported Regions.

To grant SIEM (InsightIDR) access to the bucket:

Log in to the AWS Console.
From the top left menu, select Services > Storage > S3.
Do one of the following:
- To create a new bucket, click Create bucket.
- To use an existing bucket, select All Buckets and find the bucket you want to use.
Go to the Permissions tab and click Bucket Policy.
Click Edit.

Replace <bucket_name> in the following policy with the name of your S3 bucket, then paste the policy into the editor:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::029723416200:role/learchiving"
      },
      "Action": [
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::<bucket_name>/*"
    }
  ]
}

Click Save changes.
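If you prefer to generate the policy text instead of editing the placeholder by hand, a minimal Python sketch such as the following can fill in the bucket name. The function name and the example bucket name are hypothetical; the role ARN and actions are copied from the policy above.

```python
import json

# Role ARN copied from the bucket policy in this guide.
ARCHIVING_ROLE_ARN = "arn:aws:iam::029723416200:role/learchiving"


def build_bucket_policy(bucket_name: str) -> str:
    """Return the bucket policy from this guide as a JSON string."""
    # Guard against pasting the unedited <bucket_name> placeholder.
    if not bucket_name or "<" in bucket_name or ">" in bucket_name:
        raise ValueError("Provide a real bucket name, not the placeholder.")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ARCHIVING_ROLE_ARN},
                "Action": ["s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# "example-archive-bucket" is a hypothetical bucket name.
print(build_bucket_policy("example-archive-bucket"))
```

You can paste the printed JSON into the bucket policy editor in place of the hand-edited version.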

ℹ️

Bucket naming rules

Amazon provides some guidance on how to name your bucket here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html

Enable Encryption on the Amazon S3 bucket (Optional)

Complete this step only if you want to enable server-side KMS encryption on your bucket.

To enable encryption, follow these steps:

Create a new KMS key:
- Log in to AWS.
- Select Services > Security, Identity, & Compliance > Key Management Service > Customer-managed keys > Create a key.
- Under Key type, select Symmetric.
- Under Key usage, select Encrypt and decrypt.
- Under Advanced options, select KMS as the key material origin and Single-Region key as the regionality, then click Next.
- On the Add labels screen, specify an alias and add any tags you want to attach to your KMS key, then click Next.
- On the Define key administrative permissions screen, specify the users or roles who are allowed to administer this key through the KMS API.
- Select or deselect the Key deletion checkbox and click Next.
- On the next screen, optionally define key usage permissions, then click Next.
- On the Review screen, in the Key policy window, add a new entry to the existing Statement array by copying and pasting the following code block, separating it from the preceding entry with a comma, then click Finish:


{
  "Sid": "Allow use of the key by Rapid7",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::029723416200:role/learchiving"
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey"
  ],
  "Resource": "*"
}

Select Services > Storage > S3.
Select the S3 bucket you are using as the archive and go to the Properties tab.
Under Default encryption, click Edit.
Assign the newly created KMS key to your bucket:
- Under Server-side encryption, click Enable.
- Select AWS Key Management Service key (SSE-KMS).
- Select Choose from your AWS KMS keys and select the key you created from the dropdown list.
- Under Bucket Key, click Enable.
- Save your changes.
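The key policy edit in the steps above amounts to appending one entry to the policy's Statement array. The following Python sketch shows that merge on a policy held as a dictionary. The helper name, the sample existing policy, and the placeholder account ID 111122223333 are assumptions for illustration; the Rapid7 statement itself is copied from this guide.

```python
import copy
import json

# Statement copied from the key policy block in this guide.
RAPID7_KEY_STATEMENT = {
    "Sid": "Allow use of the key by Rapid7",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::029723416200:role/learchiving"},
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey",
    ],
    "Resource": "*",
}


def add_rapid7_statement(key_policy: dict) -> dict:
    """Return a copy of key_policy with the Rapid7 statement appended once."""
    updated = copy.deepcopy(key_policy)
    statements = updated.setdefault("Statement", [])
    # Skip the append if a statement with the same Sid is already present.
    if not any(s.get("Sid") == RAPID7_KEY_STATEMENT["Sid"] for s in statements):
        statements.append(copy.deepcopy(RAPID7_KEY_STATEMENT))
    return updated


# Hypothetical existing key policy; 111122223333 is a placeholder account ID.
existing_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "kms:*",
            "Resource": "*",
        }
    ],
}

print(json.dumps(add_rapid7_statement(existing_policy), indent=2))
```

The helper leaves the input untouched and does nothing on a second call, so it is safe to run against a policy that already contains the statement.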

Configure the Archive in SIEM (InsightIDR)

After you configure the Amazon S3 bucket to receive data from SIEM (InsightIDR), you must configure SIEM (InsightIDR) to archive data to the bucket.

These factors can impact the time it takes to export the data:

- Size of the export: The wider the time range you select, the longer the export takes to process. For example, data that has been imported into SIEM (InsightIDR) over the course of a year will likely take several days to archive.
- Region of the Amazon S3 bucket: Data is archived faster when the bucket is in the same region as your SIEM (InsightIDR) account. This applies to both daily and historical archiving. For the list of recommended and supported regions, see Supported Regions.

ℹ️

Tip for saving storage costs

We recommend that you create an Amazon S3 lifecycle rule for both daily and historical archiving. The rule identifies and deletes incomplete multipart uploads so that you don’t pay to store them. This blog post from Amazon explains how to create one: https://aws.amazon.com/blogs/aws-cloud-financial-management/discovering-and-deleting-incomplete-multipart-uploads-to-lower-amazon-s3-costs/
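For orientation, a lifecycle rule of this kind can be expressed as a small configuration document. The Python sketch below builds one in the shape that the S3 lifecycle configuration API accepts. The rule ID, the helper name, and the 7-day threshold are assumptions chosen for this example, not values recommended by this guide.

```python
import json


def build_abort_incomplete_uploads_rule(days: int = 7) -> dict:
    """Build a lifecycle configuration that aborts stale multipart uploads.

    The 7-day default is an illustrative assumption; pick a value that
    comfortably exceeds how long your largest archive upload takes.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",  # hypothetical rule name
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": ""},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


print(json.dumps(build_abort_incomplete_uploads_rule(), indent=2))
```

The printed JSON can serve as a starting point for the lifecycle configuration you apply through the AWS Console or your usual AWS tooling.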

To configure the archive:

Log in to SIEM (InsightIDR).
Select Settings on the left menu.
Select Log Management > Data Archiving.
To set up daily archiving, switch the toggle to Activate daily archiving.
Enter the name of the Amazon S3 bucket where you want to archive the log data.
Enter your AWS account ID.
Select the compression type.
Click Save and Test. If SIEM (InsightIDR) can establish a connection, a success banner is displayed. If a connection cannot be established, contact Rapid7 Support and we can add the bucket to the account for you.

Historical Archiving

Historical archiving is useful when you haven’t set up daily archiving and you want to archive all data at once, for example, if you are ending your SIEM (InsightIDR) subscription.

Before you begin

Before you can perform historical archiving, you must complete these prerequisite steps:

Configure an Amazon S3 bucket
Configure the Archive in SIEM (InsightIDR)

Note: You don’t need to activate daily archiving.

⚠️

Historical archiving limitations

The process of archiving large amounts of data can take several days to complete. For this reason, you can run historical archiving only twice a year. The date range for each archive must be 13 months or less.

To perform historical archiving, follow these steps:

Select Settings on the left menu.
Select Log Management > Data Archiving.
Under Historical Archiving, click Archive Now.
Enter a name for the archived data.
Specify the date range of the data.
Click Start Archiving.
You can check the progress of the archiving process on the Data Archiving tab.
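Before you use one of your two yearly historical archives, it can help to confirm that the date range you plan to enter fits the 13-month limit. The following Python sketch is one way to check this. It assumes that the limit is counted in calendar months from the start date; how SIEM (InsightIDR) measures the range internally is not stated in this guide, and the function names are hypothetical.

```python
import calendar
from datetime import date

MAX_ARCHIVE_MONTHS = 13  # limit stated in this guide


def add_months(day: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day."""
    index = day.year * 12 + (day.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    # Clamp to the last day of the target month (for example, Jan 31 -> Feb 29).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def is_valid_archive_range(start: date, end: date) -> bool:
    """Return True if the range is ordered and spans 13 calendar months or less."""
    if end < start:
        return False
    return end <= add_months(start, MAX_ARCHIVE_MONTHS)


print(is_valid_archive_range(date(2023, 1, 1), date(2024, 2, 1)))
print(is_valid_archive_range(date(2023, 1, 1), date(2024, 2, 2)))
```

The first call covers exactly 13 calendar months and passes the check; the second call is one day longer and fails it.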

Long-Term Storage

If you are archiving your data strictly for compliance or auditing purposes and you don’t anticipate requiring regular access, you should consider setting up Amazon S3 Lifecycle Policies that move your data from Amazon S3 to a more cost-effective, long-term storage option, such as S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive.

Keep in mind that retrieving data stored in S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive involves a delay, and that making the data accessible has a cost.

For more information on Amazon S3 Lifecycle Policies, visit: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html
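As an illustration of what such a lifecycle policy looks like, the Python sketch below builds a transition rule in the shape that the S3 lifecycle configuration API accepts. The rule ID, the helper name, and the 90-day value in the example call are assumptions; `GLACIER` and `DEEP_ARCHIVE` are the S3 storage class identifiers for S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive.

```python
import json


def build_glacier_transition_rule(
    days_before_transition: int,
    deep_archive: bool = False,
    prefix: str = "",
) -> dict:
    """Build a lifecycle configuration that moves objects to a Glacier class."""
    if days_before_transition < 0:
        raise ValueError("days_before_transition cannot be negative")
    # GLACIER is Flexible Retrieval; DEEP_ARCHIVE is Glacier Deep Archive.
    storage_class = "DEEP_ARCHIVE" if deep_archive else "GLACIER"
    return {
        "Rules": [
            {
                "ID": "archive-to-long-term-storage",  # hypothetical rule name
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_before_transition, "StorageClass": storage_class}
                ],
            }
        ]
    }


# 90 days is an arbitrary example value, not a recommendation.
print(json.dumps(build_glacier_transition_rule(90), indent=2))
```

If the bucket already has lifecycle rules, include them in the same Rules list, because applying a new lifecycle configuration replaces the existing one.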

Supported Regions

Here are the recommended regions for archiving with Amazon S3:

Amazon S3 Region	URL
US_EAST_1 (preferred region for data stored in the United States)	https://s3.us-east-1.amazonaws.com
CA_CENTRAL (preferred region for data stored in Canada)	https://s3.ca-central-1.amazonaws.com
EU_IRELAND (preferred region for data stored in the EU)	https://s3.eu-west-1.amazonaws.com
AP_SYDNEY (preferred region for data stored in Australia)	https://s3.ap-southeast-2.amazonaws.com
AP_TOKYO (preferred region for data stored in Japan)	https://s3.ap-northeast-1.amazonaws.com

Here are the other supported regions:

Amazon S3 Region	URL
US_STANDARD	https://s3.amazonaws.com
US_WEST_OREGON	https://s3.us-west-2.amazonaws.com
US_EAST_OHIO	https://s3.us-east-2.amazonaws.com
US_WEST_N_CALIFORNIA	https://s3.us-west-1.amazonaws.com
EU_LONDON	https://s3.eu-west-2.amazonaws.com
EU_PARIS	https://s3.eu-west-3.amazonaws.com
EU_FRANKFURT	https://s3.eu-central-1.amazonaws.com
AP_MUMBAI	https://s3.ap-south-1.amazonaws.com
AP_SEOUL	https://s3.ap-northeast-2.amazonaws.com
AP_SINGAPORE	https://s3.ap-southeast-1.amazonaws.com
SA_SAO_PAULO	https://s3.sa-east-1.amazonaws.com
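If you manage buckets with scripts, a small lookup like the following Python sketch can tell you whether a bucket's region is recommended or only supported. The region codes are taken from the endpoint URLs in the two tables above. US_STANDARD is left out because its URL does not carry a region code. The function name and return labels are hypothetical.

```python
# Region codes read from the endpoint URLs in the tables above.
RECOMMENDED_REGIONS = {
    "us-east-1",
    "ca-central-1",
    "eu-west-1",
    "ap-southeast-2",
    "ap-northeast-1",
}

OTHER_SUPPORTED_REGIONS = {
    "us-west-2",
    "us-east-2",
    "us-west-1",
    "eu-west-2",
    "eu-west-3",
    "eu-central-1",
    "ap-south-1",
    "ap-northeast-2",
    "ap-southeast-1",
    "sa-east-1",
}


def classify_region(region_code: str) -> str:
    """Return 'recommended', 'supported', or 'unsupported' for a region code."""
    code = region_code.strip().lower()
    if code in RECOMMENDED_REGIONS:
        return "recommended"
    if code in OTHER_SUPPORTED_REGIONS:
        return "supported"
    return "unsupported"


for candidate in ("eu-west-1", "us-west-2", "af-south-1"):
    print(candidate, classify_region(candidate))
```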