Data Archiving

In InsightIDR, the standard subscription package stores your log data for a retention period of 13 months. If you need to retain data for longer than that, such as for security investigations or compliance purposes, we recommend that you set up daily archiving.

Archiving allows you to retain a copy of your log data using the storage capabilities of Amazon S3.

After you configure this feature, all new data in InsightIDR is sent to the Amazon S3 bucket that you have specified.

Amazon S3 archiving is the only data archiving method that InsightIDR supports. There are two types of archiving available:

  • Daily Archiving
  • Historical Archiving

To learn more about data collection, storage, and retention, read our Data Storage and Retention FAQs.

Once data has been archived, it cannot be re-ingested.

After log data has been sent to an Amazon S3 bucket, it cannot be re-ingested into Log Search. Ensure you are ready to archive your log data before doing so.

Daily Archiving

Once a day, InsightIDR sends all of the data from the previous day to an Amazon S3 bucket in a compressed file format: one BZIP or GZIP file is sent for each of the logs in InsightIDR.

The data in the file has the same format as the data in the Log Search interface, which means that it's parsed and attributed where applicable.

Because the Ingestion Time field is based on when log entries are ingested into Log Search, rather than when the events occurred in your environment, this field isn’t archived. Instead, the field that contains the log events' event time, usually the timestamp field, is archived and can be referenced as the time when the event occurred.
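As a rough illustration of working with the archived files, the following sketch reads a compressed archive file line by line using only the Python standard library. The file extensions (`.gz`, `.bz2`), the UTF-8 encoding, and the one-entry-per-line layout are assumptions made for this example and are not confirmed by this page, and the function names are hypothetical.

```python
import bz2
import gzip
from pathlib import Path
from typing import Iterator, TextIO


def open_archive(path: Path) -> TextIO:
    """Open a GZIP or BZIP archive file as text, chosen by file extension (assumed)."""
    suffix = path.suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    if suffix == ".bz2":
        return bz2.open(path, "rt", encoding="utf-8")
    raise ValueError(f"unsupported archive extension: {suffix!r}")


def iter_entries(path: Path) -> Iterator[str]:
    """Yield each non-empty line of the archive, assuming one log entry per line."""
    with open_archive(path) as handle:
        for line in handle:
            entry = line.rstrip("\n")
            if entry:
                yield entry
```

Because the Ingestion Time field is not archived, any time-based filtering of these entries would need to rely on the event time field inside each entry.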

To set up daily archiving:

  1. Configure an Amazon S3 bucket
  2. Enable Encryption on the Amazon S3 bucket (Optional)
  3. Configure the Archive in InsightIDR

After you complete the configuration steps in this section, you can also perform historical archiving.

Configure an Amazon S3 Bucket

In your AWS account, create or identify the S3 bucket you want to use to store InsightIDR data.

To create an S3 bucket, follow the instructions here: https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html

If you are creating a new Amazon S3 bucket, we strongly recommend that it's in the same region as your InsightIDR account -- the data will be archived faster. This applies to both daily and historical archiving. Read the list of recommended and supported regions.

To grant InsightIDR access to the bucket:

  1. Log into AWS.
  2. In the top left corner, select Services > Storage > S3.
  3. Either click Create Bucket or select the All Buckets view and search for the existing one you want to use.
  4. On the Permissions tab, click Access Control List.
  5. Click Add Account and paste the following Insight Platform account ID: a9c2e4259cad99e03b67c7450d6cc9c0d4f3243363c80ca73f6b5152ff293bb0
     Note: AWS may shorten the account name to 'archive'.
  6. Grant these permissions for the account:
    • List Objects
    • Write Objects
    • Read bucket permissions
    • Write bucket permissions
  7. Click Save.
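If you prefer to script the grant rather than use the console, the permissions in step 6 correspond to the S3 ACL permission names `READ`, `WRITE`, `READ_ACP`, and `WRITE_ACP`. The sketch below only builds the grant entries; it assumes that the account ID above is an S3 canonical user ID (it has the 64-character hexadecimal form), and the helper names are hypothetical.

```python
from typing import Dict, List

# Account ID from step 5, treated here as an S3 canonical user ID (assumption).
INSIGHT_PLATFORM_ID = "a9c2e4259cad99e03b67c7450d6cc9c0d4f3243363c80ca73f6b5152ff293bb0"

# Console label -> S3 ACL permission name.
CONSOLE_TO_ACL_PERMISSION = {
    "List Objects": "READ",
    "Write Objects": "WRITE",
    "Read bucket permissions": "READ_ACP",
    "Write bucket permissions": "WRITE_ACP",
}


def build_grants(canonical_id: str) -> List[Dict]:
    """Return one ACL grant per permission listed in step 6."""
    return [
        {"Grantee": {"Type": "CanonicalUser", "ID": canonical_id}, "Permission": permission}
        for permission in CONSOLE_TO_ACL_PERMISSION.values()
    ]


def merge_grants(existing: List[Dict], new: List[Dict]) -> List[Dict]:
    """Append grants that are not already present, keeping the existing ones."""
    merged = list(existing)
    for grant in new:
        if grant not in merged:
            merged.append(grant)
    return merged


if __name__ == "__main__":
    for grant in build_grants(INSIGHT_PLATFORM_ID):
        print(grant["Permission"])
```

With boto3, the merged list (together with the bucket owner) could be passed as the `AccessControlPolicy` argument of `put_bucket_acl`. That call replaces the bucket's entire ACL, which is why the sketch merges new grants into the existing ones instead of sending only the new grants.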

Bucket naming rules

Amazon provides some guidance on how to name your bucket here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
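To catch a bad bucket name before you enter it in InsightIDR, you could run a quick local check. The sketch below covers only a subset of Amazon's general-purpose bucket naming rules (length, allowed characters, adjacent periods, IP-address format, and two reserved affixes), so treat the linked AWS page as the authority. The function name is hypothetical.

```python
import ipaddress
import re
from typing import List

# 3-63 characters; lowercase letters, digits, periods, hyphens;
# must begin and end with a letter or digit.
_NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def bucket_name_problems(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means no problem was found."""
    problems = []
    if not _NAME_PATTERN.match(name):
        problems.append("must be 3-63 chars of a-z, 0-9, '.', '-', starting and ending alphanumeric")
    if ".." in name:
        problems.append("must not contain two adjacent periods")
    try:
        ipaddress.IPv4Address(name)
        problems.append("must not be formatted as an IP address")
    except ValueError:
        pass  # Not an IP address, which is what we want.
    if name.startswith("xn--"):
        problems.append("must not start with the prefix 'xn--'")
    if name.endswith("-s3alias"):
        problems.append("must not end with the suffix '-s3alias'")
    return problems


if __name__ == "__main__":
    for candidate in ("insightidr-archive-2024", "Bad_Name", "192.168.5.4"):
        print(candidate, bucket_name_problems(candidate))
```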

Enable Encryption on the Amazon S3 bucket (Optional)

Complete this step only if you want to enable server-side KMS encryption on your bucket.

To enable encryption, follow these steps:

  1. Create a new KMS key:

    • Log into AWS.
    • Select Services > Security, Identity, & Compliance > Key Management Service > Customer-managed keys > Create a key.
    • Under Key type, select Symmetric.
    • Under Key usage, select Encrypt and decrypt.
    • Under Advanced options, select KMS and Single-region key and click Next.
    • On the Add labels screen, specify an alias and add any tags you would like to be attached to your KMS key, then click Next.
    • On the Define key administrative permissions screen, specify the users or roles who are allowed to administer this key through the KMS API.
    • Select or deselect the Key deletion checkbox and click Next.
    • Optional: On the next screen, you can define key usage permissions and click Next.
    • On the Review screen, in the Key policy window, add a new section to the existing Statement by copying and pasting the following code block, then click Finish:
```json
{
  "Sid": "Allow use of the key by Rapid7",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::029723416200:role/learchiving"
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey"
  ],
  "Resource": "*"
}
```
  2. Select Services > Storage > S3.
  3. Select the S3 bucket you are using as the archive and go to the Properties tab.
  4. Under Default encryption, click Edit.
  5. Assign the newly created KMS key to your bucket:
    • Under Server-side encryption, click Enable.
    • Select AWS Key Management Service key (SSE-KMS).
    • Select Choose from your AWS KMS keys and select the key you created from the dropdown list.
    • Under Bucket Key, click Enable.
    • Save your changes.
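Editing the key policy by hand makes it easy to misplace a comma or bracket. As a minimal sketch, the following shows one way to append the Rapid7 statement to an existing key policy document in Python. The sample policy, including the placeholder account ID 111122223333, is hypothetical and stands in for whatever policy the console generated for your key.

```python
import copy
import json
from typing import Dict

# Statement from the key policy step above.
RAPID7_STATEMENT = {
    "Sid": "Allow use of the key by Rapid7",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::029723416200:role/learchiving"},
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey",
    ],
    "Resource": "*",
}


def add_statement(policy: Dict, statement: Dict) -> Dict:
    """Return a copy of the policy with the statement appended, unless its Sid already exists."""
    updated = copy.deepcopy(policy)
    statements = updated.setdefault("Statement", [])
    if any(existing.get("Sid") == statement["Sid"] for existing in statements):
        return updated
    statements.append(copy.deepcopy(statement))
    return updated


if __name__ == "__main__":
    # Hypothetical existing policy; replace it with the one shown on the Review screen.
    existing_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Enable IAM User Permissions",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
                "Action": "kms:*",
                "Resource": "*",
            }
        ],
    }
    print(json.dumps(add_statement(existing_policy, RAPID7_STATEMENT), indent=2))
```

The function leaves the input untouched and skips the append when a statement with the same Sid is already present, so running it twice does not duplicate the entry.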

Configure the Archive in InsightIDR

After you configure the Amazon S3 bucket to receive data from InsightIDR, you must configure InsightIDR to archive data to the bucket.

These factors can impact the time it takes to export the data:

  • Size of the export: The wider the time range you select, the longer it will take to process. For example, data that has been imported into InsightIDR over the course of a year will likely take several days to archive.
  • Region of the Amazon S3 bucket: If you are creating a new bucket, we strongly recommend that it’s in the same region as your InsightIDR account, because the data will be archived faster. This applies to both daily and historical archiving. Read the list of recommended and supported regions.

Tip for saving storage costs

This blog post from Amazon explains how to create a lifecycle rule that identifies and deletes incomplete multipart uploads, which saves you the cost of storing them. We recommend that you create one for both daily and historical archiving: https://aws.amazon.com/blogs/aws-cloud-financial-management/discovering-and-deleting-incomplete-multipart-uploads-to-lower-amazon-s3-costs/
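For reference, a lifecycle rule of this kind can be described as a small configuration document. The sketch below builds one in Python; the 7-day threshold, the rule ID, and the function name are assumptions chosen for illustration, not values recommended by Rapid7 or Amazon.

```python
import json
from typing import Dict


def abort_incomplete_uploads_rule(days_after_initiation: int = 7) -> Dict:
    """Build a lifecycle rule that aborts multipart uploads left incomplete for N days."""
    if days_after_initiation < 1:
        raise ValueError("days_after_initiation must be at least 1")
    return {
        "ID": "abort-incomplete-multipart-uploads",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days_after_initiation},
    }


if __name__ == "__main__":
    lifecycle_configuration = {"Rules": [abort_incomplete_uploads_rule()]}
    print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3, a dictionary of this shape could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`. A bucket has a single lifecycle configuration, so any rules you already have belong in the same `Rules` list.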

To configure the archive:

  1. Log into InsightIDR.
  2. Select Settings on the left menu.
  3. Select Log Search > Data Archiving.
  4. To set up daily archiving, select Activate.
  5. Enter the name of the Amazon S3 bucket where you want to archive the log data.
  6. Click Save and Test. If InsightIDR can establish a connection, a success banner is displayed. If a connection cannot be established, please contact Rapid7 Support and we can add the bucket to the account for you.

Historical Archiving

Historical archiving is useful when you haven’t set up daily archiving and you want to archive all data at once, for example, if you are ending your InsightIDR subscription.

Before you begin

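Before you can perform historical archiving, you must complete these prerequisite steps:

  1. Configure an Amazon S3 bucket
  2. Enable Encryption on the Amazon S3 bucket (Optional)
  3. Configure the Archive in InsightIDR

Note: You don’t need to activate daily archiving.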

Historical archiving limitations

The process of archiving large amounts of data can take several days to complete. For this reason, you can run historical archiving only twice a year. The date range for each archive must be 13 months or less.
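If you plan your archive dates in a script, you can check the 13-month limit before you start. This sketch assumes the limit is measured in calendar months from the start date, which this page does not state explicitly. The function names are hypothetical.

```python
import calendar
from datetime import date

MAX_ARCHIVE_MONTHS = 13


def add_months(day: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping to the month's last day."""
    index = day.year * 12 + (day.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def is_valid_archive_range(start: date, end: date) -> bool:
    """Return True when the range from start to end spans 13 calendar months or less."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return end <= add_months(start, MAX_ARCHIVE_MONTHS)


if __name__ == "__main__":
    print(is_valid_archive_range(date(2023, 1, 15), date(2024, 2, 15)))
    print(is_valid_archive_range(date(2023, 1, 15), date(2024, 2, 16)))
```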

To perform historical archiving, follow these steps:

  1. Select Settings on the left menu.
  2. Select Log Management > Data Archiving.
  3. Under Historical Archiving, click Archive Now.
  4. Enter a name for the archived data.
  5. Specify the date range of the data.
  6. Click Start Archiving.
  7. You can check the progress of the archiving process on the Data Archiving tab.

Long Term Storage

If you are archiving your data strictly for compliance or auditing purposes and you don't anticipate requiring regular access, you should consider setting up Amazon S3 Lifecycle Policies that move your data from Amazon S3 to a more cost-effective, long-term storage option, such as S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive.

Keep in mind that there is a delay involved in retrieving any data that is stored in S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive and there is a cost to making the data accessible.

For more information on Amazon S3 Lifecycle Policies, visit: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html
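As a minimal sketch of such a policy, the following builds a lifecycle rule that transitions objects to a Glacier storage class after a chosen number of days. The 90-day value, the rule ID format, and the function name are assumptions for illustration only. `GLACIER` and `DEEP_ARCHIVE` are the S3 API storage class names for S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive.

```python
import json
from typing import Dict

# API storage class names for the two options discussed above.
ALLOWED_STORAGE_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}


def glacier_transition_rule(days_after_creation: int, storage_class: str = "GLACIER") -> Dict:
    """Build a lifecycle rule that moves objects to a Glacier class after N days."""
    if storage_class not in ALLOWED_STORAGE_CLASSES:
        raise ValueError(f"unsupported storage class: {storage_class!r}")
    if days_after_creation < 0:
        raise ValueError("days_after_creation must not be negative")
    return {
        "ID": "archive-to-" + storage_class.lower().replace("_", "-"),  # hypothetical name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
        "Transitions": [{"Days": days_after_creation, "StorageClass": storage_class}],
    }


if __name__ == "__main__":
    configuration = {"Rules": [glacier_transition_rule(90, "DEEP_ARCHIVE")]}
    print(json.dumps(configuration, indent=2))
```

A bucket has a single lifecycle configuration, so a transition rule like this one would sit in the same `Rules` list as any other lifecycle rules on the bucket.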

Supported Regions

Here are the recommended regions for archiving with Amazon S3:

| Amazon S3 Region | URL |
| --- | --- |
| US_EAST_1 (preferred region for data stored in the United States) | https://s3.us-east-1.amazonaws.com |
| CA_CENTRAL (preferred region for data stored in Canada) | https://s3.ca-central-1.amazonaws.com |
| EU_IRELAND (preferred region for data stored in the EU) | https://s3.eu-west-1.amazonaws.com |
| AP_SYDNEY (preferred region for data stored in Australia) | https://s3.ap-southeast-2.amazonaws.com |
| AP_TOKYO (preferred region for data stored in Japan) | https://s3.ap-northeast-1.amazonaws.com |

Here are the other supported regions:

| Amazon S3 Region | URL |
| --- | --- |
| US_STANDARD | https://s3.amazonaws.com |
| US_WEST_OREGON | https://s3.us-west-2.amazonaws.com |
| US_EAST_OHIO | https://s3.us-east-2.amazonaws.com |
| US_WEST_N_CALIFORNIA | https://s3.us-west-1.amazonaws.com |
| EU_LONDON | https://s3.eu-west-2.amazonaws.com |
| EU_PARIS | https://s3.eu-west-3.amazonaws.com |
| EU_FRANKFURT | https://s3.eu-central-1.amazonaws.com |
| AP_MUMBAI | https://s3.ap-south-1.amazonaws.com |
| AP_SEOUL | https://s3.ap-northeast-2.amazonaws.com |
| AP_SINGAPORE | https://s3.ap-southeast-1.amazonaws.com |
| SA_SAO_PAULO | https://s3.sa-east-1.amazonaws.com |
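If you manage bucket settings in scripts, the two lists above can be kept as simple lookup tables. The sketch below copies the region labels and URLs from this page. The helper names are hypothetical and are not part of any Rapid7 or AWS API.

```python
# Recommended regions, as listed above.
RECOMMENDED_REGIONS = {
    "US_EAST_1": "https://s3.us-east-1.amazonaws.com",
    "CA_CENTRAL": "https://s3.ca-central-1.amazonaws.com",
    "EU_IRELAND": "https://s3.eu-west-1.amazonaws.com",
    "AP_SYDNEY": "https://s3.ap-southeast-2.amazonaws.com",
    "AP_TOKYO": "https://s3.ap-northeast-1.amazonaws.com",
}

# Other supported regions, as listed above.
OTHER_SUPPORTED_REGIONS = {
    "US_STANDARD": "https://s3.amazonaws.com",
    "US_WEST_OREGON": "https://s3.us-west-2.amazonaws.com",
    "US_EAST_OHIO": "https://s3.us-east-2.amazonaws.com",
    "US_WEST_N_CALIFORNIA": "https://s3.us-west-1.amazonaws.com",
    "EU_LONDON": "https://s3.eu-west-2.amazonaws.com",
    "EU_PARIS": "https://s3.eu-west-3.amazonaws.com",
    "EU_FRANKFURT": "https://s3.eu-central-1.amazonaws.com",
    "AP_MUMBAI": "https://s3.ap-south-1.amazonaws.com",
    "AP_SEOUL": "https://s3.ap-northeast-2.amazonaws.com",
    "AP_SINGAPORE": "https://s3.ap-southeast-1.amazonaws.com",
    "SA_SAO_PAULO": "https://s3.sa-east-1.amazonaws.com",
}


def endpoint_for(region_label: str) -> str:
    """Return the S3 endpoint URL for a supported region label, or raise KeyError."""
    if region_label in RECOMMENDED_REGIONS:
        return RECOMMENDED_REGIONS[region_label]
    return OTHER_SUPPORTED_REGIONS[region_label]


def is_recommended(region_label: str) -> bool:
    """Return True when the label is one of the recommended regions."""
    return region_label in RECOMMENDED_REGIONS
```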