Set up and use sensitive data classifications
InsightCloudSec offers an integrated sensitive data discovery capability that gives you a unified approach to managing sensitive data risks across your environments. This capability combines Insights with the existing risk scoring and prioritization model that was introduced with Layered Context and is used throughout InsightCloudSec. Currently, this capability supports sensitive data classification using resource tags and can also leverage findings from third-party tools like Amazon Macie.
Feature support
InsightCloudSec currently supports sensitive data classifications for the following resource types from Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure commercial cloud accounts:
InsightCloudSec resource type | AWS equivalent | Azure equivalent | GCP equivalent |
---|---|---|---|
Cloud Dataset | N/A | N/A | BigQuery |
Database | N/A | SQL Database/Dedicated SQL Pool | Cloud SQL Database |
Storage Container | S3 Bucket | Blob Storage Container | Cloud Storage |
Storage Account | N/A | Storage Account | N/A |
Prerequisites
Before InsightCloudSec can begin reporting on sensitive data classification, ensure you have the following:
- InsightCloudSec Domain Admin permissions or the Sensitive Data Classification entitlement
- At least one supported cloud account connected to InsightCloudSec
Set up sensitive data classification
InsightCloudSec consumes a combination of finding metadata from CSPs and resource tagging to surface data classifications and instances of sensitive data in your environments. After InsightCloudSec begins processing the tags and classifications, the following Query Filters and Insight will report on the classification status for supported resources:
- Resource with Sensitive Data Classifications (Query Filter) - Checks if a resource has a sensitive data classification.
- Resource without Sensitive Data Classifications (Query Filter) - Checks if a resource has sensitive data but no criteria (tagging).
- Resource With Missing Data Classification (Insight) - Uses the Resource without Sensitive Data Classifications Query Filter to determine if a given resource has any data classification present. If no data classification is found, a finding is added.
If you use a data security service from a CSP, explore Cloud-based classification. Otherwise, review Manual classification for details on the tagging format.
Manual classification
While InsightCloudSec automatically parses sensitive data findings from CSPs, the resource tagging for manual classification must follow a specific key-value pair format. For a full list of values, review Data Classification (Settings). Rapid7 recommends leveraging continuous integration (CI) tools, Infrastructure-as-Code (IaC), and source control to ensure your deployment templates are automatically tagged appropriately and tracked over time. This is especially true if you do not use a CSP's data security service (like Amazon Macie) or wish to override the classification for a resource. If you have the appropriate InsightCloudSec and CSP permissions, you can add tags directly from the Resource Properties panel or you can add them using a CSP's console or API and InsightCloudSec will harvest them. You can also use the BotFactory to quickly scope resources and tag in the prescribed format.
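As one illustration of tagging through a CSP's API, the following is a minimal Python sketch that adds classification tags to an Amazon S3 bucket with boto3. The function names merge_tag_set and tag_bucket are hypothetical helpers written for this example and are not part of InsightCloudSec. The sketch assumes boto3 is installed and that AWS credentials with permission to read and write bucket tags are configured.

```python
def merge_tag_set(existing, classification):
    """Merge classification tags into an S3-style TagSet list.

    existing: list of {"Key": ..., "Value": ...} dicts already on the bucket.
    classification: dict mapping classification keys to values.
    """
    merged = {tag["Key"]: tag["Value"] for tag in existing}
    merged.update(classification)  # classification values win on conflict
    return [{"Key": k, "Value": v} for k, v in sorted(merged.items())]


def tag_bucket(bucket_name, classification):
    """Apply classification tags to one S3 bucket, keeping its other tags."""
    # Third-party imports live here so merge_tag_set stays stdlib-only.
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    try:
        existing = s3.get_bucket_tagging(Bucket=bucket_name)["TagSet"]
    except ClientError as err:
        # S3 reports an untagged bucket as an error rather than an empty list.
        if err.response["Error"]["Code"] != "NoSuchTagSet":
            raise
        existing = []
    # put_bucket_tagging replaces the whole tag set, hence the merge above.
    s3.put_bucket_tagging(
        Bucket=bucket_name,
        Tagging={"TagSet": merge_tag_set(existing, classification)},
    )
```

For example, calling tag_bucket with a bucket name and the dictionary {"data_sensitivity": "high", "pii": "any"} would write both tags while preserving any unrelated tags already on the bucket. InsightCloudSec would then pick the tags up on a later harvest.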
Supported key-value pairs
The following table outlines the key-value pairs that are recognized as valid classifications. Review Data Classification (Settings) for more information on the full list of values. For assistance with auditing and tracking usage of these tags, we recommend you use the Tag Explorer.
Manual sensitivity overrides
Manually setting the sensitivity will override Cloud-based sensitivity settings.
Key | Description | Values |
---|---|---|
data_sensitivity | Overall sensitivity for the resource. If data_sensitivity is not false, it needs to be paired with at least one category of found sensitive data (for example: pii or phi) | false, high, medium, low |
pii | Data is classified as personally identifiable information (PII). If pii is not any, it has a default sensitivity so it does not need to be paired with data_sensitivity. If pii is any, it needs to be paired with a data_sensitivity severity (for example: high, medium, low) | any or anything PII-related, such as ip_address, marital_status, national_identification_number |
phi | Data is classified as protected health information (PHI). If phi is not any, it has a default sensitivity so it does not need to be paired with data_sensitivity. If phi is any, it needs to be paired with a data_sensitivity severity (for example: high, medium, low) | any or anything PHI-related, such as blood_type, fda_code, health_insurance_number |
credential | Data is classified as a credential. For example, an access token or password. If credential is not any, it has a default sensitivity so it does not need to be paired with data_sensitivity. If credential is any, it needs to be paired with a data_sensitivity severity (for example: high, medium, low) | any or anything credential-related, such as json_web_token, password, openssh_private_key |
financial | Data is classified as financial. For example, a credit card or bank account number. If financial is not any, it has a default sensitivity so it does not need to be paired with data_sensitivity. If financial is any, it needs to be paired with a data_sensitivity severity (for example: high, medium, low) | any or anything financial-related, such as bank_account_number, credit_card_expiration, iban_number |
Example classifications
Valid:
- data_sensitivity: false - Resource will be marked as not having sensitive data.
- pii: blood_type - Resource will be marked as having sensitive blood type data with a severity determined by InsightCloudSec.
- data_sensitivity: high, pii: any - Resource will be marked as having sensitive PII data with a high severity.
- data_sensitivity: medium, pii: any, phi: blood_type-fda_code - Resource will be marked as having sensitive blood type and FDA code data with a severity determined by InsightCloudSec. Multiple sub-types are properly delimited with a hyphen.
Invalid:
- data_sensitivity: high - Invalid because InsightCloudSec cannot infer the category or type of sensitive data. This classification will result in an Insight finding.
- pii: any - Invalid because InsightCloudSec cannot infer a severity from any. This classification will result in an Insight finding.
- data_sensitivity: high, credentials: creit_card - Invalid because InsightCloudSec cannot recognize the type (credit_card is misspelled). This classification will result in an Insight finding.
- pii: ssn,marital_status - Invalid because multiple sub-types of the same category should be hyphen delimited (for example: pii: ssn-marital_status).
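The pairing rules from the table and examples above can be expressed as a small pre-flight check. The following Python sketch is only an approximation of those rules for use before tags are applied. It is not InsightCloudSec's own validation logic, the function name validate_classification is hypothetical, and KNOWN_SUBTYPES contains only the example values listed in this section rather than the full list from Data Classification (Settings).

```python
# Keys and sensitivity values taken from the table in this section.
SENSITIVITY_VALUES = {"false", "high", "medium", "low"}
CATEGORY_KEYS = {"pii", "phi", "credential", "financial"}

# Illustrative subset only: the example sub-types listed in this section.
KNOWN_SUBTYPES = {
    "ip_address", "marital_status", "national_identification_number",
    "blood_type", "fda_code", "health_insurance_number",
    "json_web_token", "password", "openssh_private_key",
    "bank_account_number", "credit_card_expiration", "iban_number",
}


def validate_classification(tags):
    """Return a list of problems found in a resource's classification tags."""
    problems = []
    sensitivity = tags.get("data_sensitivity")
    categories = {k: v for k, v in tags.items() if k in CATEGORY_KEYS}

    if sensitivity is not None and sensitivity not in SENSITIVITY_VALUES:
        problems.append(f"unknown data_sensitivity value: {sensitivity}")

    # A non-false data_sensitivity must be paired with at least one category.
    if sensitivity not in (None, "false") and not categories:
        problems.append("data_sensitivity needs at least one category tag")

    for key, value in categories.items():
        if value == "any":
            # 'any' has no default sensitivity, so a severity is required.
            if sensitivity not in ("high", "medium", "low"):
                problems.append(f"{key}: any needs a data_sensitivity severity")
            continue
        if "," in value:
            problems.append(f"{key}: delimit multiple sub-types with hyphens")
            continue
        for subtype in value.split("-"):
            if subtype not in KNOWN_SUBTYPES:
                problems.append(f"{key}: unrecognized sub-type {subtype}")
    return problems
```

An empty list means the tag set satisfies the rules the sketch knows about. Tag keys outside the five classification keys are ignored, so unrelated resource tags do not cause false alarms.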
Cloud-based classification
If you currently use a CSP's data security service, InsightCloudSec can visualize and assess the risk of your classifications with no manual intervention. Explore the following sections for details on how InsightCloudSec reports on each CSP's service and classifications.
AWS
AWS data sensitivity classifications require the Amazon Macie service. For more information on how Amazon Macie finds and determines data sensitivity, explore Amazon's Macie documentation: https://docs.aws.amazon.com/macie/latest/user/data-classification.html. InsightCloudSec uses the following Query Filter and Insight to help you audit which accounts do not have Macie turned on:
- Cloud Account Without Macie Enabled (AWS) (Query Filter) - Checks if the Macie service is turned on for a given cloud account in the selected regions.
- Cloud Account without Macie Enabled (AWS) (Insight) - Uses the Cloud Account Without Macie Enabled (AWS) Query Filter to determine if Macie is turned on for a given cloud account (in any region). If Macie is not turned on for any region inside the cloud account, a finding is added.
If the Insight is adding findings, follow the Recommended Remediation Steps found in the Insight Details, which is accessed by opening the Insight from the Insights Library.
After ensuring Macie is turned on for all relevant accounts, InsightCloudSec will report on findings related to sensitive data and surface the classification category, type, sensitivity, and number of sensitive objects per resource. Currently, InsightCloudSec only supports automated sensitive data discovery findings from Macie. Explore the Amazon Macie documentation for more information on automated sensitive data discovery: https://docs.aws.amazon.com/macie/latest/user/discovery-asdd.html
AWS Macie findings also supported as a resource
InsightCloudSec supports harvesting AWS Macie data findings as a resource type in the inventory.
Azure
Azure data sensitivity classifications require the Microsoft Defender for Cloud service (Defender CSPM or Defender for Storage plans) with Sensitive Data Discovery turned on. For more information on how Microsoft Defender for Cloud finds and determines data sensitivity, explore Azure Defender for Cloud documentation: https://learn.microsoft.com/en-us/azure/defender-for-cloud/concept-data-security-posture#sensitive-data-discovery. InsightCloudSec uses the following Query Filter and Insights to help you audit which accounts do not have Sensitive Data Discovery turned on:
- Cloud Account Sensitive Data Discovery Status (Query Filter) - Checks if the Defender CSPM and Defender for Storage plans have Sensitive Data Discovery turned on.
- Cloud Account without Defender CSPM Sensitive Data Discovery Enabled (Insight) - Uses the Cloud Account Sensitive Data Discovery Status Query Filter to determine if the cloud account or tenant has Defender CSPM Sensitive Data Discovery turned on. If Defender CSPM Sensitive Data Discovery is not turned on inside the cloud account or organization, a finding is added.
- Cloud Account without Defender for Storage Sensitive Data Discovery Enabled (Insight) - Uses the Cloud Account Sensitive Data Discovery Status Query Filter to determine if the cloud account or tenant has Defender for Storage Sensitive Data Discovery turned on. If Defender for Storage Sensitive Data Discovery is not turned on inside the cloud account or organization, a finding is added.
If the Insights are adding findings, follow the Recommended Remediation Steps found in the Insight Details, which is accessed by opening the Insight from the Insights Library.
After ensuring Sensitive Data Discovery is turned on for all relevant accounts, InsightCloudSec will report on findings related to sensitive data and surface the classification category, type, and sensitivity per resource.
GCP
GCP data sensitivity classifications require the Sensitive Data Protection (formerly known as Data Loss Prevention) service. For more information on how GCP Sensitive Data Protection finds and determines data sensitivity, explore GCP's documentation: https://cloud.google.com/sensitive-data-protection/docs. InsightCloudSec uses the following Query Filter and Insight to help you audit which accounts do not have Sensitive Data Protection turned on:
- Cloud Account Without Sensitive Data Protection Enabled (Query Filter) - Checks if the Sensitive Data Protection service is turned on for a given cloud account or organization.
- Cloud Account Without Sensitive Data Protection in Use (Insight) - Uses the Cloud Account Without Sensitive Data Protection Enabled Query Filter to determine if the Sensitive Data Protection service is turned on for a given cloud account or organization. If Sensitive Data Protection is not turned on inside the cloud account or organization, a finding is added.
If the Insight is adding findings, follow the Recommended Remediation Steps found in the Insight Details, which is accessed by opening the Insight from the Insights Library.
After ensuring Sensitive Data Protection is turned on for all relevant accounts, InsightCloudSec will report on findings related to sensitive data and surface the classification category, type, and sensitivity per resource.
Interact with sensitive data classifications
Now that your sensitive data classifications are represented in InsightCloudSec, you can begin to interact with the available reporting throughout InsightCloudSec:
Infrastructure as Code (IaC)
The IaC feature in InsightCloudSec can be used to validate data classifications and prevent deployments for Terraform and CloudFormation templates with missing data classifications. To do this, you'll need to:
- Create a Custom Insight Pack featuring the Resource With Missing Data Classification Insight.
- Update your deployment templates for the prescribed tagging format.
- Create an IaC scan configuration using the Custom Insight Pack.
- Integrate mimics (IaC scanning tool) with your CI/CD pipeline.
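When updating deployment templates, a quick local check can show which resources still lack a classification tag before a scan runs. The following Python sketch is a hypothetical helper and is not how mimics or the Insight evaluates templates. It assumes a CloudFormation template in JSON format, looks only at AWS::S3::Bucket resources, and checks only for the presence of a classification key, not the validity of its value.

```python
import json

# Tag keys recognized as classifications, from the manual classification table.
CLASSIFICATION_KEYS = {"data_sensitivity", "pii", "phi", "credential", "financial"}


def unclassified_buckets(template_text):
    """Return logical IDs of S3 buckets that carry no classification tag key."""
    template = json.loads(template_text)
    missing = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::S3::Bucket":
            continue
        # CloudFormation expresses bucket tags as a list of Key/Value objects.
        tags = resource.get("Properties", {}).get("Tags", [])
        keys = {tag.get("Key") for tag in tags}
        if not keys & CLASSIFICATION_KEYS:
            missing.append(logical_id)
    return missing
```

A check like this could run as a pre-commit step so that untagged buckets are caught before the template reaches the CI/CD pipeline, where the IaC scan remains the authoritative gate.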
To create a Custom Insight Pack:
- Log in to InsightCloudSec.
- Go to Security > Insights > Custom Packs.
- Click + Create Custom Pack.
- Enter a Pack Name and Description.
- Click OK.
- Optionally, set up a subscription for the pack.
- Go to Security > Insights > Library.
- Search for the Resource With Missing Data Classification Insight.
- Select the checkbox next to it and click Add to Custom Pack.
- Search for the new Custom Pack you created and select it.
- Click OK.
To create an IaC Scan Configuration:
- Go to Security > Infrastructure as Code > Configurations.
- Click + New Configuration.
- Enter a Name.
- Optionally, enter a Description.
- On the Insight Settings tab, search for and select the Custom Pack you just created.
- Optionally, on the Notifications tab, update the Slack and Email notification fields.
- Click Apply.
With a Custom Insight Pack and Scan Configuration configured, you can then begin automatically scanning templates for compliance. For more details on mimics and using IaC in InsightCloudSec, explore the documentation. The following image shows an example of mimics scanning templates:
Layered Context and Risk
Layered Context is one of the primary reporting mechanisms for data sensitivity classifications and provides the quickest way to view the extent of the sensitivity for a given resource. Data sensitivity also can affect the risk score for a resource. If you open Layered Context, you'll see the Sensitive Data column displaying one of a few possible statuses:
- Sensitive - You or the related cloud service have determined the data for this resource is sensitive.
- Not Sensitive - You have determined the data for this resource is not sensitive.
- Not Classified - InsightCloudSec has not identified any data classification for this resource from any of the CSP services or classification through tagging.
- N/A - The resource is not supported by data classification.
If a resource has been marked as sensitive, you can hover your cursor to display a pop-up summary of the data found on the resource. Click Sensitive to open the Resource Properties panel directly to the Sensitive Data tab.
To filter Layered Context based on sensitivity:
- Log in to InsightCloudSec.
- Go to Security > Layered Context.
- Click Add Filter.
- Click Clear All Filters.
- Click Add Filter.
- Configure a filter: Data Classification is Sensitive.
- Click Apply.
Resource Properties
Supported resources have access to a Sensitive Data tab in the Resource Properties panel. There are two sub-tabs on the Sensitive Data tab:
- Data Overview - An overview of the classification type, count, sensitivity, and source of data found on the resource. The classifications are processed and presented by InsightCloudSec, but the data can come from a CSP or manual classification (the source will be Rapid7 in this case).
- Data Findings - A list of the data findings from CSP services associated with the resource. This tab currently only supports displaying Amazon Macie findings.
The Resource Properties panel can be accessed throughout InsightCloudSec, including from the following locations:
- Inventory > Resources
- Security > Layered Context
- Security > Attack Paths
If a resource is deemed sensitive, InsightCloudSec exposes the sensitivity summary information and automatically generates an overall sensitivity (called Resource data sensitivity) for the resource that matches the highest severity sensitivity found. From this tab, you can also view the Insight findings related to data sensitivity for the selected resource. For additional properties details and actions, explore Resources.
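The "highest severity found" rule for Resource data sensitivity can be illustrated with a few lines of Python. This is a sketch of the rule as described above, not InsightCloudSec's implementation. The function name overall_sensitivity is hypothetical, and the three severity levels are assumed from the data_sensitivity tag values (low, medium, high).

```python
# Severity levels in ascending order, assumed from the data_sensitivity values.
SEVERITY_ORDER = ["low", "medium", "high"]


def overall_sensitivity(finding_severities):
    """Return the highest severity present, or None if there are none."""
    ranked = [s for s in finding_severities if s in SEVERITY_ORDER]
    if not ranked:
        return None
    # The position in SEVERITY_ORDER acts as the rank of each severity.
    return max(ranked, key=SEVERITY_ORDER.index)
```

Under this reading, a resource with one low finding and one high finding would show an overall sensitivity of high.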
Want to override the sensitivity?
You can override the sensitivity rating using manual classification.
Tag Explorer
As recommended in the Manual Classification section, you may want to leverage the Tag Explorer to audit data classification tag usage throughout your environments. Review the Tag Explorer documentation for details on creating a tag configuration. The following are some example tag configurations:
- Audit data sensitivity tag usage in general:
  - Tag keys: data_sensitivity
  - Options: Missing all tags
  - Resource types: Storage Container, Dataset, Database, Storage Account
- Audit financial databases:
  - Tag keys: data_sensitivity, financial
  - Options: Contains all tags
  - Resource types: Database
- Audit storage buckets for PII:
  - Tag keys: data_sensitivity, pii
  - Options: Contains all tags
  - Resource types: Storage Container
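To reason about what these configurations select, the two options can be modeled in Python. The sketch below reflects one plausible reading of the option names, which is an assumption: "Contains all tags" is taken to match resources that carry every listed key, and "Missing all tags" is taken to match resources that carry none of them. The function names and the sample inventory are hypothetical. Consult the Tag Explorer documentation for the authoritative behavior.

```python
def contains_all_tags(resource_tags, tag_keys):
    """True if the resource carries every one of the listed tag keys."""
    return all(key in resource_tags for key in tag_keys)


def missing_all_tags(resource_tags, tag_keys):
    """True if the resource carries none of the listed tag keys."""
    return not any(key in resource_tags for key in tag_keys)


# Hypothetical inventory: resource name mapped to its tags.
inventory = {
    "billing-db": {"data_sensitivity": "high", "financial": "any"},
    "scratch-bucket": {"team": "research"},
    "hr-bucket": {"pii": "marital_status"},
}

# Resources the "Audit financial databases" configuration would match.
financial = [
    name for name, tags in inventory.items()
    if contains_all_tags(tags, ["data_sensitivity", "financial"])
]

# Resources with no data_sensitivity tag at all.
untagged = [
    name for name, tags in inventory.items()
    if missing_all_tags(tags, ["data_sensitivity"])
]
```

Note that a resource such as hr-bucket, which has a specific pii value but no data_sensitivity key, would appear in the "Missing all tags" audit even though its classification is valid, so results from that configuration need review rather than automatic remediation.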
Attack Paths
Resources with sensitive data that are on Attack Paths are especially problematic and should be mitigated as soon as possible. The following data sensitivity-related Attack Paths are available:
Attack Path Name | Supported CSPs | Description |
---|---|---|
Publicly Exposed Compute Instance with access to Cloud Trail Data | AWS | An attacker who gains access to the instance can access and manipulate/steal sensitive information, gain access to other cloud resources or disrupt business operations. It can also be used to further pivot within the customer's cloud footprint due to exposing data about additional cloud accounts and resources. |
Publicly Exposed Compute Instance with access to a Bucket Containing PII Data | AWS | When a compute instance has access to PII data stored in an S3 bucket, it can read and potentially manipulate this data thereby posing significant security risks. |
For more information on using Attack Paths, explore the documentation.
Cloud Summary (Risk Overview)
For a high-level overview of your environment's risk, including publicly-available sensitive data, you'll want to continually review the Cloud Summary > Risk Overview. For details on using the Cloud Summary, explore the documentation.
Data Classifications (Settings)
For detailed information on which external sensitive data classifications InsightCloudSec processes, go to Settings > Data Classification. This page assists with exploring the types of data and clouds that InsightCloudSec supports.