User Attribution

In InsightIDR, attribution refers to the attempts the system makes to identify which assets, accounts, and users are involved in the collected log activity. For example, when an event log states that the activity was performed by the account jdoe, InsightIDR uses previously collected information from other event sources to determine whether that account is associated with the user Jane Doe or John Doe.

The InsightIDR attribution engine uses data collected by event sources, endpoint agents, and network sensors to model the relationship between IP addresses and assets over time. This model is used to attribute events that contain a source IP address (but not a source hostname or fully qualified domain name) to an asset. The term source address in InsightIDR refers to the information in an event log that identifies an asset. This might be an IP address, such as 10.101.102.103, or a hostname, such as janeslaptop or janeslaptop.bos.razor.com.

InsightIDR performs two types of user attribution:

  • Log line attribution - when a user or account name is present in a log associated with an event source, InsightIDR uses that information to attribute the log to a user. Depending on what details are present in the log and on the event source settings, InsightIDR may use the short account name, full account name, user first and last name, and account aliases from the log to calculate user attribution. This is the most reliable attribution method. If InsightIDR is unable to determine attribution from the log, primary user attribution is used instead.
  • Primary user attribution - If user data is not present in a given log, InsightIDR attributes logs to a user by looking up a primary user on the asset the log was attributed to. InsightIDR calculates the primary user of an asset using an algorithm that counts the number of log-ins per user over the past week. Whichever user has the most log-ins is determined to be the primary user. If there are multiple users with the most log-ins, InsightIDR assumes that the user with the most recent log-ins is the primary user. If there is a known primary user of that asset, that log entry is attributed to both the asset and the user.
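
The primary user calculation described above can be sketched as follows. This is only an illustration under stated assumptions, not InsightIDR's implementation: the `(username, timestamp)` input shape, the `primary_user` function name, and the `window` parameter are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def primary_user(logins, now, window=timedelta(days=7)):
    """Return the user with the most log-ins inside the window, or None.

    `logins` is an iterable of (username, timestamp) tuples (assumed shape).
    Ties are broken in favor of the tied user who logged in most recently.
    """
    cutoff = now - window
    counts = Counter()
    last_seen = {}
    for user, ts in logins:
        if cutoff <= ts <= now:
            counts[user] += 1
            if user not in last_seen or ts > last_seen[user]:
                last_seen[user] = ts
    if not counts:
        return None
    # Compare by log-in count first, then by most recent log-in.
    return max(counts, key=lambda u: (counts[u], last_seen[u]))


now = datetime(2024, 1, 8, 12, 0)
logins = [
    ("jdoe", datetime(2024, 1, 5, 9, 0)),
    ("jdoe", datetime(2024, 1, 6, 9, 0)),
    ("asmith", datetime(2024, 1, 7, 9, 0)),
    ("asmith", datetime(2024, 1, 8, 9, 0)),
    ("jdoe", datetime(2023, 12, 1, 9, 0)),  # outside the window, ignored
]
print(primary_user(logins, now))  # asmith
```

Here jdoe and asmith each have two log-ins inside the window, so the tie goes to asmith, whose latest log-in is more recent.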

User attribution algorithm information

User attribution is a best-effort algorithm that is not guaranteed to be accurate. To improve attribution rate and accuracy, you should set up LDAP (for user information) and DHCP (for asset information) event sources for your entire environment and tune user attribution settings on the event sources. If you don't set up at least one directory service, alerts that use directory services as event sources will not fire.

If the InsightIDR attribution engine cannot successfully determine attribution for the asset, you must set a preference for which attribution source is most reliable for you. For example, if the log entry contains a value in the username field other than the one that InsightIDR identified, such as Jeremy Doe (jdoe@razor.com), set a preference to identify the correct value. This preference is outlined on Advanced Event Source Settings.

Principles of User Attribution

Directory service event sources, such as LDAP and Azure Active Directory, provide a source of truth on who the established users are within a network. After the event source is set up and all the users have been identified, InsightIDR works to attribute any new activity seen through Log Search back to those users. InsightIDR administrators can adjust the polling period at which the event source queries the directory service. The primary places InsightIDR users can see the output of user behavior analytics (UBA) are:

  • User Pages - searchable by the global search bar in InsightIDR or linked in various places throughout InsightIDR, user pages are a condensed way to view everything that InsightIDR knows about a certain user. User pages show any administrative groups a user is part of as well as all the accounts, network activity, and alerts or notable behaviors associated with the user.
  • Alerts - if an alert includes user behavior, that user will be tagged in the actors section of the alert payload. This can be accessed in the alert schema or in the actor details within Alert Details.
  • Log Search - when log events are ingested by InsightIDR, the native parsers normalize the log event into JSON. Where appropriate, InsightIDR also injects information into the new JSON log event. For example, if a firewall log comes in, InsightIDR recognizes the IP address of the asset and is able to correlate that using DHCP or other host-to-IP address methods to determine which asset owned that IP address at time of ingestion. Then, InsightIDR takes its knowledge of who is the primary user of that asset (the user with the most log-ins over the course of a month) and adds that user’s information into the log. For more details, visit Rapid7 Resource Names.
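
The Log Search enrichment described above can be illustrated with a small sketch. It assumes a hypothetical lease history and primary-user table; the field names (`source_ip`, `ingested_at`, `asset`, `user`) and the functions `asset_for_ip` and `enrich` are invented for this example and are not InsightIDR's actual schema or API.

```python
from datetime import datetime

# Hypothetical DHCP-style lease history: (ip, hostname, lease_start, lease_end).
LEASES = [
    ("10.101.102.103", "janeslaptop", datetime(2024, 1, 8, 8, 0), datetime(2024, 1, 8, 12, 0)),
    ("10.101.102.103", "johnsdesktop", datetime(2024, 1, 8, 12, 0), datetime(2024, 1, 8, 18, 0)),
]

# Hypothetical primary-user table keyed by asset hostname.
PRIMARY_USERS = {"janeslaptop": "Jane Doe"}


def asset_for_ip(ip, when, leases=LEASES):
    """Return the hostname that held `ip` at time `when`, or None."""
    for lease_ip, hostname, start, end in leases:
        if lease_ip == ip and start <= when < end:
            return hostname
    return None


def enrich(event, leases=LEASES, primary_users=PRIMARY_USERS):
    """Return a copy of `event` with asset and user fields added when known."""
    enriched = dict(event)
    asset = asset_for_ip(event["source_ip"], event["ingested_at"], leases)
    if asset is not None:
        enriched["asset"] = asset
        if asset in primary_users:
            enriched["user"] = primary_users[asset]
    return enriched


morning = {"source_ip": "10.101.102.103", "ingested_at": datetime(2024, 1, 8, 9, 30)}
afternoon = {"source_ip": "10.101.102.103", "ingested_at": datetime(2024, 1, 8, 14, 0)}
print(enrich(morning)["asset"], enrich(morning).get("user"))      # janeslaptop Jane Doe
print(enrich(afternoon)["asset"], enrich(afternoon).get("user"))  # johnsdesktop None
```

The same IP address resolves to different assets depending on the time, which is why the model tracks the relationship between IP addresses and assets over time rather than as a fixed mapping.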

Duplicate users

Occasionally, you may encounter duplicate users in InsightIDR. There are a few scenarios that could create a duplicate user:

  • Certain settings on a cloud service event source prevent attribution of an account to a directory user. As a result, a duplicate, observed user is created. For more information on directory and observed users, visit Users and Accounts.
  • A user is deleted from LDAP (instead of being explicitly disabled) and is marked as removed from InsightIDR after 30 days. A cloud service event source then reports activity from an account, causing InsightIDR to reassign the account to a duplicate, observed user.
  • A misconfiguration in your environment creates a duplicate user. For example, if a user in LDAP has an email address like jsmith@rapid7.com but the same user has a different email address, such as john.smith@onmicrosoft.com, in Azure Active Directory, then InsightIDR creates a duplicate user. This occurs because InsightIDR can't correlate the users (even though they represent the same person) if the account short names are different.
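
The short name mismatch in the last scenario can be shown with a minimal sketch. It assumes, purely for illustration, that the short name is the part of the email address before the @ sign; the helper names `short_name` and `same_short_name` are hypothetical and do not describe InsightIDR's actual matching logic.

```python
def short_name(email):
    """Return the part of an email address before the @ sign, lowercased."""
    return email.split("@", 1)[0].lower()


def same_short_name(email_a, email_b):
    """Return True when two addresses share a short name (assumed correlation key)."""
    return short_name(email_a) == short_name(email_b)


# Different short names: the two accounts cannot be correlated.
print(same_short_name("jsmith@rapid7.com", "john.smith@onmicrosoft.com"))  # False
# Matching short names: the two accounts can be correlated.
print(same_short_name("jsmith@rapid7.com", "jsmith@onmicrosoft.com"))      # True
```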

Configure user attribution settings

User attribution settings for all event sources can be configured from Settings > User Attribution.

Active Directory Domain

If you have multiple domains in your environment, it is important that you specify a default domain for your event sources. This setting tells InsightIDR which domain to attribute users to, particularly when the event log does not provide that data.

Deploy in Multi-Domain Environments

If you have more than one Active Directory domain in your environment, specify which domain is your default domain in order to more accurately detect users across domains and resolve any issues with user accounts.

For instance, if your company has DomainA and DomainB, but both domains have a user called John Smith, a default domain specifies which user the activity originated from. In this example, the default domain is DomainA. If InsightIDR receives data from John Smith that does not specify the domain, InsightIDR attributes data to John Smith from DomainA.

A default domain can be set for all event sources when configuring user attribution settings. If a different default domain is required for a specific event source, it can be set at an event source level.
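
The fallback order implied above (the domain in the event, then an event source default, then the global default) can be sketched as follows. The function name `resolve_domain` and its parameters are hypothetical and chosen only for this illustration.

```python
def resolve_domain(event_domain=None, source_default=None, global_default=None):
    """Pick the domain used for attribution.

    Assumed order: a domain stated in the event wins, then the default set
    on the event source, then the default set for all event sources.
    """
    return event_domain or source_default or global_default


# The event names no domain, so the global default applies.
print(resolve_domain(None, None, "DomainA"))       # DomainA
# An event source default overrides the global default.
print(resolve_domain(None, "DomainB", "DomainA"))  # DomainB
# A domain stated in the event is always used as-is.
print(resolve_domain("DomainB", None, "DomainA"))  # DomainB
```

With no default configured at either level, the function returns None, which corresponds to the case where InsightIDR has no basis for choosing between the two John Smith accounts.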

If you do not configure a default domain, InsightIDR may incorrectly attribute user information.

Applicable Event Sources

You can configure default domains for the following event source categories:

You can also view and manage any custom default domain settings that have been applied at an event source level.

Account Attribution Preference

You can select an account attribution preference to apply across all of your event sources, going forward.

  • Use short name attribution: The system first attempts to attribute data by email address, for example, jsmith@myorg.example.com. If the first attempt is unsuccessful, attribution is attempted by short name, for example, jsmith. If the short name is unsuccessful, attribution is attempted by a user’s first and last name, for example, John Smith.
  • Use fully qualified domain name attribution: The system first attempts to attribute data by email address, for example, jsmith@myorg.example.com. If the first attempt is unsuccessful, attribution is attempted by a user’s first and last name, for example, John Smith. This option is best if your environment has collisions with short names.
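
The two lookup orders can be compared with a short sketch. The `User` record, the `attribute` function, and the preference strings `"short_name"` and `"fqdn"` are hypothetical names for this example. Returning the first matching user is also an assumption, since the source does not say how multiple matches are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    email: str
    short_name: str
    full_name: str


# Lookup order per preference, following the two options described above.
LOOKUP_ORDER = {
    "short_name": ("email", "short_name", "full_name"),
    "fqdn": ("email", "full_name"),
}


def attribute(value, users, preference="short_name"):
    """Return the first user matching `value` under the given preference, or None."""
    if preference not in LOOKUP_ORDER:
        raise ValueError(f"unknown preference: {preference!r}")
    needle = value.strip().lower()
    for field in LOOKUP_ORDER[preference]:
        for user in users:
            if getattr(user, field).lower() == needle:
                return user
    return None


users = [User("jsmith@myorg.example.com", "jsmith", "John Smith")]

print(attribute("jsmith", users, "short_name"))       # matches on short name
print(attribute("jsmith", users, "fqdn"))             # None: short names are skipped
print(attribute("John Smith", users, "fqdn").email)   # jsmith@myorg.example.com
```

Under the fully qualified option, a bare short name such as jsmith never matches, which is what protects environments where several users share the same short name.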

You can also view and manage any custom account attribution settings that have been applied at an event source level.

Attribution Source

You can choose the most appropriate attribution source for your environment to be used by all event sources:

  • Use IDR engine if possible; if not, use event log - The user or asset the InsightIDR attribution engine identified is deemed responsible for the log activity. If the attribution engine is not able to identify a user, then the user or asset found in the log entry is deemed responsible.
  • Use event log if possible; if not, use IDR engine - The user or asset identified in the log entry is deemed responsible for the log activity. If the log entry does not contain a known user, use the InsightIDR attribution engine to identify the user or asset.
  • Use IDR engine only - The user or asset the InsightIDR attribution engine identified is deemed responsible for the log activity. Any user information that is found in the log entry will be ignored.
  • Use event log only - Only the user information that is found in the log entry will be used to determine the user or asset. Even if the log entry does not contain a known user, the InsightIDR attribution engine will not be consulted.
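
The four options differ only in which result is consulted and in what order, which a small sketch makes concrete. The `AttributionSource` enum and the `choose` function are hypothetical names for this illustration; `None` stands for "no user or asset identified".

```python
from enum import Enum


class AttributionSource(Enum):
    ENGINE_THEN_LOG = "Use IDR engine if possible; if not, use event log"
    LOG_THEN_ENGINE = "Use event log if possible; if not, use IDR engine"
    ENGINE_ONLY = "Use IDR engine only"
    LOG_ONLY = "Use event log only"


def choose(engine_result, log_result, setting):
    """Return the attribution result selected by `setting` (None if nothing applies)."""
    if setting is AttributionSource.ENGINE_THEN_LOG:
        return engine_result if engine_result is not None else log_result
    if setting is AttributionSource.LOG_THEN_ENGINE:
        return log_result if log_result is not None else engine_result
    if setting is AttributionSource.ENGINE_ONLY:
        return engine_result
    if setting is AttributionSource.LOG_ONLY:
        return log_result
    raise ValueError(f"unknown setting: {setting!r}")


# The engine identified Jane Doe, but the log entry names Jeremy Doe.
for setting in AttributionSource:
    print(setting.name, "->", choose("Jane Doe", "Jeremy Doe", setting))

# With no user in the log entry, only the fallback option consults the engine.
print(choose("Jane Doe", None, AttributionSource.LOG_THEN_ENGINE))  # Jane Doe
print(choose("Jane Doe", None, AttributionSource.LOG_ONLY))         # None
```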

You can also view and manage any custom attribution source settings that have been applied at an event source level.