User Attribution

In InsightIDR, attribution refers to the attempts the system makes to identify which assets, accounts, and users are involved in the collected log activity. For example, when an event log states that the activity was performed by the account jdoe, InsightIDR uses previously collected information from other event sources to determine whether that account is associated with the user Jane Doe or John Doe.

The InsightIDR attribution engine uses data collected by event sources, endpoint agents, and network sensors to model the relationship between IP addresses and assets over time. This model is used to attribute events that contain a source IP address (but not a source hostname or fully qualified domain name) to an asset. The term source address in InsightIDR refers to the information in an event log that identifies an asset. This might be an IP address, such as 10.101.102.103, or a hostname, such as janeslaptop or janeslaptop.bos.razor.com.

InsightIDR performs 2 types of user attribution:

  • Log line attribution - when a user or account name is present in a log associated with an event source, InsightIDR uses that information to attribute the log to a user. Depending on which details are present in the log and on the event source settings, InsightIDR may use the short account name, full account name, user first and last name, and account aliases from the log to calculate user attribution. This is the most reliable attribution method. If InsightIDR is unable to determine attribution from the log, primary user attribution is used instead.
  • Primary user attribution - If user data is not present in a given log, InsightIDR attributes logs to a user by looking up a primary user on the asset the log was attributed to. InsightIDR calculates the primary user of an asset using an algorithm that counts the number of log-ins per user over the past week. Whichever user has the most log-ins is determined to be the primary user. If there are multiple users with the most log-ins, InsightIDR assumes that the user with the most recent log-ins is the primary user. If there is a known primary user of that asset, that log entry is attributed to both the asset and the user.
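
The primary user calculation described above can be sketched as follows. This is a minimal illustration of the stated rules (most log-ins in the past week, ties broken by the most recent log-in), not InsightIDR's actual implementation. The function name `primary_user` and the `(username, timestamp)` input shape are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def primary_user(logins, now, window=timedelta(days=7)):
    """Return the primary user for one asset, or None if nobody logged in.

    `logins` is an iterable of (username, timestamp) pairs for a single
    asset. This input shape is hypothetical, chosen for illustration.
    """
    # Keep only log-ins that fall inside the look-back window.
    recent = [(user, ts) for user, ts in logins if now - window <= ts <= now]
    if not recent:
        return None

    # Count log-ins per user and remember each user's latest log-in.
    counts = Counter(user for user, _ in recent)
    latest = {}
    for user, ts in recent:
        if user not in latest or ts > latest[user]:
            latest[user] = ts

    # Most log-ins wins; a tie goes to the user who logged in most recently.
    return max(counts, key=lambda user: (counts[user], latest[user]))


now = datetime(2024, 1, 8, 12, 0)
logins = [
    ("jdoe", datetime(2024, 1, 5, 9, 0)),
    ("jdoe", datetime(2024, 1, 7, 9, 0)),
    ("asmith", datetime(2024, 1, 6, 9, 0)),
    ("asmith", datetime(2024, 1, 8, 8, 0)),
    ("olduser", datetime(2023, 12, 1, 9, 0)),  # outside the window, ignored
]
result = primary_user(logins, now)
```

Here `jdoe` and `asmith` each have two log-ins in the window, so the tie goes to `asmith`, whose latest log-in is more recent.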

User attribution algorithm information

User attribution is a best-effort algorithm that is not guaranteed to be accurate. To improve attribution rate and accuracy, you should set up LDAP (for user information) and DHCP (for asset information) event sources for your entire environment and tune user attribution settings on the event sources. If you don't set up at least one directory service, alerts that use directory services as event sources will not fire.

If the InsightIDR attribution engine cannot successfully determine attribution for the asset, you must set a preference for which attribution source is most reliable for you. For example, if the log entry also contains a value in the username field other than the one that InsightIDR identified, such as Jeremy Doe (jdoe@razor.com), set a preference to identify the correct value. This preference is outlined on Advanced Event Source Settings.

Principles of User Attribution

Directory service event sources, such as LDAP and Azure Active Directory, provide a source of truth on who the established users are within a network. After the event source is set up and all the users have been identified, InsightIDR works to attribute any new activity seen through Log Search back to those users. InsightIDR administrators can adjust the polling period at which the event source queries the directory service.

The primary places where InsightIDR users can see the output of user behavior analytics (UBA) are:

  • User Pages - searchable by the global search bar in InsightIDR or linked in various places throughout InsightIDR, user pages are a condensed way to view everything that InsightIDR knows about a certain user. User pages show any administrative groups a user is part of as well as all the accounts, network activity, and alerts or notable behaviors associated with the user.
  • Alerts - if an alert includes user behavior, that user will be tagged in the actors section of the alert payload. This can be accessed in the alert schema or in the actor details within Alert Details.
  • Log Search - when log events are ingested by InsightIDR, the native parsers normalize the log event into JSON. Where appropriate, InsightIDR also injects information into the new JSON log event. For example, if a firewall log comes in, InsightIDR recognizes the IP address of the asset and is able to correlate that using DHCP or other host-to-IP address methods to determine which asset owned that IP address at time of ingestion. Then, InsightIDR takes its knowledge of who is the primary user of that asset (the user with the most log-ins over the course of a month) and adds that user’s information into the log. For more details, visit Rapid7 Resource Names.
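
The Log Search enrichment described above can be sketched roughly as below: resolve the source IP address to the asset that held it at the time of the event, then add that asset's primary user to the normalized event. This is only an illustrative sketch. The `Lease` type, the `source_ip`, `timestamp`, `asset`, and `user` field names, and the `enrich` function are all hypothetical and do not reflect InsightIDR's actual schema or internals.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Lease:
    """One IP-to-asset assignment over a time range (hypothetical shape)."""
    ip: str
    asset: str
    start: datetime
    end: datetime


def asset_for_ip(leases, ip, when) -> Optional[str]:
    """Return the asset that held `ip` at time `when`, or None if unknown."""
    matches = [l for l in leases if l.ip == ip and l.start <= when < l.end]
    if not matches:
        return None
    # If ranges overlap, prefer the assignment that started most recently.
    return max(matches, key=lambda l: l.start).asset


def enrich(event, leases, primary_users):
    """Return a copy of `event` with asset and primary user added when known."""
    enriched = dict(event)
    asset = asset_for_ip(leases, event["source_ip"], event["timestamp"])
    if asset is not None:
        enriched["asset"] = asset
        user = primary_users.get(asset)
        if user is not None:
            enriched["user"] = user
    return enriched


leases = [
    Lease("10.101.102.103", "janeslaptop",
          datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 16)),
    Lease("10.101.102.103", "bobsdesktop",
          datetime(2024, 1, 1, 16), datetime(2024, 1, 2, 0)),
]
event = {"source_ip": "10.101.102.103", "timestamp": datetime(2024, 1, 1, 9)}
enriched = enrich(event, leases, {"janeslaptop": "Jane Doe"})
```

Because the same IP address can belong to different assets at different times, the lookup is keyed on both the address and the event time rather than on the address alone.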

Duplicate users

Occasionally, you may encounter duplicate users in InsightIDR. There are a few scenarios that could create a duplicate user:

  • Certain settings on a cloud service event source prevent attribution of an account to a directory user. As a result, a duplicate, observed user is created. For more information on directory and observed users, visit Users and Accounts.
  • A user is deleted from LDAP (instead of being explicitly disabled) and is marked as removed from InsightIDR after 30 days. A cloud service event source then reports activity from an account, causing InsightIDR to reassign the account to a duplicate, observed user.
  • A misconfiguration in your environment creates a duplicate user. For example, if a user in LDAP has an email address like jsmith@rapid7.com but the same user has a different email address, john.smith@onmicrosoft.com, in Azure Active Directory, then InsightIDR creates a duplicate user. This occurs because InsightIDR can't correlate the users (even though they represent the same person) if the account short names are different.
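
The short-name mismatch in the last scenario can be illustrated with a small sketch. It assumes that the account short name is the part of the address before the "@" and that comparison is case-insensitive. Both are assumptions made for the example, and the helper names `short_name` and `same_user` are hypothetical, not part of InsightIDR.

```python
def short_name(account: str) -> str:
    """Return the part before '@', lowercased (assumed definition)."""
    return account.split("@", 1)[0].strip().lower()


def same_user(account_a: str, account_b: str) -> bool:
    """Treat two accounts as the same user only if their short names match."""
    return short_name(account_a) == short_name(account_b)


# Different short names ("jsmith" vs "john.smith"): cannot be correlated.
mismatch = same_user("jsmith@rapid7.com", "john.smith@onmicrosoft.com")

# Same short name on both sides: the accounts can be correlated.
match = same_user("jsmith@rapid7.com", "jsmith@onmicrosoft.com")
```

Under these assumptions, `mismatch` is `False` and `match` is `True`, which is why aligning account short names across directory services avoids this kind of duplicate.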