Data Collection Methods
When you configure event sources with the Collector, you can use one of the following data collection methods:
Listen on Network Port
You can configure your application to forward log events to a syslog server, and then configure the InsightIDR Collector to listen for that syslog data on a unique network port.
See Syslog Logging for more information.
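As an illustration of the forwarding side, the following is a minimal sketch of an application sending log events over UDP syslog using Python's standard library. The function name `make_syslog_logger`, the logger name, and the host and port values are hypothetical and are not part of InsightIDR; substitute the address and unique port you configured on your Collector.

```python
import logging
import logging.handlers
import socket


def make_syslog_logger(host: str, port: int, name: str = "app.forwarder") -> logging.Logger:
    """Return a logger that forwards each record as a UDP syslog datagram to host:port.

    host and port are assumptions: they stand in for the Collector address and
    the unique port chosen for this event source.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_DGRAM,  # UDP; use socket.SOCK_STREAM for TCP
    )
    logger.addHandler(handler)
    return logger
```

For example, `make_syslog_logger("127.0.0.1", 5514).info("login ok user=alice")` would send one event to a listener on the local host at the hypothetical port 5514.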
Log Aggregator
If you want to collect logs that have already been collected by a SIEM or a Log Aggregator, you can send raw logs to the Collector using a unique port. See SIEMs/Log Aggregators for more information.
SQS
AWS SQS (Amazon Simple Queue Service) is a managed message queuing service that InsightIDR can use to receive messages as events. See AWS SQS for more information.
WMI
WMI (Windows Management Instrumentation) allows your Collector to query your event source applications for events that are related to User Attribution. WMI is available for all Windows-based event sources and is the recommended data collection method whenever possible.
See Ports Used by InsightIDR for port recommendations and other requirements.
Watch Directory
You can monitor a network location that hosts log files copied from a specified directory, on either a local host or a remote host. This monitored location is called a watch directory.
After you set up the configuration, the Collector scans all of the files in the watch directory up to their end-of-file (EOF) position. The Collector then re-scans the directory at specified intervals. During the re-scans, the Collector consumes any net-new lines that were written after the previous EOF position.
Use this collection method for log files that roll over into new files, such as Microsoft DHCP, Microsoft DNS, or IIS log files used in OWA/ActiveSync.
The default scan interval is 30 seconds, but you can adjust it during configuration.
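The scan and re-scan behavior described above can be sketched as follows. This is an illustrative model, not the Collector's actual implementation: the function name `scan_directory` and the `offsets` dictionary are hypothetical, and holding back a trailing partial line until its newline arrives is an assumption.

```python
import os
from typing import Dict, List


def scan_directory(directory: str, offsets: Dict[str, int]) -> List[str]:
    """Return complete lines written since the previous scan and update offsets in place.

    offsets maps each file path to the byte position already consumed.
    A file seen for the first time (for example, after a log rollover)
    starts at position 0, so it is read from the beginning.
    """
    new_lines: List[str] = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        start = offsets.get(path, 0)
        with open(path, "rb") as fh:
            fh.seek(start)
            data = fh.read()
        # Assumption: consume only up to the last newline so that a line
        # still being written is picked up whole on a later scan.
        end = data.rfind(b"\n") + 1
        for raw in data[:end].split(b"\n")[:-1]:
            new_lines.append(raw.decode("utf-8", errors="replace"))
        offsets[path] = start + end
    return new_lines
```

Calling `scan_directory` in a loop with a `time.sleep(30)` between calls would mimic the default scan interval. The sketch does not handle files that are truncated or deleted between scans.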
Watch Directory path options
During configuration, you must specify a path for the directory. The following options are available:
- Kerberos Authentication - Authenticates to a Windows UNC (Universal Naming Convention) path to a hosted network drive using the Kerberos protocol.
- Standard Authentication (NTLM) - Authenticates to a Windows UNC (Universal Naming Convention) path using a standard challenge-response protocol.
- Local Folder - Specify a local folder path or a Windows UNC (Universal Naming Convention) path to a hosted network drive. If the directory contains other files, enter a file pattern to specify which files InsightIDR should collect from the directory.
Watch Directory specifications
- Watch Directory supports SMB v1 (CIFS), SMB v2, and SMB v3. InsightIDR requires packet signing for SMB v2 connections.
- Each new line must be delimited by a `\n` newline character to be considered a log entry. The limit for each line is 8192 characters.
- The Watch Directory collection method requires read permissions to be granted to the service account for the target directory.
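To check a log file against the line specification before pointing an event source at it, a small pre-flight check along these lines may help. The function name `find_overlong_lines` is hypothetical, and the sketch only reports lines over the limit; it makes no claim about how the Collector treats such lines.

```python
from typing import List

MAX_LINE_CHARS = 8192  # per-line limit stated for Watch Directory


def find_overlong_lines(text: str, limit: int = MAX_LINE_CHARS) -> List[int]:
    """Return the 1-based numbers of "\n"-delimited lines longer than limit characters."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if len(line) > limit
    ]
```

Passing `limit=2048` applies the same check with the per-line limit listed for Tail File.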
Watch Directory or Tail File?
Use Watch Directory to monitor multiple files; it is best for log files that roll over into new files. Use Tail File to monitor logs that are written continuously to a single file.
Tail File
You can configure InsightIDR to watch the network location where a host stores log data, and ingest any new data added to the log file on a local or remote host. Using the equivalent of the Unix tail command, InsightIDR collects newly written data from the host disk every 20 seconds.
Tail File specifications
- Tail File reads a file and records the last read position. It uploads only net-new log entries written to the file since the event source started.
- Each new line must be delimited by a `\n` newline character to be considered a line. The limit for each line is 2048 characters.
- During configuration, you must specify a local file path or a Windows UNC (Universal Naming Convention) path to a hosted network drive.
- Tail File supports SMB v1 (CIFS) and SMB v2. InsightIDR requires packet signing for SMB v2 connections.
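The "last read position" behavior in the specifications above can be modeled with a short sketch. The class name `FileTailer` and its `poll` method are hypothetical and only illustrate the idea; waiting for the closing newline before emitting a line is an assumption.

```python
import os
from typing import List


class FileTailer:
    """Track the last read position of a single file and return only newer lines."""

    def __init__(self, path: str) -> None:
        self.path = path
        # Start at the current end of file, so entries written before the
        # tailer started are not collected.
        self.position = os.path.getsize(path)

    def poll(self) -> List[str]:
        """Return complete lines appended since the previous poll."""
        with open(self.path, "rb") as fh:
            fh.seek(self.position)
            data = fh.read()
        # Assumption: a line counts only once its "\n" delimiter is written.
        end = data.rfind(b"\n") + 1
        self.position += end
        return [
            raw.decode("utf-8", errors="replace")
            for raw in data[:end].split(b"\n")[:-1]
        ]
```

Calling `poll()` every 20 seconds would approximate the collection cadence described for Tail File.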
Tail File path options
During configuration, you must specify a path for the directory. Two options are available:
Local File Path
If the log files you want InsightIDR to collect are located on the Insight Collector, you can use the Local File Path option to collect them. In this case, specify the path to the local folder where the log files reside. If the folder contains other files that you do not want InsightIDR to collect, enter a file pattern to specify the files InsightIDR should collect from the folder.
Amazon S3
You can configure InsightIDR to read logs that are stored in an Amazon S3 bucket. For information about setting up an Amazon S3 bucket, see the AWS documentation.
Amazon S3 Log Formatting
The Amazon S3 option supports reading files stored in an S3 bucket that contain newline-delimited plain text (`.txt`, `.log`), CSV (`.csv`), and JSON (`.json`). This option also supports reading supported file types when they have been compressed using `gzip` (for example: `filename.json.gz` or `filename.csv.gz`). In either case, the files should have newline characters separating each log entry or event.
The maximum file size supported for a compressed `.gzip` file is 10 MB.
If the files contain log events spread across multiple lines, each line will be interpreted as a separate event. Amazon S3 does not support reading files with encrypted contents, binary data, or files that have been compressed using a method other than `gzip`.
`.gzip` formatting
If you are using the `gzip` format, the file metadata must contain the term `gzip`, or alternatively the file extension must be set to `.gz`.
Configure the Refresh Rate to your desired collection interval based on the type of logs being written to the S3 bucket. If you are not sure what to use, start with a setting of 5 minutes and adjust as needed.
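As an illustration of the formatting requirements in this section, the sketch below writes events as newline-delimited JSON into a gzip-compressed file with a `.gz` extension and reports the compressed size. The function name `write_gzip_ndjson` is hypothetical, and treating 10 MB as 10,000,000 bytes is a conservative assumption because the exact byte count is not stated. Uploading the file to S3 is out of scope here.

```python
import gzip
import json
import os
from typing import Any, Dict, Iterable

# Assumption: 10 MB is read conservatively as 10,000,000 bytes.
MAX_GZIP_BYTES = 10_000_000


def write_gzip_ndjson(events: Iterable[Dict[str, Any]], path: str) -> int:
    """Write one JSON object per line to a gzip file and return its size in bytes."""
    if not path.endswith(".gz"):
        raise ValueError("use a .gz extension so the file is recognized as gzip")
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as fh:
        for event in events:
            # json.dumps escapes embedded newlines, so each event stays on
            # a single line and is read as a single entry.
            fh.write(json.dumps(event) + "\n")
    return os.path.getsize(path)
```

Comparing the returned size against `MAX_GZIP_BYTES` before uploading shows when a batch needs to be split across several files.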