Palo Alto Cortex Data Lake
Palo Alto Networks Cortex Data Lake (Cortex Data Lake) provides cloud-based log storage that uses artificial intelligence to analyze all your data at once. You can achieve a unified experience by configuring your Palo Alto products to send all data to Cortex Data Lake.
You can learn more about Cortex Data Lake by visiting the product website at: https://www.paloaltonetworks.com/cortex/cortex-data-lake.
The Cortex Data Lake event source allows InsightIDR to parse the following log types:
- Web Proxy
- Ingress Authentication
- VPN session
- Hostname to IP
- Advanced Malware
- Virus Infection
To set up Cortex Data Lake, you’ll need to:
- Review the requirements.
- Add the Palo Alto Cortex Data Lake event source in InsightIDR.
- Forward logs from Cortex Data Lake to NXLog to InsightIDR.
- Verify the event source configuration works.
Before you begin
The requirements differ depending on the collection method you use to send Palo Alto Cortex Data Lake events to InsightIDR. Make sure the requirements for your collection method are met before you begin setup.
New API Collection Method now available as of January 2023
We've released an API option for our Palo Alto Cortex Data Lake event source.
Phase 1: Review the requirements
Palo Alto requirements
- A valid license for a Palo Alto product that uses Cortex Data Lake.
- A Palo Alto user account with the permissions needed to configure Palo Alto products to send data to Palo Alto Networks Cortex Data Lake.
- Communication enabled between Cortex Data Lake and the host that will be running NXLog, which will be the syslog receiver. Follow the steps provided by Palo Alto to allow an inbound TLS feed to your NXLog host: https://docs.paloaltonetworks.com/cortex/cortex-data-lake/cortex-data-lake-getting-started/get-started-with-log-forwarding-app/forward-logs-from-logging-service-to-syslog-server
- Because the Insight Collector is on an internal network, you will need to open a port on your external firewall to allow the syslog traffic to flow from Cortex Data Lake to the Insight Collector. A common way to do this is with network address translation (NAT). Use NAT to map a public IP address to the Insight Collector’s private IP address, and create a rule that allows only your syslog traffic through your firewall to your Collector. Follow the documentation for your Palo Alto external firewall to create the NAT at: https://docs.paloaltonetworks.com/pan-os/9-1/pan-os-admin/networking/nat.
For information about Palo Alto products that use Cortex Data Lake and their requirements, see Palo Alto’s documentation at: https://docs.paloaltonetworks.com/cortex/cortex-data-lake/cortex-data-lake-getting-started/products-that-use-cortex-data-lake-container/products-that-use-cortex-data-lake
Cortex Data Lake API Requirements
These requirements are applicable when you select Cortex Data Lake API as your collection method:
To use the Cortex Data Lake API, you will need to generate an API key to provide to InsightIDR when setting up the event source. You will also need your Cortex Data Lake Tenant Organization ID and correct credentials.
Learn more about generating required credentials here: https://cortex.pan.dev/docs/data_lake/learn/developer_tokens
These requirements are applicable when you select Listen on Network Port as your collection method:
- Access to the host where you will install and configure NXLog. You may use your InsightIDR Collector as the host that will be running NXLog.
- If you have malware detection software installed on your NXLog host, be sure that the software will not block NXLog from operating.
- Ability to encrypt logs for transfer between Cortex Data Lake and NXLog.
- You must obtain a public certificate for this transfer. A list of Palo Alto trusted certificates can be found here: https://docs.paloaltonetworks.com/cortex/cortex-data-lake/cortex-data-lake-getting-started/get-started-with-log-forwarding-app/trusted-root-ca-log-forwarding-app
- NXLog requires a certificate file and a key file to make the connection to Cortex Data Lake.
- A public hostname that points to the public IP address of the NXLog server and matches the names associated with the certificate.
- If NXLog is running on the same host as your Collector, verify that any local Firewall services or malware detection software allows NXLog to send logs to your Collector. You can read more about these requirements in our NXLog documentation.
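The requirement that the public hostname matches the names associated with the certificate can be checked before you install the certificate. The following is a minimal Python sketch, not part of the Rapid7 or Palo Alto tooling. The function name `hostname_matches` is hypothetical, and the wildcard handling is a simplified assumption: a `*.` entry is treated as covering exactly one left-most label.

```python
from typing import Iterable


def hostname_matches(hostname: str, cert_names: Iterable[str]) -> bool:
    """Return True if hostname is covered by any name listed on the certificate.

    Simplified assumption: a wildcard such as "*.example.com" covers exactly
    one left-most label ("nxlog.example.com" but not "a.b.example.com").
    """
    host = hostname.lower().rstrip(".")
    host_labels = host.split(".")
    for name in cert_names:
        pattern = name.lower().rstrip(".")
        if pattern == host:
            return True
        if pattern.startswith("*."):
            suffix_labels = pattern[2:].split(".")
            # The host must have exactly one extra label in front of the suffix.
            if (len(host_labels) == len(suffix_labels) + 1
                    and host_labels[1:] == suffix_labels):
                return True
    return False
```

For example, `hostname_matches("nxlog.example.com", ["*.example.com"])` returns `True`, where both names are placeholders for your own public hostname and certificate entries.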
Phase 2: Add the Palo Alto Cortex Data Lake event source in InsightIDR
Would you like to send encrypted logs from NXLog to the Collector?
If you want to encrypt the logs from NXLog to the Insight Collector, select the Encrypted checkbox when setting up Cortex Data Lake in InsightIDR, and download the Rapid7 certificate. You will need this certificate when configuring NXLog.
Set up Cortex Data Lake in InsightIDR
- From the left menu, select Data Collection.
- When the Data Collection page appears, click the Setup Event Source dropdown and choose Add Event Source.
- Click the Cloud Services category under Security Data.
- Choose your Collector.
- From the Event Source Type dropdown, choose the Palo Alto Networks Cortex Data Lake event source.
- If you are sending events other than alerts and want to view them in Log Search, select the Send unparsed logs checkbox.
- Select an attribution source.
- Select a collection method.
- If choosing the Listen on Network Port collection method, specify a port and a protocol.
- (Optional) If you chose TCP, you can encrypt the event source by selecting Encrypted and downloading the Rapid7 certificate.
- If choosing the Cortex Data Lake API collection method:
- Create a new credential. Enter a name for the credential in the Name field, and the Cortex Data Lake API key you have previously generated in the API Key field.
- Enter your Cortex Data Lake Organization ID in the Organization ID field.
- Click Save.
Phase 3: Forward logs from Cortex Data Lake to NXLog to InsightIDR
Because Cortex Data Lake requires encrypted log transfer, you must set up a machine that can receive logs from Cortex Data Lake and forward them to the InsightIDR Collector. Rapid7 recommends using syslog by way of NXLog to receive logs from Cortex Data Lake.
Enable communication between Cortex Data Lake and your syslog receiver; follow the steps provided by Palo Alto to allow an inbound TLS feed to your NXLog host. To forward logs from Cortex Data Lake, follow Palo Alto’s documentation at: https://docs.paloaltonetworks.com/cortex/cortex-data-lake/cortex-data-lake-getting-started/get-started-with-log-forwarding-app/forward-logs-from-logging-service-to-syslog-server
To create a rule that allows only this event source’s traffic through your firewall to your designated host, use a network address translation (NAT) to map a public IP address to your Collector’s private IP address. Follow the documentation for your Palo Alto external firewall to create the NAT at: https://docs.paloaltonetworks.com/pan-os/9-1/pan-os-admin/networking/nat.
Key items to note when setting up log forwarding:
- You must configure a log filter to select the logs you want to forward or Cortex Data Lake will not forward any logs at all.
- Select all of the types of logs that you want to forward, and send them to the same event source in InsightIDR, which is the one you will be setting up for Cortex Data Lake events.
- For the log format, you may send the logs in CSV (recommended), LEEF, or CEF format. If you are asked to configure a field delimiter, use the default delimiter for the format you chose.
- InsightIDR does not require client authentication. However, you will need to configure the syslog profile to use the certificate file and key file that you previously obtained from Cortex Data Lake.
- You will not be able to test connectivity between Cortex Data Lake and NXLog until after you have configured NXLog and started the NXLog service.
- For the listening port, specify the port you will use to forward the syslog to NXLog. In the NXLog configuration, you will configure NXLog to listen on the same port.
Configure NXLog to collect encrypted syslog
There are 4 basic steps to configuring the collection of encrypted logs with NXLog.
- Install NXLog.
- Configure Cortex Data Lake to send encrypted logs to NXLog.
- Configure NXLog to decrypt logs and forward them to your Collector.
- Verify log collection and troubleshoot issues.
Task 1: Install NXLog
- Typically NXLog is installed on your Collector, but you can use a different host if you prefer.
- For Windows operating systems, follow our product documentation to install NXLog.
- For Linux operating systems, follow our blog to install NXLog.
Task 2: Configure Cortex Data Lake to send encrypted logs to NXLog
- Configure Cortex Data Lake to use a TCP port to send your encrypted logs.
- Port 6514 is the default port for secure syslog; this port can be used if it is not already in use on your NXLog host.
- Obtain a public certificate to encrypt/decrypt the logs from Cortex Data Lake. Follow Palo Alto’s documentation on trusted certificate issuers at: https://docs.paloaltonetworks.com/cortex/cortex-data-lake/cortex-data-lake-getting-started/get-started-with-log-forwarding-app/trusted-root-ca-log-forwarding-app
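Before you settle on a port for Task 2, you can check whether it is already in use on the NXLog host. The following is a minimal Python sketch under stated assumptions: it runs on the NXLog host while the NXLog service is stopped, and the function name `tcp_port_is_free` is hypothetical. It tries to bind the port and treats any bind failure as "in use", so a permissions error would also be reported as not free.

```python
import socket


def tcp_port_is_free(port: int, address: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (address, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            # Another process holds the port, or binding is not permitted.
            return False
    return True
```

Calling `tcp_port_is_free(6514)` checks the default secure syslog port mentioned above; substitute the port you plan to configure in NXLog.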
Task 3: Configure NXLog to decrypt logs and forward them to your Collector
- If the NXLog service is running on your NXLog host, stop the service before continuing.
- Replace the default NXLog.conf file with the sample configuration file.
- Copy your certificates into a new folder on your NXLog host, or use the \NXLog\cert directory, to store your certificates.
- Edit these NXLog.conf fields to match your environment:
- Start the NXLog service.
For additional information on configuring NXLog, refer to our blog.
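After the NXLog service starts, you can confirm that the listener is reachable before testing from Cortex Data Lake. The following is a minimal Python sketch; the function name `listener_accepts_tcp` is hypothetical. It only confirms that something accepts TCP connections on the port. It does not perform a TLS handshake or validate the certificate.

```python
import socket


def listener_accepts_tcp(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `listener_accepts_tcp("127.0.0.1", 16514)` checks the listening port used in the sample configuration file when run on the NXLog host. To check the path through your firewall and NAT, run it from an outside machine against your public hostname.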
NXLog will forward the logs to the Collector using UDP by default. If you want to send the logs using encrypted syslog (because you selected Encrypted on the event source setup page and downloaded the Rapid7 certificate), you must use the om_ssl module instead of om_udp in the Output section of your nxlog.conf file. Review our documentation on sending encrypted logs.
View a sample NXLog configuration file to ensure you have the correct Output section in your nxlog.conf file.
Example output module for encrypted syslog
<Output out>
    Module om_ssl
    # IP of the IDR Collector
    Host 10.1.1.1
    # Port, must match what is configured for the event source
    Port 6515
    # Exec to_json();
</Output>
Example output module for unencrypted (standard) syslog
<Output out>
    Module om_tcp
    # IP of the IDR Collector
    Host 10.1.1.1
    # Port, must match what is configured for the event source
    Port 6515
    # Exec to_json();
</Output>
To obtain an entire NXLog.conf file, see the sample NXLog configuration file.
Task 4: Verify collection and troubleshoot issues
- Check your NXLog.log file for errors. The log is located where you specified in the NXLog.conf file. If you can't find your NXLog.log file, open the NXLog.conf file and look for the LogFile directive, which sets where the log is created. In the sample configuration file, it is set as: LogFile %ROOT%\data\nxlog.log
- To determine if logs are flowing to InsightIDR, navigate to Data Collection -> Event Sources and find the Palo Alto Cortex Data Lake event source in the list. Select the View Raw Log button on the Palo Alto Cortex Data Lake event source to see if any logs are listed. If the View Raw Log page is empty, then the Collector has not received any logs.
For additional information on verifying collection and troubleshooting NXLog issues, including setting up and troubleshooting for a Linux environment, refer to our blog.
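Checking the NXLog log file for errors can be partly automated. The following is a minimal Python sketch, assuming that each line in the NXLog log begins with a `YYYY-MM-DD HH:MM:SS` timestamp followed by a severity such as `ERROR` or `WARNING`. Verify that assumption against your own log file before relying on it. The function names `find_problems` and `scan_file` are hypothetical.

```python
import re
from typing import Iterable, List

# Assumed line layout: "YYYY-MM-DD HH:MM:SS LEVEL message"
PROBLEM_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (ERROR|WARNING)\b"
)


def find_problems(lines: Iterable[str]) -> List[str]:
    """Return the lines whose severity is ERROR or WARNING."""
    return [line.rstrip("\n") for line in lines if PROBLEM_LINE.match(line)]


def scan_file(path: str) -> List[str]:
    """Read the log file at path and return its ERROR and WARNING lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_problems(handle)
```

Pass `scan_file` the path set by the LogFile directive in your NXLog configuration file.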
Phase 4: Verify the configuration
Complete the following steps to view your logs and ensure events are making it to the Collector.
- On your new Cortex Data Lake event source, click View Raw Log. If you see log messages in the box, this shows that logs are flowing to the Collector.
- In the left menu, click Log Search.
- Select the applicable log sets and, within the sets, select the log names. The log name will be the event source name or Palo Alto Networks Cortex Data Lake if you did not name the event source. Palo Alto Networks Cortex Data Lake logs flow into the Palo Alto Networks Cortex Data Lake log set.
Logs take a minimum of 7 minutes to appear in Log Search
If you see log messages when you select View Raw Log on the event source but do not see any log messages in Log Search after waiting at least 7 minutes, then your logs do not match the recommended format and type for this event source.
Sample NXLog configuration
Review the NXLog reference manual about the configuration options at: https://docs.nxlog.co/userguide/configure/index.html.
Sample configuration file
Set ROOT to the folder where NXLog is installed; otherwise, NXLog will not start.
define ROOT C:\Program Files\nxlog
define CERTDIR C:\Program Files\nxlog\cert
Moduledir %ROOT%\modules
CacheDir %ROOT%\data
Pidfile %ROOT%\data\nxlog.pid
SpoolDir %ROOT%\data
LogFile %ROOT%\data\nxlog.log

<Extension syslog>
    Module xm_syslog
</Extension>
<Input ssl>
    Module im_ssl
    ListenAddr 0.0.0.0:16514
    CAFile %CERTDIR%/datalake.cert
    CertFile %CERTDIR%/plzwork.crt
    CertKeyFile %CERTDIR%/plzwork.key
    Exec parse_syslog();
</Input>
<Output udp_output>
    Module om_udp
    Host 127.0.0.1
    Port 16515
</Output>
<Route 1>
    Path ssl => udp_output
</Route>
The sample log shown is in CSV format with a syslog header.
<190>Feb 06 10:33:21 R7-5050 1,2020/02/06 10:33:21,0009C101184,THREAT,wildfire,0,B2020/02/06 10:33:21,18.104.22.168,22.214.171.124,0.0.0.0,0.0.0.0,lan-wan-allow-all,,,rapid7-base,vsys1,lan-unipa,wan,ethernet1/24.5,ethernet1/23,syslog-forward-profile,2020/02/06 10:33:21,34513570,1,58006,443,0,0,0xf000,tcp,alert,"rapid7.com/",(9999),social-networking,informational,client-to-server,4287654456,0x0,Italy,Ireland,0,,0,,,0,,,,,,,,0,0,0,0,0,vsys-unipa,PA-5050,,,,,0,,0,,N/A,unknown,AppThreat-0-0,0x0
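To see how a log like the sample above breaks down, the following minimal Python sketch splits the syslog header from the CSV body. It is an illustration only, not how InsightIDR parses the event. The function name `parse_line` is hypothetical, and the sketch assumes a header of the form `<PRI>Mon DD HH:MM:SS hostname` followed by a single space and the CSV fields. It does not assign meaning to individual CSV columns.

```python
import csv
import re

# Assumed header layout: "<PRI>Mon DD HH:MM:SS hostname " followed by the CSV body.
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<body>.*)$"
)


def parse_line(line: str) -> dict:
    """Split one syslog line into header values and a list of CSV fields."""
    match = SYSLOG_LINE.match(line.strip())
    if match is None:
        raise ValueError("line does not start with the expected syslog header")
    pri = int(match["pri"])
    # csv.reader handles quoted fields such as "rapid7.com/".
    fields = next(csv.reader([match["body"]]))
    return {
        "facility": pri // 8,  # syslog PRI = facility * 8 + severity
        "severity": pri % 8,
        "timestamp": match["timestamp"],
        "host": match["host"],
        "fields": fields,
    }
```

For the sample log, the priority value 190 decodes to facility 23 and severity 6, the host is R7-5050, and the fourth and fifth CSV fields are THREAT and wildfire.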