Troubleshooting

This section provides descriptions of problems commonly encountered when using the application and guidance for dealing with them. If you do need to contact Technical Support, this section will help you gather the information that Support needs to assist you.

Working with log files

If you are encountering problems with the Security Console or Scan Engine, you may find it helpful to consult log files for troubleshooting. Log files can also be useful for routine maintenance and debugging purposes.

This section does not cover the scan log, which relates to scan events. See Viewing the scan log.

Locating each log file and understanding its purpose

Log files are located in the /nsc/logs subdirectory of the installation directory on the Security Console and in the /nse/logs subdirectory of the installation directory on Scan Engines. The following log files are available:

  • access.log (on the Security Console only): This file captures information about resources that are being accessed, such as pages in the Web interface. At the INFO level, access.log captures useful information about API events, such as APIs that are being called, the API version, and the IP address of the API client. This is useful for monitoring API use and troubleshooting API issues. The file was called access_log in earlier product versions.
  • auth.log (on the Security Console only): This file captures each logon or logoff as well as authentication events, such as authentication failures and lockouts. It is useful for tracking user sessions. This file was called um_log in earlier product versions.
  • nsc.log (on the Security Console only): This file captures system- and application-level events in the Security Console. It is useful for tracking and troubleshooting various issues associated with updates, scheduling of operations, or communication issues with distributed Scan Engines. Also, if the Security Console goes into Maintenance Mode, you can log on as a global administrator and use the file to monitor Maintenance Mode activity.
  • nse.log (on the Security Console and distributed Scan Engines): This file is useful for troubleshooting certain issues related to vulnerability checks. For example, if a check produces an unexpected result, you can look at the nse.log file to determine how the scan target was fingerprinted. On distributed Scan Engines only, this file also captures system- and application-level events not recorded in any of the other log files.
  • mem.log (on the Security Console and distributed Scan Engines): This file captures events related to memory use. It is useful for troubleshooting problems with memory-intensive operations, such as scanning and reporting.
  • bootstrap.*.log: This file contains relevant logs for the installation and registration of the collector.
  • collector.log*: This file contains relevant logs for general collector errors.
  • engine_communication.log: This file contains relevant logs for communication between the Security Console and remote scan engines.
  • eso.log: This file contains relevant logs for Automated Actions.
  • initdb.log: This file contains relevant logs for the database.
  • audit.log: This file records Security Console user creation events, deletion events, role changes, and site configuration changes.

In earlier product versions, API information was stored in nsc.log.

Structure and contents of log files

Log files have the following format:

[yyyy-mm-ddThh:mm:ss GMT] [LEVEL] [Thread: NAME] [MESSAGE]

Example:

2011-12-20T16:54:48 [INFO] [Thread: Security Console] Security Console started in 12 minutes 54 seconds

The date and time correspond to the occurrence of the event that generates the message.
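As a rough illustration of the format above, the following Python sketch splits a log line into its fields. The regular expression and the function name parse_log_line are assumptions made for this example and are not part of the product. The pattern is modeled on the sample line, which omits the brackets around the timestamp and the message, so both bracketed and unbracketed timestamps are accepted.

```python
import re
from datetime import datetime

# Pattern modeled on the sample line in this section (an assumption, not an
# official grammar): timestamp, [LEVEL], [Thread: NAME], then free-form text.
LOG_LINE = re.compile(
    r"^\[?(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?: GMT)?\]?\s+"
    r"\[(?P<level>[A-Z]+)\]\s+"
    r"\[Thread: (?P<thread>[^\]]*)\]\s+"
    r"(?P<message>.*)$"
)

def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y-%m-%dT%H:%M:%S")
    return fields

sample = ("2011-12-20T16:54:48 [INFO] [Thread: Security Console] "
          "Security Console started in 12 minutes 54 seconds")
record = parse_log_line(sample)
print(record["level"], record["thread"])
```

Lines that do not follow the pattern, such as the continuation lines of a stack trace, return None and can be skipped or attached to the preceding record.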

Every log message has a severity level:

| Level | Meaning | Example |
|-------|---------|---------|
| ERROR | An abnormal event that prevents successful execution of system processes and can prevent user operations, such as scanning | The Security Console’s failure to connect to the database |
| WARN | An abnormal event that prevents successful execution of system processes but does not completely prevent a user operation, such as scanning | Disruption in communication between the Security Console and a remote Scan Engine |
| INFO | A normal, expected event that is noteworthy for providing useful information about system activity | The Security Console’s attempts to establish a connection with a remote Scan Engine |
| DEBUG | A normal, expected event that need not be viewed except for debugging purposes | The execution of operations within the Security Console/Scan Engine protocol |

When reading through a log file to troubleshoot major issues, you may find it useful to look for ERROR- and WARN-level messages first. Thread identifies the process that generated the message.
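One way to apply this advice to a large file is to filter it by severity before reading it. The sketch below is a minimal example, assuming each message carries its level as a bracketed tag as in the format shown earlier. The function name lines_at_or_above is hypothetical, and the second and third sample lines are invented placeholders rather than real product output.

```python
import re

# Severity order taken from the severity table above, lowest to highest.
SEVERITY = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}
LEVEL_TAG = re.compile(r"\[(DEBUG|INFO|WARN|ERROR)\]")

def lines_at_or_above(lines, minimum="WARN"):
    """Yield lines whose first [LEVEL] tag is at or above `minimum`."""
    threshold = SEVERITY[minimum]
    for line in lines:
        tag = LEVEL_TAG.search(line)
        if tag and SEVERITY[tag.group(1)] >= threshold:
            yield line.rstrip("\n")

# The first line is the sample from this section; the other two are
# placeholders invented for illustration only.
sample = [
    "2011-12-20T16:54:48 [INFO] [Thread: Security Console] Security Console started in 12 minutes 54 seconds",
    "2011-12-20T16:55:02 [WARN] [Thread: Placeholder] placeholder warning message",
    "2011-12-20T16:55:09 [ERROR] [Thread: Placeholder] placeholder error message",
]
for line in lines_at_or_above(sample):
    print(line)
```

To filter a real file, pass an open file object instead of the sample list, for example `lines_at_or_above(open(path))` with a path of your choosing.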

Configuring which log severity levels are displayed

By default, all log files display messages with severity levels of INFO and higher. This means that they display INFO, WARN, and ERROR messages and do not display DEBUG messages. You can change which severity levels are displayed in the log files. For example, you might want to filter out all messages except for those with WARN and ERROR severity levels. Or, you may want to include DEBUG messages for maintenance and debugging purposes.

Using log commands

The Security Console and the embedded Scan Engine log levels can be controlled using console commands. For steps to configure log levels for a distributed Scan Engine, see the Using the file system section of this article. For more information on console commands, see Using the command console.

| Command | Description | Examples |
|---------|-------------|----------|
| log list | List all logging configuration properties. | log list |
| log set [<name>] [value] | Set a logging configuration property to a specified value. Omit the name parameter to set all properties to the specified value. Use log list to view available property names. | log set DEBUG, log set auth-level debug |
| log reset [<name>] | Reset a logging configuration property to its default value. Omit the name parameter to reset all properties to their default value. Use log list to view available property names. | log reset, log reset auth-level |

After issuing a command, the change is applied after approximately 30 seconds.

Using the file system

Configuration steps are identical for the Security Console and distributed Scan Engines. To configure which log severity levels are displayed, take the following steps:

In the user-log-settings.xml file, default refers to the nsc.log file or nse.log file, depending on whether the installed component is the Security Console or a distributed Scan Engine.

  1. In a text editor, open the user-log-settings.xml file, which is located in the [installation_directory]/nsc/conf directory.
  2. Uncomment the following line by removing the opening and closing comment tags (<!-- and -->): <!-- <property name="default-level" value="INFO"/> -->
  3. If you want to change the logging level for the nsc.log file (for Security Console installations) or the nse.log file (for Scan Engine installations), leave default in the property name unchanged. Otherwise, replace default in the property name with one of the following to specify a different log file:
    • auth
    • access
    • mem
  4. Change the value in the line to your preferred severity level: DEBUG, INFO, WARN, or ERROR. Example: <property name="default-level" value="DEBUG"/>
  5. To change log levels for additional log files, simply copy and paste the un-commented line, changing the values accordingly. Examples:
     <property name="default-level" value="DEBUG"/>
     <property name="auth-level" value="DEBUG"/>
     <property name="access-level" value="DEBUG"/>
     <property name="mem-level" value="DEBUG"/>
  6. Save and close the file.

The change is applied after approximately 30 seconds.
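If you maintain these settings for several installations, the property lines described in the steps above can be generated rather than typed by hand. The following Python sketch is an illustration only: the function name level_property is hypothetical, and the accepted log names and levels are limited to the ones listed in this section.

```python
from xml.sax.saxutils import quoteattr

# Log names and severity levels are the ones listed in the steps above.
LOG_NAMES = ("default", "auth", "access", "mem")
LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")

def level_property(log_name, level):
    """Return one <property> line for user-log-settings.xml."""
    if log_name not in LOG_NAMES:
        raise ValueError(f"unknown log name: {log_name}")
    if level not in LEVELS:
        raise ValueError(f"unknown severity level: {level}")
    # quoteattr adds the surrounding double quotes and escapes the value.
    return f"<property name={quoteattr(log_name + '-level')} value={quoteattr(level)}/>"

print(level_property("auth", "WARN"))
```

Rejecting unknown names and levels up front avoids writing a line that the application would not recognize.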

Send diagnostic logs to Rapid7 Support

Diagnostic logs generated by the Security Console and Scan Engines can be sent to Rapid7 Support via the diagnostics page:

  1. In your Security Console, navigate to the Administration page.
  2. Under the Maintenance, Storage and Troubleshooting section, click Diagnose.
  3. Check the desired diagnostics boxes.
  4. Click Send Logs.

Insight Support Application

The Insight Support Application is an optional remote log retrieval feature that allows Support to access your log files. With the Insight Support Application, you do not need to attach logs to support cases or send logs manually.

To opt in:

  1. Go to Administration > Console > Updates.
  2. Check the box to allow Rapid7 to remotely build and transmit support packages.

Send logs via a proxy server

If the Security Console does not have direct internet access, you can use a proxy server for sending logs to Rapid7 Support.

To configure proxy settings for sending logs:

  1. In your Security Console, navigate to the Administration page, and click Console > Proxy Settings.
  2. Enter the proxy server information in the appropriate fields.
    • The Name or address field refers to the fully qualified domain name or IP address of the proxy server.
    • The Port field is the port number on the proxy server that the Security Console connects to when sending log files.
    • The Security Console uses the information in the Domain, User ID, and Password fields to be authenticated on the proxy server.
  3. Check the Send support logs box.
  4. Click Save.

Troubleshooting scan accuracy issues with logs

If your scans are producing inaccurate results, such as false positives, false negatives, or incorrect fingerprints, you can use a scan logging feature to collect data that could help the Technical Support team troubleshoot the cause. Enhanced logging is a feature that collects information useful for troubleshooting, such as Windows registry keys, SSH command executions, and file versions, during a scan.

Following is a sample from an Enhanced logging file:

<ace:collected_object>
<ace:source-id>0</ace:source-id>
<ace:thread-activity>do-unix-create-system-fingerprint@example.com:22</ace:thread-activity>
<ace:remote_execution_item id="42">
<ace:command>freebsd-version</ace:command>
<ace:rc datatype="int">0</ace:rc>
<ace:stdin status="does not exist"/>
<ace:stdout>10.0-RELEASE
</ace:stdout>
<ace:stderr status="does not exist"/>
<ace:start_time datatype="int">1443208125966</ace:start_time>
<ace:end_time datatype="int">1443208125982</ace:end_time>
</ace:remote_execution_item>
</ace:collected_object>
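If you want to inspect such a file yourself before sending it, a short script can list each executed command with its return code, output, and elapsed time. The Python sketch below works on a fragment copied from the sample above. It makes two assumptions that are not confirmed by this document: the sample does not declare the ace namespace prefix, so the code wraps the fragment in a root element with a placeholder namespace URI, and the start and end times are treated as plain integers in the same unit (they appear to be epoch milliseconds).

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (hypothetical): the sample fragment does not
# declare the "ace" prefix, so one is supplied to make it parseable.
NS = {"ace": "urn:example:ace"}

FRAGMENT = """
<ace:remote_execution_item id="42">
<ace:command>freebsd-version</ace:command>
<ace:rc datatype="int">0</ace:rc>
<ace:stdout>10.0-RELEASE
</ace:stdout>
<ace:start_time datatype="int">1443208125966</ace:start_time>
<ace:end_time datatype="int">1443208125982</ace:end_time>
</ace:remote_execution_item>
"""

def summarize(fragment):
    """Return (command, return code, stdout, elapsed) for each executed command."""
    root = ET.fromstring(f'<root xmlns:ace="{NS["ace"]}">{fragment}</root>')
    rows = []
    for item in root.iterfind(".//ace:remote_execution_item", NS):
        start = int(item.findtext("ace:start_time", namespaces=NS))
        end = int(item.findtext("ace:end_time", namespaces=NS))
        rows.append((
            item.findtext("ace:command", namespaces=NS),
            int(item.findtext("ace:rc", namespaces=NS)),
            item.findtext("ace:stdout", default="", namespaces=NS).strip(),
            end - start,
        ))
    return rows

print(summarize(FRAGMENT))
```

For the sample item, the elapsed value is the difference between the two recorded times, 16.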

Using this feature involves two major steps:

  1. Run a scan with a template that has Enhanced logging fully enabled on assets where inaccurate results are occurring.
  2. Send a file containing Enhanced logging data to Technical Support.

It is recommended that you scan individual assets or small sites when Enhanced logging is enabled.

Enable Enhanced logging in a custom scan template

Enhanced logging is enabled by default on the Asset Configuration Export scan template. You may, however, want to scan with a custom template that has been tuned to perform better in your specific environment. To enable Enhanced logging on a custom scan template:

  1. On the Administration page, click Scans > Templates.
  2. In the configuration of the new template, click the Logging tab.
  3. Select the Enhanced logging check box to enable Enhanced logging.
  4. Configure the rest of the template as desired and save it.

Run an authenticated scan with the Enhanced logging-enabled template

If you want to scan an entire site with the template, add it to a site configuration and then scan the site. See Selecting a scan template.

Enhanced logging gathers a significant amount of data, which may impact disk space, depending on the number of assets you scan.

If you want to manually scan a specific asset with the template, add the template in the scan dialog. See Running a manual scan.

Retrieve the collected Enhanced logging data and send it to Technical Support

Access the summary page of the desired scan to download the enhanced logging scan data:

  1. On the Home tab of your Security Console, browse to the "Sites" table.
  2. Click the corresponding link in the "Scan Status" column to open the summary page for your desired scan.
  3. Click Download > Scan data.
  4. Send the downloaded zip file to Rapid7 Support via the Support Portal.

NOTE

If the zip file is larger than 25MB, a Rapid7 Support engineer will facilitate a secure file transfer for your use.

Reporting incorrectly identified OS

You can report incorrectly fingerprinted operating systems by clicking the Report an incorrectly identified asset icon next to the listed OS on the Asset or Node page.

Running diagnostics

You can run several diagnostic functions to catch issues that may be affecting system performance.

Selecting diagnostic routines

To run diagnostics for internal application issues:

  1. Click the Administration tab, then click Troubleshooting > Troubleshooting.
  2. Click the check box for each diagnostics routine you want to perform.

After performing the requested diagnostics, the Security Console displays a table of results. Each item includes a red or green icon, indicating whether or not an issue exists with the respective system component.

Addressing a failure during startup

If a critical subsystem error occurs during startup, the application attempts to queue an appropriate maintenance task to respond to that failure. It then restarts in maintenance mode.

If you are an administrator, you can log on and examine the cause of failure. If required, you can take certain steps to troubleshoot the issue.

Two types of recovery tasks are available:

  • DBConfig task is triggered when the application is unable to connect to the configured database. It allows you to test the database configuration settings and save them if the test succeeds.
  • Recovery task is a general recovery task that is triggered when an unknown failure occurs during startup. This is very rare and happens only when one or more of the configuration files is not found or is invalid. This task allows you to view the cause of the failure and upload support logs to a secure log server, where they can be used for troubleshooting.

In the case of extremely critical failures, the application may fail to restart in maintenance mode. This can happen if the default port 3780 is not available to the maintenance Web server, for example because an instance of the application is already running, or if one or more of the key configuration files is invalid or missing. These files have extensions such as .nsc, .xml, and .userdb.

Addressing failure to refresh a session

When a Web interface session times out after being idle, the Security Console displays the logon window so that the user can refresh the session. If a communication issue between the Web browser and the Security Console Web server prevents the session from refreshing, the user sees an error message. A user with unsaved work should not leave the page or close the browser, because the work may still be recoverable once the communication issue is resolved.

A communication failure may occur for one of the following reasons. If any of these is the cause, take the appropriate action:

  • The Security Console is offline. Restart the Security Console.
  • The Security Console has been disconnected from the Internet. Reconnect the Security Console to the Internet.
  • The user’s browser has been disconnected from the Internet. Reconnect the browser to the Internet.
  • The Security Console address has changed. Clear the address resolution protocol (ARP) table on the computer hosting the browser.

An extreme delay in the Security Console’s response to the user’s request to refresh the session also may cause the failure message to appear.

Resetting account lockout

When a user attempts to log on too many times with an incorrect password, the application locks out the user until the lockout is reset for that user.

The default lockout threshold is 4 attempts. A global administrator can change this parameter on the Security Console Configuration—Web Server page. See Changing the Security Console Web server default settings.

You can reset the lockout using one of the following three methods:

  • If you’re a global administrator, go to the Users page, and click the padlock icon that appears next to the locked-out user's name.
  • Run the console command unlock account. See Using the command console.
  • Restart the Security Console. This is the only method that will work if the locked out user is the only global administrator in your organization.

Long or hanging scans

Occasionally, a scan takes an unusually long time, or appears to have completely stopped.

It is not possible to predict exactly how long a scan should take. Scan times vary depending on factors such as the number of target assets and the thoroughness or complexity of the scan template. However, you can observe whether a scan is taking an exceptionally long time to complete by comparing the scan time to that of previous scans.

In general, if a scan runs longer than eight hours on a single host, or 48 hours on a given site, it is advisable to carry out some checks. Sometimes when a scan is started and a communication issue arises between the engine and the console, the console does not receive the message that the scan has completed. This causes a hung scan.
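The rule of thumb above can be written down as a simple check, for example in a monitoring script. This is a minimal sketch: the function name warrants_checks and the "host" and "site" labels are hypothetical, and only the eight-hour and 48-hour figures come from this section.

```python
from datetime import timedelta

# Rule-of-thumb thresholds from the paragraph above.
HOST_LIMIT = timedelta(hours=8)
SITE_LIMIT = timedelta(hours=48)

def warrants_checks(elapsed, scope):
    """Return True if a scan has run longer than the rule of thumb.

    scope is "host" for a scan of a single host or "site" for a site scan.
    """
    limits = {"host": HOST_LIMIT, "site": SITE_LIMIT}
    return elapsed > limits[scope]

print(warrants_checks(timedelta(hours=9), "host"))
print(warrants_checks(timedelta(hours=30), "site"))
```

A result of True only suggests that the scan deserves a closer look. As noted earlier, comparing against the duration of previous scans of the same targets gives a better baseline.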

If you believe you have a hung scan, complete the following steps to clear it from your Security Console:

  1. Restart your Security Console and Scan Engine.
  2. Log in to your Security Console.
  3. Check the Current Scans for all Sites section on the Nexpose homepage to verify that your scan was cleared.

If your scan was cleared, no further action is needed. If the scan is still listed, complete the following additional steps:

Tip for addressing delayed scan operations

If you attempt to start, pause, resume, or stop a scan, and a message appears for a long time indicating that the operation is in progress, this may be due to a network-related delay in the Security Console's communication with the Scan Engine. In networks with low bandwidth or high latency, delayed scan operations may result in frequent time-outs in Security Console/Scan Engine communication, which may cause lags in the Security Console receiving scan status information. To reduce time-outs, you can increase the Scan Engine response time out setting. See Configuring Security Console connections with distributed Scan Engines.

Scan Engine Memory

Scan Engines require a substantial amount of system memory to operate properly. If your Scan Engine has run out of memory, one of the following conditions may be the cause. If you find that your memory issues are not due to any of these conditions, we recommend that you contact customer support.

Hardware Requirements

Hardware requirements depend on your scanning needs. For recommendations, see the System Requirements page.

Distribute Scans for Optimal Performance

While a single scan engine is capable of scanning in excess of 20,000 assets per day, it is recommended to distribute scans across multiple scan engines for optimal performance. At this time, only x86 architecture is supported.

The Scan Engine host is not dedicated

Running other major applications on the Scan Engine host consumes memory that the Scan Engine needs. Confirm that your Scan Engine host is dedicated before attempting other troubleshooting methods.

Not enough memory for a given number of assets

A Scan Engine's memory requirements increase with the number of assets scanned. A rough estimate of the recommended memory requirements for various asset volumes is presented in the system requirement documentation. If you need to scan additional assets, increase your system’s memory by adding additional RAM and hard drive storage.

Too many simultaneous assets for given resources

Independent of how many assets a Scan Engine can handle overall, there is also a limit to the number of assets an engine can scan simultaneously. If a template is configured to scan too many assets at the same time, or if multiple scans are running concurrently, memory issues can result. Simultaneous scanning of assets contributes to the total asset scan limit. If you believe this to be the potential cause of your memory issues, try scanning fewer assets per group.

Too many total assets

There is an upper threshold for the number of assets that a Scan Engine can handle both simultaneously and overall. It is not recommended to run authenticated scans against more than 20,000 assets per Scan Engine or against more than 400 assets concurrently. Exceeding these recommendations may cause the Scan Engine to run out of memory.

Too many simultaneous threads per scan

We recommend the following thread limits per scan as dictated by host resources.

  • There should be no more than 50 threads per 4GB of RAM.
  • There should be no more than 25 threads per processor core during an authenticated scan.

Exceeding thread recommendations with a misconfigured scan template or through initiating simultaneous scans can lead to potential memory issues.

Using a Local Scan Engine

We do not recommend scanning more than 1000 assets simultaneously when using a Local Scan Engine. Exceeding this recommended asset limit can lead to memory issues. If you intend to deploy a production scanning environment on a larger scale, we recommend a Distributed Scan Engine.
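As a worked example of the thread limits listed under Too many simultaneous threads per scan, the sketch below computes an upper bound for a given host. The function name max_recommended_threads is hypothetical. The per-core limit is applied only to authenticated scans, because that is the only case for which this section states it. Rounding down for RAM sizes that are not a multiple of 4 GB is also an assumption.

```python
def max_recommended_threads(ram_gb, cores, authenticated=True):
    """Upper bound on scan threads from the limits listed in this section.

    50 threads per 4 GB of RAM always applies; 25 threads per core is
    applied only for authenticated scans, as the text states it.
    """
    by_ram = int(ram_gb * 50 / 4)
    if not authenticated:
        return by_ram
    return min(by_ram, cores * 25)

print(max_recommended_threads(ram_gb=16, cores=4))
```

For a host with 16 GB of RAM and 4 cores, the RAM limit allows 200 threads and the core limit allows 100, so the lower value applies to an authenticated scan.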

Scan Engine host is running an unsupported operating system

We recommend utilizing an operating system from our supported list. Other operating systems have not been tested and using an unsupported OS could result in performance and memory issues.

Scan complexity

For every target host that it discovers, the application scans its ports before running any vulnerability checks. The range of target ports is a configurable scan template setting. Scan times increase in proportion to the number of ports scanned.

In particular, scans of UDP ports can be slow, since the application, by default, sends no more than two UDP packets per second in order to avoid triggering the ICMP rate-limiting mechanisms that are built into TCP/IP stacks for most network devices.

To increase scan speed, consider configuring the scan to only examine well-known ports, or specific ports that are known to host relevant services. See Working with scan templates and tuning scan performance.
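To see why the port range matters so much for UDP, consider a back-of-the-envelope estimate based on the default rate mentioned above. The sketch assumes one packet per port, no retries, and a single target host, none of which is stated in this document, so the result is a lower bound rather than a prediction. The 1 to 1023 range used for well-known ports is the conventional definition, not a product setting.

```python
def udp_scan_seconds(port_count, packets_per_second=2.0, packets_per_port=1):
    """Lower-bound time in seconds to probe `port_count` UDP ports on one host."""
    return port_count * packets_per_port / packets_per_second

all_ports = udp_scan_seconds(65535)   # full port range
well_known = udp_scan_seconds(1023)   # ports 1-1023
print(f"all ports: {all_ports / 3600:.1f} h, well-known: {well_known / 60:.1f} min")
```

Under these assumptions, a full-range UDP scan of one host takes about 9.1 hours, while the well-known ports take about 8.5 minutes.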

Scan Engine offline

If the Scan Engine goes offline during a scan, the scan will appear to hang. The database then needs to remove data from the incomplete scan. This process leaves messages similar to the following in the scan log:

DBConsistenc3/10/09 12:05 PM: Inconsistency discovered for dispatched scan ID 410, removing partially imported scan results...

If a Scan Engine goes offline, restart it. Then, go to the Scan Engine Configuration panel to confirm that the Scan Engine is active. See the Distributed Scan Engines page for more information.

Viewing the scan log

You can download an activity log for a completed scan or send a scan data package directly to Support for troubleshooting purposes.

To access these log options:

  1. On the Home page of your Security Console, browse to the Sites table. Click the name of the site you want to inspect to open it.
  2. In the Site Scan Summary section, click View Scan History.
  3. Browse to the Past Scans table:
    • To download a scan log for the past scan of your choosing, click the corresponding icon in the Download Log column.
    • To send a scan data package to Support for troubleshooting purposes, click the icon in the Send log column.

Scan log and data options are also available from the Scan Progress section of the past scan detail view. Click the date and time link in the Completed column to access this page for a particular past scan.

Scan data packages

The Scan data package also includes the scan log, but is much larger and contains information usable only by the Support team for troubleshooting purposes.

Scan stopped by a user

If another user stops a scan, the scan will appear to have hung. To determine if this is the case, examine the log for a message similar to the following:

InsightVM3/16/09 7:22 PM: Scan [] stopped: "maylor" <>

See Viewing the scan log.

Long or hanging reports

Occasionally, report generation will take an unusually long time, or appear to have completely stopped. You can find reporting errors in the Security Console logs.

Reporting memory issues

Report generation can be slow, or can fail, due to memory issues. See Out-of-memory issues.

Stale scan data

Database speed affects reporting speed. Over time, data from old scans will accumulate in the database. This causes the database to slow down.

If you find that reporting has become slow, look in the Security Console logs for reporting tasks whose durations are inconsistent with other reporting tasks, as in the following example:

nsc.log.0:Reportmanage1/5/09 3:00 AM: Report task serviceVulnStatistics finished in 2 hours 1 minute 23 seconds

You can often increase report generation speed by cleaning up the database. Regular database maintenance removes leftover scan data and host information. See Viewing the scan log and Database backup/restore and data retention.
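To compare report task durations across many log lines, it helps to convert the "finished in" text to a number. The following Python sketch does this for lines shaped like the example above. The function name finished_in_seconds is hypothetical, and the parser only understands the hour, minute, and second units that appear in this document.

```python
import re

UNIT_SECONDS = {"hour": 3600, "minute": 60, "second": 1}
DURATION = re.compile(r"(\d+) (hour|minute|second)s?")

def finished_in_seconds(line):
    """Return the 'finished in ...' duration of a log line in seconds, or None."""
    _, marker, tail = line.partition(" finished in ")
    if not marker:
        return None
    return sum(int(n) * UNIT_SECONDS[unit] for n, unit in DURATION.findall(tail))

line = ("nsc.log.0:Reportmanage1/5/09 3:00 AM: Report task "
        "serviceVulnStatistics finished in 2 hours 1 minute 23 seconds")
print(finished_in_seconds(line))
```

Sorting the results for all report tasks makes the outliers easy to spot.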

Out-of-memory issues

Scanning and reporting are memory-intensive tasks, so errors related to these activities may often be memory issues. You can control memory use by changing settings. Some memory issues are related to how system resources are controlled.

java.lang.OutOfMemoryError

If the application has crashed, you can verify that the crash was due to lack of memory by checking the log files for the following message: java.lang.OutOfMemoryError: Java heap space

If you see this message, contact Technical Support. Do not restart the application unless directed to do so.

Fixing memory problems

Since scanning is memory-intensive and occurs frequently, it is important to control how much memory scans use so that memory issues do not, in turn, affect scan performance. There are a number of strategies for ensuring that memory limits do not affect scans.

Reduce scan complexity

As the number of target hosts increases, so does the amount of memory needed to store scan information. If the hosts being scanned have an excessive number of vulnerabilities, scans could hang due to memory shortages.

To reduce the complexity of a given scan, try a couple of approaches:

  • Reduce the number of target hosts by excluding IP addresses in your site configuration.
  • Reduce the number of target vulnerabilities by excluding lower-priority checks from your scan template.

After patching any vulnerabilities uncovered by one scan, add the excluded IP addresses or vulnerabilities to the site configuration, and run the scan again.

For more information, see Distributed Scan Engines and Working with scan templates and tuning scan performance.

Reduce Scan Count

Running several simultaneous scans can cause the Security Console to run out of memory. Reduce the number of simultaneous scans to conserve memory.

Upgrade Hosts

If scans are consistently running out of memory, consider adding more memory to the servers. To add memory, it might be necessary to upgrade the server operating system as well. On a 64-bit operating system, the application can address more memory than when it runs on a 32-bit operating system. However, it requires 8 GB of memory to run on a 64-bit operating system.

See the following chapters for more detailed information on making scans more memory-friendly:

Update failures

Occasionally, system updates will be unsuccessful. You can find out why by examining the system logs.

Corrupt update table

The application keeps track of previously applied updates in an update table. If the update table becomes corrupt, the application will not know which updates need to be downloaded and applied.

If the application cannot install updates due to a corrupt update table, the Security Console log will contain messages similar to the following:

AutoUpdateJo3/12/09 5:17 AM: NSC update failed: com.rapid7.updater.UpdateException: java.io.EOFException
    at com.rapid7.updater.UpdatePackageProcessor.getUpdateTable(Unknown Source)
    at com.rapid7.updater.UpdatePackageProcessor.getUpdates(Unknown Source)
    at com.rapid7.updater.UpdatePackageProcessor.getUpdates(Unknown Source)
    at com.rapid7.nexpose.nsc.U.execute(Unknown Source)
    at com.rapid7.scheduler.Scheduler$_A.run(Unknown Source)

If this occurs, contact Technical Support. See Viewing the scan log.

Interrupted update

By default, the application automatically downloads and installs updates. The application may download an update, but its installation attempt may be unsuccessful.

You can find out if this happened by looking at the scan log.

Check for update time stamps that demonstrate long periods of inactivity.

AU-BE37EE72A11/3/08 5:56 PM: updating file: nsc/htroot/help/html/757.htm
NSC 11/3/08 9:57 PM: Logging initialized (system time zone is SystemV/PST8PDT)
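Spotting such gaps by eye is tedious in a long log, so a small script can help. The Python sketch below extracts the timestamps from lines shaped like the sample above and reports consecutive entries that are far apart. It is a heuristic under stated assumptions: the thread name runs directly into the date in these lines, so the pattern restricts the month to 1 through 12 to avoid absorbing a trailing digit of the name, and that can still be ambiguous in some cases. The one-hour threshold and the function name inactivity_gaps are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches "M/D/YY H:MM AM|PM:" as it appears in the sample lines above.
# The month is restricted to 1-12 so that a digit belonging to the
# preceding thread name is less likely to be read as part of the date.
STAMP = re.compile(r"((?:1[0-2]|[1-9])/\d{1,2}/\d{2} \d{1,2}:\d{2} [AP]M):")

def inactivity_gaps(lines, threshold=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs more than `threshold` apart."""
    stamps = []
    for line in lines:
        found = STAMP.search(line)
        if found:
            stamps.append(datetime.strptime(found.group(1), "%m/%d/%y %I:%M %p"))
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]

sample = [
    "AU-BE37EE72A11/3/08 5:56 PM: updating file: nsc/htroot/help/html/757.htm",
    "NSC 11/3/08 9:57 PM: Logging initialized (system time zone is SystemV/PST8PDT)",
]
for before, after in inactivity_gaps(sample):
    print(after - before)
```

For the two sample lines, the script reports a gap of four hours and one minute between the last file update and the next logging initialization.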

You can use the update now command to re-attempt the update manually:

  1. On the Administration page, click Troubleshooting > Run Commands. The Command Console page appears.
  2. Enter the command update now in the text box and click Execute.

The Security Console displays a message to indicate whether the update attempt was successful. See Viewing the scan log.

Corrupt File

If the application cannot perform an update due to a corrupt file, the Security Console log will contain messages similar to the following:

AU-892F7C6793/7/09 1:19 AM: Applying update id 919518342
AU-892F7C6793/7/09 1:19 AM: error in opening zip file
AutoUpdateJo3/7/09 1:19 AM: NSC update failed: com.rapid7.updater.UpdateException:
java.util.zip.ZipException: error in opening zip file
    at com.rapid7.updater.UpdatePackageProcessor.B(Unknown Source)
    at com.rapid7.updater.UpdatePackageProcessor.getUpdates(Unknown Source)
    at com.rapid7.updater.UpdatePackageProcessor.getUpdates(Unknown Source)
    at com.rapid7.nexpose.nsc.U.execute(Unknown Source)
    at com.rapid7.scheduler.Scheduler$_A.run(Unknown Source)

If the update fails due to a corrupt file, it means that the update file was successfully downloaded, but was invalid. If this occurs, contact Technical Support. See Viewing the scan log.

Interrupted connection to the update server

If a connection between the Security Console and the update server cannot be made, the failure appears in the logs with a message similar to the following:

AU-A7F0FF3623/10/09 4:53 PM: downloading update: 919518342
AutoUpdateJo3/10/09 4:54 PM: NSC update failed: java.net.SocketTimeoutException

The java.net.SocketTimeoutException is a sign that a connection cannot be made to the update server. If the connection has been interrupted, other updates prior to the failure will have been successful.

You can use the update now command to re-attempt the update manually. See Interrupted update and Viewing the scan log.
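The update failures described in this section can be told apart by the Java exception named in the "NSC update failed" message. The sketch below maps each exception to the cause given in the corresponding subsection. The function name classify_update_failure is hypothetical, and the mapping covers only the three exceptions that appear in this document.

```python
# Exception names and causes as described in the Update failures subsections.
UPDATE_FAILURE_CAUSES = {
    "java.io.EOFException": "corrupt update table",
    "java.util.zip.ZipException": "corrupt update file",
    "java.net.SocketTimeoutException": "interrupted connection to the update server",
}

def classify_update_failure(log_text):
    """Return the likely cause of a failed update, or None if unrecognized."""
    if "NSC update failed" not in log_text:
        return None
    for exception_name, cause in UPDATE_FAILURE_CAUSES.items():
        if exception_name in log_text:
            return cause
    return None

sample = "AutoUpdateJo3/10/09 4:54 PM: NSC update failed: java.net.SocketTimeoutException"
print(classify_update_failure(sample))
```

A result of None means that the text either contains no failed update or names an exception not covered here, in which case the log should be read in full.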