Diagnostics

Use the Diagnostics section of Cloud Security (InsightCloudSec) to monitor system health, API activity, logs, and background jobs, or to help troubleshoot issues. The Diagnostics page is under Settings > Diagnostics and contains four tabs:

  • System Health: Provides details on overall system and worker status as well as diagnostic reports, job backlog settings, and job information. See Explore system health for more information. To configure a job backlog export and understand the current load on your environment, see General.
  • API Activity: Lists API activity for your Cloud Security (InsightCloudSec) environment, including user name, path (endpoint), request status, and method.
  • Logs: Displays a running list of activity occurring in your Cloud Security (InsightCloudSec) environment.
  • Background Jobs: Displays a running list of background jobs available in your Cloud Security (InsightCloudSec) environment. Background jobs include Insight checks, integration agent processors, database tasks, and more.

Explore system health

System health is divided into two sections: General and Job Information.

General

From the General section, you can:

Monitor health checks

The Health Check table contains a list of the most important aspects of the Cloud Security (InsightCloudSec) system.

⚠️

Job Backlog, Daily Queue, and Daily Job Duration Action

Consult your Cybersecurity Advisor or Support before performing the following actions:

  • For the Job Backlog checks, you can click Clear queue (trash can) to reset and clear the selected priority job backlog queue.
  • For Daily Queue and Daily Job Duration checks, you can click Reset stats to reset the statistics for the check.

Download system diagnostic reports

ℹ️

System diagnostics availability

System Diagnostic content is only visible for self-hosted users. For SaaS users, the Cloud Security (InsightCloudSec) production services team manages diagnostics and has access to extensive data in support of hosted installations.

If you have questions or issues with your deployment, contact your CSA or reach out to Support through the Customer Support Portal.

To download a diagnostic report:

  1. Select a report type from the drop-down menu.
  2. Click Download Report. The file will be prepared and downloaded.

Configure the job backlog export

The job backlog is a good indicator of the current load on your environment. Exported job backlog metrics can inform scaling decisions and provide historical data for up to 15 months. Cloud Security (InsightCloudSec) supports exporting job backlog metrics to Amazon Web Services (AWS) and Google Cloud Platform (GCP).


For AWS:

Before getting started, you need Domain Admin permissions in Cloud Security (InsightCloudSec). You also need to ensure the target AWS account has already been added to Cloud Security (InsightCloudSec) and the associated harvesting role has the "cloudwatch:PutMetricData" permission.
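As an illustration of the permission requirement above, the following Python sketch builds an IAM policy document that grants "cloudwatch:PutMetricData". It is a minimal example, not the policy the product ships with: the function name and the "ICS/JobBacklog" namespace are hypothetical, and the optional namespace restriction relies on the "cloudwatch:namespace" condition key. How you attach the statement to your harvesting role depends on your own AWS setup.

```python
import json


def put_metric_data_policy(namespace=None):
    """Build an IAM policy document that allows cloudwatch:PutMetricData.

    If `namespace` is given, the statement is limited to that CloudWatch
    namespace through the `cloudwatch:namespace` condition key.
    """
    statement = {
        "Effect": "Allow",
        "Action": "cloudwatch:PutMetricData",
        # PutMetricData does not support resource-level permissions,
        # so the resource must be a wildcard.
        "Resource": "*",
    }
    if namespace:
        statement["Condition"] = {
            "StringEquals": {"cloudwatch:namespace": namespace}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


# "ICS/JobBacklog" is a placeholder namespace, not a product default.
print(json.dumps(put_metric_data_policy("ICS/JobBacklog"), indent=2))
```

If you restrict the statement to a namespace, that value would need to match the Target Namespace you enter in Job Backlog Settings.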

To turn on job backlog exporting to AWS CloudWatch in Cloud Security (InsightCloudSec):

  1. Log in to Cloud Security (InsightCloudSec).
  2. Go to Settings > Diagnostics > System Health > General.
  3. Under Job Backlog Settings, select an AWS Target Account.
  4. Enter a Target Region and Target Namespace.
  5. Optionally, select Use Instance Authentication if you onboarded your account using the Instance Assume Role method. In most cases, you do not need to use instance authentication.
  6. Click Save.

In AWS CloudWatch, this information appears in your Custom Namespaces under Metrics.
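To confirm from a script that metrics are arriving, you could list the metric names CloudWatch reports for the namespace you configured. The sketch below is one possible approach, not part of the product: the helper name is hypothetical, and it accepts any client object with a boto3-style list_metrics method so it can be exercised without AWS credentials.

```python
def list_namespace_metrics(client, namespace):
    """Return the sorted metric names CloudWatch reports for a namespace.

    `client` is any object with a boto3-style `list_metrics` method, for
    example the object returned by boto3.client("cloudwatch").
    """
    names = set()
    kwargs = {"Namespace": namespace}
    while True:
        page = client.list_metrics(**kwargs)
        for metric in page.get("Metrics", []):
            names.add(metric["MetricName"])
        # Follow the pagination token until CloudWatch stops returning one.
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return sorted(names)
```

With boto3 installed and credentials configured, a call might look like list_namespace_metrics(boto3.client("cloudwatch", region_name="us-east-1"), "ICS/JobBacklog"), where the region and namespace are placeholders for the Target Region and Target Namespace you saved. Newly published metrics can take several minutes to appear in the listing.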


For GCP:

Before getting started, you need Domain Admin permissions in Cloud Security (InsightCloudSec). You also need to ensure the following:

  • The target GCP account has already been added to Cloud Security (InsightCloudSec).
  • The target GCP account has the Stackdriver API turned on.
  • The associated harvesting role has the "monitoring.metricDescriptors.create" and "monitoring.timeSeries.create" permissions.
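One way to check the permission prerequisite before saving the setting is to inspect the role's JSON description (for example, the output of gcloud iam roles describe with --format=json, which lists permissions under includedPermissions). The sketch below assumes that input shape; the function name, role name, and project are hypothetical.

```python
import json

# The two permissions named in the prerequisites above.
REQUIRED_PERMISSIONS = (
    "monitoring.metricDescriptors.create",
    "monitoring.timeSeries.create",
)


def missing_permissions(role_json):
    """Return required permissions absent from a role description.

    `role_json` is the JSON text of an IAM role, which lists its
    permissions under the `includedPermissions` key.
    """
    role = json.loads(role_json)
    granted = set(role.get("includedPermissions", []))
    return [p for p in REQUIRED_PERMISSIONS if p not in granted]


# Example input: a trimmed role description with one permission missing.
sample = json.dumps({
    "name": "projects/example-project/roles/exampleHarvestRole",
    "includedPermissions": ["monitoring.timeSeries.create"],
})
print(missing_permissions(sample))  # ['monitoring.metricDescriptors.create']
```

An empty list means the role description contains both permissions.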

To turn on job backlog exporting to GCP Stackdriver in Cloud Security (InsightCloudSec):

  1. Log in to Cloud Security (InsightCloudSec).
  2. Go to Settings > Diagnostics > System Health > General.
  3. Under Job Backlog Settings, select a GCP Target Account.
  4. Click Save.

In GCP Stackdriver, this information will appear as the following metrics:

  • job_backlog_standard
  • job_backlog_high_priority
  • job_backlog_immediate
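If you want to chart or query these metrics programmatically, you need a Cloud Monitoring filter for each one. The sketch below builds those filter strings. It assumes the metrics are written as user-defined metrics under the custom.googleapis.com/ prefix, which this page does not confirm; verify the full metric type in Metrics Explorer and pass a different prefix if yours differs. The function and constant names are hypothetical.

```python
# Metric names as listed above.
BACKLOG_METRICS = (
    "job_backlog_standard",
    "job_backlog_high_priority",
    "job_backlog_immediate",
)

# Assumption: the metrics are user-defined metrics, which Cloud Monitoring
# stores under the custom.googleapis.com/ prefix.
ASSUMED_PREFIX = "custom.googleapis.com/"


def metric_filters(prefix=ASSUMED_PREFIX):
    """Map each backlog metric name to a Cloud Monitoring filter string."""
    return {name: f'metric.type = "{prefix}{name}"' for name in BACKLOG_METRICS}


for name, flt in metric_filters().items():
    print(f"{name}: {flt}")
```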

Job Information

From the Job Information section, you can:

Monitor the job scheduler

The Job Scheduler Information pane can be used to refresh the active job scheduler. Cloud Security (InsightCloudSec) is architected for a single scheduler, but a common deployment practice is to run a secondary scheduler as a High Availability (HA) failover option. This pane displays which scheduler is currently the master, the host for each scheduler, the time each job scheduler last sent a heartbeat to Redis, and the status of any plugins that have been applied to the schedulers. You can also flush the Redis cache to address issues that cannot otherwise be resolved or managed. If you have questions about this feature, reach out to Customer Support. See Product Architecture to learn more about the role of the scheduler within the overall Cloud Security (InsightCloudSec) workflow.

Explore the slowest jobs in your system

The Slowest Jobs pane displays details for the most recent slowest jobs in your system, including:

  • Most Recent - Name of the most recent job.
  • Cloud Type - Icon to specify the applicable cloud type.
  • Longest recorded run (seconds) - Length of the longest recorded run for the applicable job.

These entries typically reflect very large jobs or global harvesting of items like Storage Containers, WAF, or IAM.

Monitor worker node health

The Worker Node Status table displays general worker node status, including:

  • Host - The unique host identifier.
  • IP Address - The IP address for the worker node.
  • Status - The status for the individual worker node.