This article describes five alarms that the NSX Malware Prevention feature can raise.
Product Versions: NSX 4.0.1.1
The following is observed in the NSX UI:
"Service Malware Prevention is not running on <transport_node_name_or_ip>."
NOTE: The "Service Status Down" alarm covers all Malware Prevention services (NSX Security Hub, NSX RAPID) running on the Malware Prevention Service SVM and on the NSX Edge node.
The Service Status Down alarm is generated when any of the following is true:
To identify the service and the corresponding transport node that lost connectivity, do the following.
[Note: Also refer to the KB article "NSX Malware Prevention Service VM Fails To Register With NSX" for possible resolutions.]
The Service Status Down alarm is generated when any of the following is true:
To identify the service and the corresponding service instance that lost connectivity with the host module, do the following.
All Malware Prevention Service functionality on the NSX Edge or the service virtual machine (SVM) is provided by the NSX Security Hub and NSX RAPID services. When a service is down, malware prevention is lost for network files extracted on the NSX Edge, or for the workload VMs on the impacted ESXi host, depending on where the service loss is observed.
Service failures may be temporary, and the services may recover on their own. In that case the alarm moves to the "Resolved" state.
If the alarm does not resolve within a few minutes, migrate critical workload VMs to another host in the same cluster where the security service is running and healthy. This ensures continued protection for important assets.
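The wait-then-act guidance above can be sketched as a small check. This is an illustrative sketch only: the article says "minutes" without giving a number, so the five-minute grace period, the function name `needs_follow_up`, and the way the alarm state and raise time are obtained are all assumptions, not part of NSX.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the article only says "minutes"; five minutes is a placeholder.
GRACE_PERIOD = timedelta(minutes=5)


def needs_follow_up(alarm_state: str, raised_at: datetime, now: datetime,
                    grace: timedelta = GRACE_PERIOD) -> bool:
    """Return True when an alarm has stayed unresolved longer than the grace period.

    alarm_state is the state shown on the Alarms dashboard (for example
    "Open" or "Resolved"); raised_at and now must both be timezone-aware.
    """
    if alarm_state.strip().lower() == "resolved":
        # The service recovered on its own; nothing further to do.
        return False
    return now - raised_at > grace


if __name__ == "__main__":
    raised = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    later = raised + timedelta(minutes=10)
    if needs_follow_up("Open", raised, later):
        print("Alarm still open: consider migrating critical workload VMs.")
```

The same check applies to the other alarms in this article; only the follow-up action differs.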
The following is observed in the NSX UI:
"Service Malware Prevention is degraded on <transport_node_name_or_ip>. Unable to communicate with file extraction functionality. All file extraction abilities on the <transport_node_name_or_ip> are paused."
The File Extraction Service Unreachable alarm is generated when any of the following is true:
To identify the service and the corresponding transport node that lost connectivity, do the following.
The File Extraction Service Unreachable alarm is generated when any of the following is true:
To identify the service and the corresponding service instance that lost connectivity with the host module, do the following.
When the file extraction service is unreachable, malware prevention is lost for network files extracted on the NSX Edge, or for the workload VMs on the impacted ESXi host, depending on where the service loss is observed.
The condition may be temporary, and the file extraction service may recover on its own. In that case the alarm moves to the "Resolved" state.
If the alarm does not resolve within a few minutes, migrate critical workload VMs to another host in the same cluster where the file extraction service(s) are reachable. This ensures continued protection for important assets.
The following is observed in the NSX UI:
"Service Malware Prevention functionality is degraded on NSX Application Platform. It is unable to communicate with Malware Prevention database."
The Database Unreachable alarm is generated when any of the following is true:
postgresql-ha-* are not running, or the corresponding service(s) are not responding to the health probe.
The issue is specific to the NSX Application Platform.
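As a rough illustration of the first condition, the sketch below filters a list of (name, status) pairs for entries that match `postgresql-ha-*` and are not running. It assumes the names and statuses have already been collected from the NSX Application Platform by whatever means is available. The function name, the sample entries, and the literal status string `"Running"` are hypothetical and do not come from NSX.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Tuple


def database_components_not_running(
    components: Iterable[Tuple[str, str]],
    pattern: str = "postgresql-ha-*",
    healthy_status: str = "Running",  # assumption: status label for a healthy component
) -> List[str]:
    """Return names matching the pattern whose status is not the healthy one."""
    return [
        name
        for name, status in components
        if fnmatchcase(name, pattern) and status != healthy_status
    ]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        ("postgresql-ha-example-0", "Running"),
        ("postgresql-ha-example-1", "Pending"),
        ("some-other-component", "Failed"),
    ]
    print(database_components_not_running(sample))
```

An empty result does not rule out the second condition: the components may be running while the corresponding service(s) still fail the health probe.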
When the database is unreachable, the following malware prevention functionality is lost.
The condition may be temporary, and the database connection may recover on its own. In that case the alarm moves to the "Resolved" state.
If the alarm does not resolve within a few minutes, collect the NSX Application Platform support bundle and raise a support ticket with the VMware support team.
The following is observed in the NSX UI:
"Service Malware Prevention is degraded on NSX Application Platform. It is unable to communicate with analyst_api service. Inspected file verdicts may not be up to date."
The Analyst API Service Unreachable alarm is generated when any of the following is true:
The issue is specific to the NSX Application Platform.
When the Analyst API service is unreachable, the following malware prevention functionality is lost.
In the short term, customers who wish to mitigate false positives can use the allow-listing capability of the Malware Prevention Service. The condition may be temporary, and the Analyst API service may recover on its own. In that case the alarm moves to the "Resolved" state.
If the alarm does not resolve within a few minutes, collect the NSX Application Platform support bundle and raise a support ticket with the VMware support team.
Symptoms
The following is observed in the NSX UI:
The Alarms dashboard shows a "NTICS Reputation Service Unreachable" alarm for the "Malware Prevention" feature. The alarm state is Open. The alarm description states:
"Service Malware Prevention is degraded on the NSX Application Platform. It is unable to communicate with external NTICS service. Inspected file verdicts may not be up to date."
Cause
On NSX Application Platform
The NTICS Reputation Service Unreachable alarm is generated when any of the following is true:
On the NSX Application Platform, the Malware Prevention Reputation Service cannot access the cloud-hosted NTICS services.
The issue is specific to the NSX Application Platform.
Impact / Risks
When the NTICS service is unreachable, the following malware prevention functionality is lost.
There is an increased possibility of false positives (for example, files being marked as Malicious even when they are published by a trusted source).
Resolution
In the short term, customers who wish to mitigate false positives can use the allow-listing capability of the Malware Prevention Service. Ensure that your NSX proxy settings are configured correctly and that outbound network access to the external NTICS service is allowed. The condition may be temporary, and the connection to the NTICS service may recover on its own. In that case the alarm moves to the "Resolved" state.
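The five alarms in this article can be told apart by a distinctive phrase in their descriptions, and each has its own first resolution step. The sketch below collects that mapping in one lookup. It is a convenience sketch only: the phrases and actions are taken from the alarm descriptions and resolution guidance in this article, while the name `triage` and the substring-matching approach are assumptions and not an NSX interface.

```python
from typing import List, Optional, Tuple

MIGRATE = "Migrate critical workload VMs to a healthy host in the same cluster."
BUNDLE = "Collect the NSX Application Platform support bundle and raise a support ticket."
PROXY = "Check NSX proxy settings and outbound access to the external NTICS service."

# (lower-case phrase from the alarm description, alarm name, first resolution step)
ALARM_RULES: List[Tuple[str, str, str]] = [
    ("is not running on", "Service Status Down", MIGRATE),
    ("file extraction functionality", "File Extraction Service Unreachable", MIGRATE),
    ("malware prevention database", "Database Unreachable", BUNDLE),
    ("analyst_api service", "Analyst API Service Unreachable", BUNDLE),
    ("external ntics service", "NTICS Reputation Service Unreachable", PROXY),
]


def triage(description: str) -> Optional[Tuple[str, str]]:
    """Map an alarm description to (alarm name, first resolution step).

    Returns None when the description matches none of the five alarms.
    """
    text = description.lower()
    for phrase, name, action in ALARM_RULES:
        if phrase in text:
            return name, action
    return None


if __name__ == "__main__":
    message = ("Service Malware Prevention is degraded on the NSX Application "
               "Platform. It is unable to communicate with external NTICS service. "
               "Inspected file verdicts may not be up to date.")
    print(triage(message))
```

In every case the step applies only if the alarm has not moved to the "Resolved" state on its own after a few minutes.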