Activation of Security Intelligence fails with Precheck Error "Failed to invoke API on NAPP (NSX Application Platform). Please check logs on pre-check job pod for more information"

Article ID: 375300

Updated On:

Products

VMware vDefend Firewall with Advanced Threat Prevention, VMware vDefend Firewall

Issue/Introduction

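On the Security Intelligence activation screen, an error similar to the following is displayed:

"Failed to invoke API on NAPP (NSX Application Platform). Please check logs on pre-check job pod for more information"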

The Intelligence precheck job also fails at the Cluster Resource Capacity Check.

Note: The container that runs this check is named validate-capacity. Its logs can be retrieved with:

napp-k logs job/nsx-intelligence-precheck-jobs -c validate-capacity -n nsxi-platform


Found 4 pods, using pod/nsx-intelligence-precheck-jobs-w8qmk
/usr/lib/python3/dist-packages/urllib3/connectionpool.py:1020: InsecureRequestWarning: Unverified HTTPS request is being made to host 'cluster-api'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  warnings.warn(
/usr/lib/python3/dist-packages/urllib3/connectionpool.py:1020: InsecureRequestWarning: Unverified HTTPS request is being made to host 'monitor'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  warnings.warn(
/usr/lib/python3/dist-packages/urllib3/connectionpool.py:1020: InsecureRequestWarning: Unverified HTTPS request is being made to host 'cluster-api'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  warnings.warn(
/usr/lib/python3/dist-packages/urllib3/connectionpool.py:1020: InsecureRequestWarning: Unverified HTTPS request is being made to host 'cluster-api'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  warnings.warn(
calling POST https://cluster-api:443/report/precheck {'id': 'capacity', 'name': 'feature.precheck.capacityName', 'desc': 'feature.precheck.capacityDesc', 'feature': 'intelligence', 'status': 'INPROGRESS', 'reason': ''}
REST OK
Calling Monitor service to get platform information
calling GET https://monitor:443/api/v1/platform/monitor/platform/status None
REST OK
Calling cluster-api to check if capacity exists to deploy feature
calling POST https://cluster-api:443/features/intelligence/capacity/validate {'remaining_cpu_percent': 25, 'remaining_memory_percent': 25}
calling POST https://cluster-api:443/report/precheck {'id': 'capacity', 'name': 'feature.precheck.capacityName', 'desc': 'feature.precheck.capacityDesc', 'feature': 'intelligence', 'status': 'FAILED', 'reason': 'feature.precheck.nappApiInvocationFailedReason'}
REST OK
REST FAILED: Internal Server Error
Traceback (most recent call last):
  File "/opt/vmware/nsxi/hombre/__main__.py", line 129, in <module>
    main()
  File "/opt/vmware/nsxi/hombre/__main__.py", line 113, in main
    capacity_plugin.check()
  File "/opt/vmware/nsxi/hombre/precheck/plugin_capacity.py", line 54, in check
    json_response = requests_wrapper.incluster_request('POST', url, data=req_data)
  File "/opt/vmware/nsxi/hombre/utils/requests_wrapper.py", line 56, in incluster_request
    return _invoke_requests(method=method, url=url, data=data,
  File "/opt/vmware/nsxi/hombre/utils/requests_wrapper.py", line 42, in _invoke_requests
    raise Exception("REST FAILED: {reason}"
Exception: REST FAILED: Internal Server Error
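
Most of the output above is urllib3 warning noise; the lines that matter are the REST calls, their results, and the final exception. The following is a minimal shell sketch for trimming the log down to those lines. The function name filter_precheck_log is hypothetical, and the patterns are derived only from the sample output in this article, so other NAPP versions may need different patterns.

```shell
#!/bin/sh
# Hypothetical helper (not part of NAPP): reads precheck job log text on stdin
# and keeps only the REST call lines, their results, and the final exception.
# The patterns are assumptions based on the sample output in this article.
filter_precheck_log() {
    grep -E '^(calling (GET|POST) |REST (OK|FAILED)|Exception: )'
}

# Usage, with the log command from this article:
#   napp-k logs job/nsx-intelligence-precheck-jobs -c validate-capacity \
#       -n nsxi-platform 2>&1 | filter_precheck_log
```

In the sample output, this leaves the POST to /features/intelligence/capacity/validate followed by "REST FAILED: Internal Server Error", which points to cluster-api as the next place to look.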

 

 

The cluster-api pod logs show lines similar to the following:

2024-08-20T11:52:15.576268979Z stdout F {"time":"2024-08-20T11:52:15.576196673Z","level":"ERROR","prefix":"-","file":"service.go","line":"136","message":"failed to fetch licenses from configmap: invalid character 'A' looking for beginning of value"}
2024-08-20T11:52:15.576298914Z stdout F AUDIT: method=POST uri=/features/intelligence/capacity/validate remote_ip=192.168.2.48 host=cluster-api id= latency=456.998264ms status=500 error= user=
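
To confirm that a given environment matches this article, the cluster-api logs can be searched for the license message shown above. The following is a minimal shell sketch; the function name has_license_configmap_error and the file name cluster-api.log are hypothetical, and the sketch assumes the cluster-api pod logs have already been saved to a file.

```shell
#!/bin/sh
# Hypothetical helper: reads cluster-api log text on stdin and reports whether
# the license configmap error described in this article is present.
# Returns 0 when the message is found, 1 otherwise.
has_license_configmap_error() {
    if grep -q 'failed to fetch licenses from configmap'; then
        echo "license configmap error found"
    else
        echo "license configmap error not found"
        return 1
    fi
}

# Usage, assuming the cluster-api pod logs were saved to cluster-api.log:
#   has_license_configmap_error < cluster-api.log
```

If the message is not found, the status=500 response on /features/intelligence/capacity/validate may have a different cause than the one described here.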

 

 

Environment

This issue can affect any NAPP version.

Cause

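The license configmap does not contain any licenses; that is, common-agent on NSX has not yet streamed the license information. As a result, cluster-api fails to fetch the licenses from the configmap and returns status 500 to the capacity validation request, which causes the precheck to fail.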

 

Resolution

The update can be forced by re-applying a license on the NSX side. This triggers common-agent to stream the license information, which in turn updates the configmap.

 

1. Run the following command on the NSX Manager that is the leader for COMMON_AGENT_SERVICE:

keytool -list -keystore /home/secureall/secureall/.store/.napp_kafka_keystore -storepass $(cat /home/secureall/secureall/.store/.napp_kafka_keystore_pw)

If the keystore is listed successfully with this password, continue with step 2.
Otherwise, continue with step 3.

2. If step 1 succeeded, restart common-agent, that is, restart proton on the NSX Manager that is the leader for COMMON_AGENT_SERVICE:

   2.1 su admin -c "get cluster status verbose" | grep COMMON_AGENT_SERVICE -> gives the UUID of the NSX Manager where the agent is running
   2.2 su admin -c "get cluster status" | grep <uuid-of-manager-from-2.1> -> gives the IP address of the Manager where common-agent is running
   2.3 ssh root@<nsx-ip-from-2.2>, then on that Manager run: systemctl restart proton -> restarts proton/common-agent

3. If step 1 failed, regenerate the common-agent Kafka client certificate:

    3.1 Go to System > Certificates.
    3.2 Filter certificates by 'Issued To: k8s-msg-client'.
    3.3 Click the three dots to the left of 'message bus client for NSX Application platform' and select 'Replace Certificate'.
    3.4 Choose 'Generate Self Signed Certificate' and click 'Save'.

 

NOTE: The "Replace Certificate" option may not be available in some older releases of NSX Manager.
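
The choice between step 2 and step 3 depends only on whether the keytool command in step 1 succeeds. The following is a minimal shell sketch of that decision. The function name next_step_for_keystore is hypothetical; the paths are the ones given in step 1. The sketch assumes that keytool exits with a non-zero status when the keystore cannot be listed, and it treats a missing or unreadable keystore or password file the same as a failed listing.

```shell
#!/bin/sh
# Paths taken from step 1 of this article.
KEYSTORE=/home/secureall/secureall/.store/.napp_kafka_keystore
KEYSTORE_PW=/home/secureall/secureall/.store/.napp_kafka_keystore_pw

# Hypothetical helper: prints which step of this article to follow next.
# $1 = keystore path, $2 = file holding the keystore password.
# Assumption: keytool returns a non-zero exit status when listing fails.
next_step_for_keystore() {
    if [ -r "$1" ] && [ -r "$2" ] \
        && keytool -list -keystore "$1" -storepass "$(cat "$2")" >/dev/null 2>&1; then
        echo "keystore listed: go to step 2 (restart proton)"
    else
        echo "keystore not listed: go to step 3 (regenerate certificate)"
    fi
}

# Usage on the NSX Manager that is the leader for COMMON_AGENT_SERVICE:
#   next_step_for_keystore "$KEYSTORE" "$KEYSTORE_PW"
```

The sketch only reports which step applies; it does not restart proton or replace the certificate.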