When re-installing Intelligence or NDR, feature activation gets stuck at 7% and eventually fails

Article ID: 417912

Updated On:

Products

VMware vDefend Firewall

VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

If Intelligence or NDR is installed, uninstalled, and then re-installed, feature activation may fail.

This issue primarily affects deployments where Intelligence or NDR has been installed and has been ingesting flow data.

Environment

SSP 5.1

Cause

The redis-cache-warming job is run when security intelligence is deployed or upgraded.

This loads flow data into redis, and in environments with heavy traffic, this job can take too long and cause installation to time out.

In 5.1, this scenario is possible if the customer installs Intelligence or NDR, ingests flows, and then re-deploys either feature (i.e., uninstalls and then reinstalls Intelligence or NDR).

 

To determine whether this is the issue, run the following from SSPI:

# List the install pods and note the name of the failed pod:
k get pods -n nsxi-platform | grep intelligence-install

# Check the failed pod's logs for errors like the ones below:
k logs <failed-install-pod> -n nsxi-platform

<179>1 2025-11-07T09:55:40Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="ERROR" s2comp="install/install.go:89" subcomp="SSP"] Error occurred while installing chart {"chartName":"analytics-common-chart","error":"Timed out waiting for resources to be realized for chart analytics-common-chart"}
<179>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="ERROR" s2comp="deployment/chart.go:142" subcomp="SSP"] Failed to install helm chart {"error":"context deadline exceeded"}
<180>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="WARNING" s2comp="deployment/chart.go:150" subcomp="SSP"] Chart installation failed. {"chart":"analytics-common-chart","error":"context deadline exceeded"}
<179>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="ERROR" s2comp="install/install.go:59" subcomp="SSP"] Error occurred while installing chart. {"ChartName":"analytics-common-chart","error":"Timed out waiting for resources to be realized for chart analytics-common-chart"}
<179>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="ERROR" s2comp="install/install_service.go:22" subcomp="SSP"] Error occurred while installing:  {"error":"Timed out waiting for resources to be realized for chart analytics-common-chart","feature ":"intelligence"}
<179>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="ERROR" s2comp="cmd/installFeature.go:46" subcomp="rudder"] Failed to install feature {"Feature":"intelligence","error":"Timed out waiting for resources to be realized for chart analytics-common-chart"}
<182>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="INFO" s2comp="feature/deployer.go:91" subcomp="rudder"] Feature installation failed {"Feature":"intelligence"}
<179>1 2025-11-07T09:55:45Z intelligence-install-djlgd-9km7q SSP 1 SSP [ssp@4413 comp="SSP" level="ERROR" s2comp="cmd/installFeature.go:46" subcomp="rudder"] Installation failed for feature {"Feature":"intelligence","error":"Feature installation failed for intelligence. Please check logs for more details."}
Error: Feature installation failed for intelligence. Please check logs for more details.
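
The log check above can be narrowed to the timeout signature alone. The following is a minimal sketch, not an SSP tool: timed_out_charts is a hypothetical helper name, and it assumes the error text matches the sample log lines shown above.

```shell
# Hypothetical helper (not part of SSP): reads install-pod logs on stdin and
# prints the name of each chart whose installation timed out, once per chart.
# Assumes the error wording matches the sample log lines in this article.
timed_out_charts() {
  grep -o 'Timed out waiting for resources to be realized for chart [A-Za-z0-9._-]*' \
    | awk '{ print $NF }' \
    | sort -u
}

# Usage from SSPI (the pod name is a placeholder):
#   k logs <failed-install-pod> -n nsxi-platform | timed_out_charts
```

For the sample logs above, this would print analytics-common-chart. Empty output means the timeout signature was not found in the logs.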

Then check the installation status:

helm ls -a -A --kubeconfig=/config/clusterctl/1/workload.kubeconfig
NAME                    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                                           APP VERSION
analytics-common        nsxi-platform   1               2025-11-07 09:56:23.044943868 +0000 UTC failed          analytics-common-chart-5.1.0-0.0-25009250       5.1.0-0.0-25009250
cert-manager            cert-manager    1               2025-11-02 14:53:25.954884049 +0000 UTC deployed        cert-manager-5.1.0-0.0-25009250                 5.1.0-0.0-25009250
metrics                 nsxi-platform   1               2025-11-02 15:12:52.510624504 +0000 UTC deployed        metrics-5.1.0-0.0-25009250                      5.1.0-0.0-25009250
nsxi-platform           nsxi-platform   1               2025-11-02 14:57:37.921086639 +0000 UTC deployed        napp-platform-advanced-5.1.0-0.0-25009250       5.1.0-0.0-25009250
projectcontour          projectcontour  1               2025-11-02 14:56:29.129400618 +0000 UTC deployed        contour-5.1.0-0.0-25009250                      5.1.0-0.0-25009250
ssp-antrea              kube-system     1               2025-11-02 14:43:25.90287663 +0000 UTC  deployed        antrea-2.4.0                                    2.4.0
ssp-metallb             metallb-system  1               2025-11-02 14:51:42.462823392 +0000 UTC deployed        metallb-6.4.19                                  0.15.2
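
To pick the failed release out of a longer listing, the table can be filtered on its STATUS column. This is a sketch under the assumption that the output has the column layout shown above. failed_releases is a hypothetical helper name.

```shell
# Hypothetical helper: reads the table printed by `helm ls` on stdin and prints
# "<release> <namespace>" for every release whose status is "failed".
# The UPDATED column contains spaces, so the status is located by scanning
# the fields after NAME and NAMESPACE instead of using a fixed column number.
failed_releases() {
  awk 'NR > 1 {
    for (i = 3; i <= NF; i++) {
      if ($i == "failed") { print $1, $2; break }
    }
  }'
}

# Usage from SSPI:
#   helm ls -a -A --kubeconfig=/config/clusterctl/1/workload.kubeconfig | failed_releases
```

For the listing above, this would print analytics-common nsxi-platform. Helm's own --failed filter on helm ls is an alternative when the table does not need to be parsed.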

 

To inspect the pods that belong to the failed chart, filter on a label of the form "app.kubernetes.io/instance=$RELEASE_NAME", where $RELEASE_NAME is the release name from the NAME column of the helm output.

In the above output, analytics-common installation has failed, so we can check those pods:

k get pods -l app.kubernetes.io/instance=analytics-common -n nsxi-platform
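
When the chart has many pods, the listing can be reduced to the ones that need attention. The sketch below assumes the default kubectl column order (NAME, READY, STATUS, RESTARTS, AGE). unhealthy_pods is a hypothetical helper name.

```shell
# Hypothetical helper: reads `k get pods` table output on stdin and prints
# "<pod> <status>" for pods whose STATUS is neither Running nor Completed.
# It checks the STATUS column only; a Running pod that is not yet Ready
# (for example READY 0/1) is not reported.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage from SSPI:
#   k get pods -l app.kubernetes.io/instance=analytics-common -n nsxi-platform | unhealthy_pods
```

Empty output means that no pod reports a status other than Running or Completed.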

If no pods are in an errored state, check whether all jobs completed:

k get jobs -n nsxi-platform -l app.kubernetes.io/instance=analytics-common

NAME                                                    STATUS     COMPLETIONS   DURATION   AGE
analytics-common-hooks-site-registration-intelligence   Complete   1/1           21s        6h38m
ids-kafka-provisioning                                  Complete   1/1           54s        6h38m
latestflow-create-kafka-topic                           Complete   1/1           45s        6h38m
pcap-storer-minio-bucket-configuration-pcap             Complete   1/1           40s        6h38m
pcap-storer-s3-provisioning                             Complete   1/1           5s         6h38m
processing-configure-druid                              Complete   1/1           74s        6h38m
processing-create-kafka-topic-job                       Complete   1/1           71s        6h38m
processing-s3-provisioning                              Complete   1/1           11s        6h38m
redis-cache-warming                                     Complete   1/1           72m        6h38m

 

In the above output, all jobs succeeded, but the redis-cache-warming job took 72 minutes to finish. Chart installation times out after 30 to 45 minutes, depending on the chart, so the job outlasted the installation timeout.

If the redis-cache-warming job failed or took more than 30 minutes, contact Broadcom support for further assistance.
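
The duration check can be scripted against the jobs listing. This is a minimal sketch, assuming kubectl prints durations in its usual compact form (for example 21s, 72m, 6h38m). slow_jobs is a hypothetical helper name, and the 30-minute default mirrors the lower bound mentioned above.

```shell
# Hypothetical helper: reads `k get jobs` table output on stdin and prints
# "<job> <duration>" for jobs that ran longer than the given number of
# minutes (default 30). It does not detect failed jobs; check the STATUS
# and COMPLETIONS columns for those.
slow_jobs() {
  awk -v limit="$(( ${1:-30} * 60 ))" '
    # Convert a compact duration such as "6h38m" or "74s" to seconds.
    function to_seconds(d,    total, n, unit) {
      total = 0
      while (match(d, /^[0-9]+[dhms]/)) {
        n = substr(d, 1, RLENGTH - 1) + 0
        unit = substr(d, RLENGTH, 1)
        if (unit == "d") total += n * 86400
        else if (unit == "h") total += n * 3600
        else if (unit == "m") total += n * 60
        else total += n
        d = substr(d, RLENGTH + 1)
      }
      return total
    }
    # Locate the DURATION column from the header row.
    $1 == "NAME" { for (i = 1; i <= NF; i++) if ($i == "DURATION") col = i; next }
    col && to_seconds($col) > limit { print $1, $col }
  '
}

# Usage from SSPI:
#   k get jobs -n nsxi-platform -l app.kubernetes.io/instance=analytics-common | slow_jobs 30
```

For the jobs listing above, this would print redis-cache-warming 72m.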

Resolution

Open a support ticket with Broadcom support to have the resolution applied.

This issue is fixed in the next release of Security Services Platform (SSP).