UCF: Error "FailedScheduling" for vault-1 and kafka-cluster pods in DX NetOps all-in-one deployment

Article ID: 433621

Products

Network Observability

Issue/Introduction

During an all-in-one Kubernetes cluster deployment, the deployment fails to complete.

ERROR MESSAGE: "Warning FailedScheduling default-scheduler 0/2 nodes are available: 2 node(s) didn't match pod anti-affinity rules."

SYMPTOMS:

The vault-1 pod remains in the Pending state because of an anti-affinity rule.
The kafka-cluster application deployment fails.
The Kafka cluster cannot determine a quorum leader.

CONTEXT: This occurs when running an all-in-one installation of the product on a single node.

Environment

OS: RHEL 9.7
Architecture: All-in-one Kubernetes cluster

Cause

By default, the enterprise Vault and Kafka Helm charts are configured for high availability (HA). They start multiple replicas and enforce a strict pod anti-affinity rule, which tells the Kubernetes scheduler never to place replica pods on the same node. In an all-in-one build, the scheduler cannot find a second node for the additional replicas, so those pods remain in the Pending state.
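For readers unfamiliar with how such a rule is expressed, the following is a minimal sketch of a generic "hard" pod anti-affinity stanza in a pod specification. It is illustrative only and is not taken from the Vault or Kafka charts; the label `app: example-ha-app` is a hypothetical placeholder.

```yaml
# Illustrative sketch only: a generic strict anti-affinity rule.
# The label below is hypothetical, not the one used by the product charts.
affinity:
  podAntiAffinity:
    # "required..." makes this a hard constraint: the scheduler will not
    # place the pod at all if the rule cannot be satisfied.
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: example-ha-app
        # One pod with this label per node (nodes are distinguished by hostname).
        topologyKey: kubernetes.io/hostname
```

With a rule of this shape, a second replica carrying the same label cannot be scheduled onto a node that already hosts the first one, which is consistent with the FailedScheduling warning shown above.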

Resolution

Step 1. MODIFY CONFIGURATION FOR NON-HA DEPLOYMENT

Path: values.yaml

Update the platform configuration to set high availability to false:

For 25.4.4:

platform:
  highAvailability: false

For 25.4.5 and above:

global:
  highAvailability: false

Step 2. CLEAN UP EXISTING KAFKA RESOURCES

Delete the existing Kafka pods, controllers, Persistent Volume Claims, and Persistent Volumes.

EXPECTED: The system state is cleared, allowing the installer to recreate the cluster properly.

Step 3. RE-RUN THE DEPLOYMENT

Execute the deployment command.

EXPECTED: The pods deploy successfully without anti-affinity rule conflicts.
