Symptoms
Multiple pods in the nsxi-platform namespace entered a CrashLoopBackOff state after upgrading to SSP 5.1.
Affected pods included:
kafka
pcap
Several dependent services (e.g., common-agent) were also impacted.
Kafka logs showed repeated graceful shutdown failures and TimeoutException errors:
ERROR kafka-o-metadata-loader-event-handler MetadataLoader – [MetadataLoader id=0] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet.
WARN kafka-o-raft-io-thread KafkaRaftClient – [RaftManager id=0] Graceful shutdown of RaftClient timed out after 5000ms
ERROR kafka-o-metadata-loader-event-handler KafkaEventQueue – [ControllerRegistrationManager id=0] Graceful shutdown of RaftClient failed
ERROR kafka-o-metadata-loader-event-handler KafkaEventQueue – [StandardAuthorizer 0] Failed to complete initial ACL load process.
java.util.concurrent.TimeoutException
WARN kafka-shutdown-hook NetworkClient – Attempting to close NetworkClient that has already been closed.
INFO kafka-shutdown-hook NodeToControllerChannelManagerImpl – Node to controller channel manager shutdown completed.
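To check whether a Kafka pod shows the same signatures, its logs can be filtered for the messages quoted above. The following is a minimal sketch, not an official procedure. The function name kafka_failure_lines is made up for illustration, and the patterns are taken only from the log excerpt above.

```shell
# Hypothetical helper: read Kafka log text on stdin and print only the lines
# that match the failure signatures from the excerpt above.
kafka_failure_lines() {
  grep -E 'TimeoutException|Graceful shutdown of RaftClient (timed out|failed)|Failed to complete initial ACL load'
}

# Possible invocation (assumes kubectl access to the cluster):
#   kubectl logs <kafka pod name> -n nsxi-platform --previous | kafka_failure_lines
# --previous reads the log of the last terminated container instance, which is
# usually the one of interest for a pod in CrashLoopBackOff.
```

An empty result does not prove the pod is healthy; it only means these particular messages were not found in the log that was read.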
Environment
SSP 5.1
Cause
The repeated TimeoutException messages and graceful shutdown failures indicate that Kafka could not complete metadata and ACL initialization within the expected timeframe.
The underlying reason was infrastructure slowness during the upgrade process.
As a result, dependent services (such as common-agent) could not establish a connection to Kafka:
failed to dial: failed to open connection to kafka:9092: dial tcp <ip>:9092: i/o timeout
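The i/o timeout above can be reproduced with a plain TCP connection attempt. The sketch below is illustrative only: check_tcp is a hypothetical helper, it assumes bash and the coreutils timeout command are available, and the name kafka only resolves from inside the cluster, so it would need to run in a pod in the nsxi-platform namespace.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within
# the given number of seconds (default 5); fail if it is refused, times out,
# or the host cannot be resolved.
check_tcp() {
  host="$1"; port="$2"; secs="${3:-5}"
  # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Possible invocation from a pod in the namespace:
#   check_tcp kafka 9092 && echo "kafka reachable" || echo "kafka unreachable"
```

A successful connection only shows that the listener is up. It does not confirm that Kafka has finished metadata and ACL initialization.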
Resolution
1. Identify the affected pods using the following commands (here k is shorthand for kubectl):
k get pod -n nsxi-platform | grep kafka
k get pod -n nsxi-platform | grep pcap
2. Delete the affected Kafka and pcap pods identified in the output above:
k delete pod <kafka pod name> -n nsxi-platform
k delete pod <pcap pod name> -n nsxi-platform
3. After the pods restart, verify that all components return to a healthy state using:
k get pods -n nsxi-platform
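The three steps above can be sketched as a small script. This is a convenience sketch under stated assumptions, not a supported tool: the function names are hypothetical, the script calls kubectl directly because shell aliases such as k are not expanded in non-interactive scripts, and it assumes the default kubectl get pod column layout (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read `kubectl get pod` table output on stdin and print
# the names of kafka/pcap pods whose STATUS column is CrashLoopBackOff.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" && $1 ~ /kafka|pcap/ { print $1 }'
}

# Hypothetical wrapper for steps 1-3: list, delete, then show pod status.
# Defined here but not run automatically.
restart_crashlooping_pods() {
  ns="${1:-nsxi-platform}"
  kubectl get pod -n "$ns" | crashlooping_pods | while read -r pod; do
    kubectl delete pod "$pod" -n "$ns"
  done
  kubectl get pods -n "$ns"
}
```

Calling restart_crashlooping_pods with no argument targets the nsxi-platform namespace. Reviewing the output of crashlooping_pods on its own before deleting anything is the safer order of operations.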
If the issue persists, contact Broadcom Support for further assistance.