In some SSP environments integrated with NSX, applications may fail to reach a SUCCESS state and application recommendation may not start, even though the application is successfully published and visible.
Symptoms
Kafka connectivity failures are logged in nsxapi.log:
INFO kafka-producer-network-thread | producer-1621088 NetworkClient 5239 [Producer clientId=producer-1621088] Node 0 disconnected.
ERROR kafka-producer-network-thread | producer-1621088 NetworkClient 5239 [Producer clientId=producer-1621088] Connection to node 0 (/x.x.x.x:9092) failed authentication due to: Failed to process post-handshake messages
INFO kafka-producer-network-thread | producer-1621088 NetworkClient 5239 [Producer clientId=producer-1621088] Cancelled in-flight API_VERSIONS request with correlation id 2464803 due to node 0 being disconnected (elapsed time since creation: 1ms, elapsed time since send: 1ms, request timeout: 30000ms)
INFO kafka-producer-network-thread | producer-1229 Selector 5239 [Producer clientId=producer-1229] Failed re-authentication with /x.x.x.x (channelId=0) (Failed to process post-handshake messages)
INFO kafka-producer-network-thread | producer-1229 NetworkClient 5239 [Producer clientId=producer-1229] Node 0 disconnected.
ERROR kafka-producer-network-thread | producer-1229 NetworkClient 5239 [Producer clientId=producer-1229] Connection to node 0 (/x.x.x.x:9092) failed authentication due to: Failed to process post-handshake messages
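To check whether a given NSX Manager is logging this authentication failure, the log can be searched for the error string shown above. The following is a minimal sketch, not an official diagnostic: the function name `count_kafka_auth_failures` is hypothetical, and the default path `/var/log/proton/nsxapi.log` is an assumption that should be adjusted if the log lives elsewhere.

```shell
#!/bin/sh
# Count Kafka producer authentication failures in an nsxapi.log file.
# The default log path is an assumption; pass a different file as $1.
count_kafka_auth_failures() {
  log_file="${1:-/var/log/proton/nsxapi.log}"
  # Match the ERROR line quoted in the Symptoms section.
  grep -c 'failed authentication due to: Failed to process post-handshake messages' "$log_file"
}

# Usage:
# count_kafka_auth_failures /var/log/proton/nsxapi.log
```

A non-zero count only indicates that the error string is present; it does not by itself confirm the full sync loop described in this article.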
Because of these Kafka connectivity issues, the full sync first enters an error state and then loops, restarting repeatedly:
INFO Thread-21216 FullConfigProducerImpl 5239 INTELLIGENCE [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Start sending intelligence full config to NSX Intelligence, SSP.
INFO Thread-21218 FullConfigProducerImpl 5239 INTELLIGENCE [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Start sending intelligence full config to NSX Intelligence, SSP.
INFO Thread-21222 FullConfigProducerImpl 5239 INTELLIGENCE [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Start sending intelligence full config to NSX Intelligence, SSP.
A new full sync is triggered every 5 minutes.
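The full sync loop can be spotted by counting how often the "Start sending intelligence full config" message appears in the log. This is a rough sketch under stated assumptions: the function name `count_full_sync_starts` is hypothetical, and the default log path `/var/log/proton/nsxapi.log` is an assumption.

```shell
#!/bin/sh
# Count how many times a full config sync to SSP was started.
# The default log path is an assumption; pass a different file as $1.
count_full_sync_starts() {
  log_file="${1:-/var/log/proton/nsxapi.log}"
  # Match the FullConfigProducerImpl line quoted in the Symptoms section.
  grep -c 'FullConfigProducerImpl.*Start sending intelligence full config' "$log_file"
}

# Usage:
# count_full_sync_starts /var/log/proton/nsxapi.log
```

A count that keeps growing when the command is re-run several minutes apart is consistent with the looping behavior described above.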
Environment
SSP: 5.1, 5.1.1
NSX: 4.2.x versions prior to 4.2.4
Cause
The issue is caused by a sync failure between NSX and SSP due to Kafka communication instability. As a result, although the application is published in NSX, SSP cannot mark it as SUCCESS, causing recommendation workflows to fail.
Resolution
Restart the Proton service on each NSX Manager node:
systemctl restart proton
This clears the sync state and triggers a fresh full synchronization between NSX and SSP.
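Where several NSX Manager nodes need the restart, the command above can be wrapped in a small loop. The following is only a sketch: the function name `restart_proton_on_nodes`, the `REMOTE_CMD` variable, the node names, and the use of root SSH access are all assumptions for illustration, not part of the documented procedure.

```shell
#!/bin/sh
# Restart proton on each listed NSX Manager node, one at a time, and
# check that the service reports active afterwards.
# REMOTE_CMD defaults to ssh; root SSH access is an assumption.
restart_proton_on_nodes() {
  for node in "$@"; do
    echo "Restarting proton on ${node}"
    # Stop at the first node where the restart or the status check fails.
    ${REMOTE_CMD:-ssh} "root@${node}" 'systemctl restart proton && systemctl is-active proton' || return 1
  done
}

# Example with hypothetical node names:
# restart_proton_on_nodes nsx-mgr-01 nsx-mgr-02 nsx-mgr-03
```

`systemctl is-active` only confirms that the service is running again; whether the full sync has completed still needs to be checked in nsxapi.log.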