NSX Environment Issue: Missing ACLs from Some Kafka Topics

Article ID: 345777

Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:
  • The llanta-detectors-0 pod does not stay running for more than about one minute.
  • Several services cannot access the Kafka topics they depend on.
  • The Kafka Access Control Lists (ACLs) required for that access are missing.
  • This is a known issue in NSX 4.0.1.

Relevant Logs Location:

The following log line appears in the trust-manager pod logs:

    Fail to start controller keystore-reload-controller: Timed out waiting for cache to be synced.
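
A minimal sketch of how the trust-manager logs could be checked for this message. The helper name has_sync_timeout is hypothetical and not part of NSX. The example invocation in the comments assumes that napp-k passes the logs subcommand through to kubectl, which this article does not confirm.

```shell
#!/bin/sh
# Message text copied from the trust-manager log line quoted above
# (trailing period omitted so minor punctuation differences still match).
SYNC_TIMEOUT_MSG='Fail to start controller keystore-reload-controller: Timed out waiting for cache to be synced'

# has_sync_timeout: hypothetical helper, not part of NSX.
# Reads log text on stdin and exits 0 if the message is present, non-zero otherwise.
has_sync_timeout() {
    grep -qF -- "$SYNC_TIMEOUT_MSG"
}

# Example use on the NSX Manager appliance (assumption: napp-k forwards
# the "logs" subcommand to kubectl):
#   napp-k logs deployment/trust-manager | has_sync_timeout && echo "keystore-reload-controller stalled"
```

If the helper reports a match, the workaround in the Resolution section applies.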

Environment

VMware NSX 4.0.0.1

Cause

trust-manager uses a third-party controller that, when this error occurs, stalls without raising an exception.

Resolution

This issue is resolved in NSX 4.2.0.

Workaround:
  • Restarting the trust-manager pod can refresh the ACLs and restore the keystore-reload-controller to a functional state.
    • Log in to the NSX Manager appliance with the root account.
    • At the system prompt, run the following command to restart the pod:
      • napp-k rollout restart deployment trust-manager
    • Wait for the trust-manager pod to restart successfully.

  • To verify ACLs, execute the following command:

napp-k exec -it `napp-k get pods | grep cluster | cut -d ' ' -f 1` -c cluster-api -- sh -c "/opt/kafka/bin/kafka-acls.sh  --bootstrap-server kafka:9092 --command-config /root/adminclient.props --list"
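
The listing from the command above can be long. The following sketch extracts the names of topics that have at least one ACL, so that a topic a service depends on can be checked quickly. The helper names topics_with_acls and topic_has_acl are hypothetical, and the "Current ACLs for resource" line format is an assumption based on typical kafka-acls.sh output; it may differ between Kafka versions.

```shell
#!/bin/sh
# topics_with_acls: hypothetical helper, not part of NSX or Kafka.
# Reads a kafka-acls.sh --list listing on stdin and prints each topic name
# that has at least one ACL, one per line, sorted and de-duplicated.
# Assumes header lines of the form:
#   Current ACLs for resource `ResourcePattern(resourceType=TOPIC, name=<topic>, patternType=LITERAL)`:
topics_with_acls() {
    sed -n 's/.*resourceType=TOPIC, name=\([^,]*\),.*/\1/p' | sort -u
}

# topic_has_acl TOPIC: reads the same listing on stdin and exits 0 only if
# TOPIC appears among the topics that have ACLs.
topic_has_acl() {
    topics_with_acls | grep -qxF -- "$1"
}
```

One way to use it is to save the output of the verification command to a file, for example acls.txt, and then run topics_with_acls < acls.txt. A topic that a failing service depends on but that is missing from the result has no ACLs.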


Additional Information

Impact/Risks:
Customers cannot observe traffic or mark recommendations.