TMC self managed addon for Cloud Director returns 'API error: Unavailable: Please try again later'


Article ID: 398520


Updated On:

Products

VMware Cloud Director

Issue/Introduction

The Cloud Director 10.x addon "TMC Self Managed", running in a Container Service Extension (CSE) environment, might return the error "API error: Unavailable: Please try again later".

You may also see frequent restarts of the Kafka and Prometheus pods in the tmc-local namespace.
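
One way to confirm the restarts is to total the restart counters reported for each pod. The following is a minimal Python sketch, assuming the output of `kubectl get pods -n tmc-local -o json` has been captured as a string. The function name `restart_counts` is illustrative and not part of the product.

```python
import json

def restart_counts(pod_list_json):
    """Map pod name -> total container restarts.

    Expects the JSON text printed by `kubectl get pods -n tmc-local -o json`.
    Pods that have not started any container yet are reported with 0.
    """
    pods = json.loads(pod_list_json).get("items", [])
    counts = {}
    for pod in pods:
        name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[name] = sum(s.get("restartCount", 0) for s in statuses)
    return counts
```

Pods whose count keeps increasing between two runs are the ones restarting; in this issue those are expected to be the Kafka and Prometheus pods.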

Kafka logs show:

[2159-09-09 01:08:58,596] ERROR [broker-0-to-controller-heartbeat-channel-manager]: Request BrokerHeartbeatRequestData(brokerId=0, brokerEpoch=4, currentMetadataOffset=45928312, wantFence=false, wantShutDown=false) failed due to authentication error with controller (kafka.server.BrokerToControllerRequestThread)
> org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
> Caused by: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
>         at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>         at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>         at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>         at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>         at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1357)
>         at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1232)
>         at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1175)
>         at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>         at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>         at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>         at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>         at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>         at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>         at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>         at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>         at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>         at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>         at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>         at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>         at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:571)
>         at kafka.common.InterBrokerSendThread.pollOnce(InterBrokerSendThread.scala:78)
>         at kafka.server.BrokerToControllerRequestThread.doWork(BrokerToControllerChannelManager.scala:418)
>         at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:127)
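
The line that identifies this issue in the trace is the `CertPathValidatorException: validity check failed` cause, which Java typically reports when a certificate is outside its validity period. As a rough triage aid, the sketch below scans saved Kafka log text for that signature. It is a hypothetical helper, assuming the log has already been collected (for example with `kubectl logs`); the function name is not from the product.

```python
import re

# Signature copied from the Kafka log excerpt above.
EXPIRY_SIGNATURE = re.compile(r"CertPathValidatorException: validity check failed")

def has_cert_validity_failure(log_text):
    """Return True if any log line contains the certificate validity failure."""
    return any(EXPIRY_SIGNATURE.search(line) for line in log_text.splitlines())
```

A match only suggests that a certificate in the chain is no longer valid; it does not say which certificate, so the result should be read together with the Cause section.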

Environment

Cloud Director 10.5.x
TMC 1.0

Cause

The internal certificates used by Kafka have expired. These certificates typically have a six-month validity period.
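
If a copy of the Kafka certificate is available, its expiry can be checked directly. This article does not state where the certificate is stored, so the sketch below assumes the operator has already extracted the PEM file and obtained its end date with `openssl x509 -noout -enddate -in <cert.pem>`, which prints a line such as `notAfter=Jan  5 09:34:43 2018 GMT`. The function name `seconds_until_expiry` is illustrative.

```python
import ssl
import time

def seconds_until_expiry(enddate_line, now=None):
    """Seconds remaining before the certificate expires (negative if expired).

    enddate_line: output of `openssl x509 -noout -enddate`, e.g.
                  'notAfter=Jan  5 09:34:43 2018 GMT'.
    now:          optional epoch seconds to compare against; defaults to the
                  current time.
    """
    not_after = enddate_line.strip().split("=", 1)[-1]
    expires_at = ssl.cert_time_to_seconds(not_after)
    return expires_at - (time.time() if now is None else now)
```

A negative result is consistent with the expired-certificate cause described here; a positive result suggests looking at other certificates in the chain.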

 

Resolution

This is a known issue in the TMC Self Managed addon 1.0 for Cloud Director.
Contact technical support and note this Knowledge Article ID (398520) in the problem description. For more information, see How to Submit a Support Request.