Traceflow request failed. The request might be cancelled because it took more time than normal.

Article ID: 393879

Products

VMware NSX

Issue/Introduction

  • Certificates have recently been replaced on NSX Manager (this includes internal certificates).

  • Traceflow fails with the error:
    Traceflow request failed. The request might be cancelled because it took more time than normal. Please retry.
    Error Message: Error: Traceflow intent /infra/traceflows/db9b####-####-####-####-########b995 realized with error Traceflow request to client 0a17####-####-####-####-########4c99 HostTransportNode '<Host_Transport_Node_Hostname>' failed (Error code: 500060)

  • The NSX Manager log shows Traceflow realization errors:
    /var/log/proton/nsxapi.log
    MP tries to send the Traceflow request to host e59e####-####-####-####-########fc39 for the first time:
    2025-03-20T21:10:10.545Z  WARN providerTaskExecutor-1-43 TraceflowMessagingServiceImpl 5149 MONITORING [nsx@6876 comp="nsx-manager" level="WARNING" reqId="2ac5####-####-####-####-########577dc" subcomp="manager" username="nsx_policy"] Error sending RPC to client: e59e####-####-####-####-########fc39 with Exception: Unable to reach client e59e####-####-####-####-########fc39, application Traceflow, Reties lef: 2

    MP tries to send the Traceflow request to host e59e####-####-####-####-########fc39 for the second time:
    2025-03-20T21:10:12.846Z  WARN providerTaskExecutor-1-43 TraceflowMessagingServiceImpl 5149 MONITORING [nsx@6876 comp="nsx-manager" level="WARNING" reqId="2ac5####-####-####-####-########77dc" subcomp="manager" username="nsx_policy"] Error sending RPC to client: e59e####-####-####-####-########fc39 with Exception: Unable to reach client e59e####-####-####-####-########fc39, application Traceflow, Reties lef: 1

    MP fails to send the Traceflow request to host e59e####-####-####-####-########fc39 and logs the following exceptions:
    2025-03-20T21:10:15.146Z  WARN providerTaskExecutor-1-43 TraceflowMessagingServiceImpl 5149 MONITORING [nsx@6876 comp="nsx-manager" level="WARNING" reqId="2ac5####-####-####-####-########77dc" subcomp="manager" username="nsx_policy"] Unknown Messaging Exception: Unable to reach client e59e####-####-####-####-########fc39, application Traceflow sending RPC to client: e59e####-####-####-####-########fc39
    ..
    2025-03-20T21:10:15.208Z ERROR providerTaskExecutor-1-43 MpTraceflowServiceImpl 5149 MONITORING [nsx@6876 comp="nsx-manager" errorCode="MP4912" level="ERROR" reqId="2ac5####-####-####-####-########77dc" subcomp="manager" username="nsx_policy"] Exception occurred in send message to client
    com.vmware.nsx.messaging.exceptions.MessagingException: Unable to reach client e59e#####-####-####-####-########fc39, application Traceflow
            at com.vmware.nsx.messaging.nsxrpc.StubManager.runRPC(StubManager.java:290) ~[?:?]
            at com.vmware.nsx.messaging.nsxrpc.StubManager.callRpc(StubManager.java:263) ~[?:?]
            at com.vmware.nsx.messaging.rpc.RpcManager.sendRequest(RpcManager.java:228) ~[?:?]
            at com.vmware.nsx.messaging.service.impl.ClientServiceImpl.sendRpcRequest(ClientServiceImpl.java:105) ~[?:?]
            at com.vmware.nsx.messaging.service.impl.ClientServiceImpl.sendRpcRequest(ClientServiceImpl.java:76) ~[?:?]

  • /var/log/messaging-manager/messaging-manager.log indicates a certificate issue:
    2025-03-20T21:10:10.755Z ERROR netty-18 ClientHandshakeHandler - exceptionCaught: Exception DecoderException caught.
    io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
            at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499) ~[netty-codec-4.1.89.Final.jar:4.1.89.Final]
            at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[netty-codec-4.1.89.Final.jar:4.1.89.Final]
            at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.89.Final.jar:4.1.89.Final]
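The two log signatures above can be checked together. The following is a minimal Python sketch, not a Broadcom tool: it assumes you have copied the contents of nsxapi.log and messaging-manager.log off the NSX Manager and pass them in as text. The function name scan_logs and the result keys are hypothetical. The match strings are taken from the excerpts in this article.

```python
import re

# Signatures copied from the log excerpts in this article.
RPC_FAILURE = re.compile(
    r"Unable to reach client (?P<client>[^,\s]+), application Traceflow"
)
TLS_FAILURE = "sslv3 alert bad certificate"


def scan_logs(nsxapi_text, messaging_text):
    """Summarize whether both symptoms from this article appear in the logs.

    nsxapi_text:    contents of /var/log/proton/nsxapi.log
    messaging_text: contents of /var/log/messaging-manager/messaging-manager.log
    """
    # Collect each distinct client ID that the manager could not reach.
    clients = sorted(
        {m.group("client") for m in RPC_FAILURE.finditer(nsxapi_text)}
    )
    # A plain substring test is enough for the TLS alert text.
    bad_cert = TLS_FAILURE in messaging_text
    return {"unreachable_clients": clients, "bad_certificate": bad_cert}
```

If the result lists the affected host's client ID and bad_certificate is True, the symptoms match this article. A match on only one of the two suggests a different cause may be involved.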

Environment

  • VMware NSX
  • VMware NSX-T Data Center

Cause

Traceflow fails because a certificate mismatch breaks connectivity between the NSX Management Plane and the ESXi host.

Resolution

Restart the NSX Managers to trigger a resync of certificates between the managers and the host transport nodes.

Alternatively, fix the certificate mismatch by manually syncing the host certificate:

  1. SSH to any NSX Manager in the cluster as admin.
  2. Find and note the thumbprint of the API certificate:
    > get certificate api thumbprint
  3. SSH to the impacted ESXi host (the transport node named in the Traceflow error).
  4. Push the host certificate to the Management Plane:
    nsxcli -c push host-certificate <NSX Manager IP or FQDN> username admin thumbprint <thumbprint_from_step_2>
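When several hosts are affected, it can help to generate the step 4 command rather than type it for each host. The following Python sketch only builds the command string and does not run anything. The function name build_push_command is hypothetical, and the check assumes the thumbprint from step 2 is a SHA-256 digest printed as 64 hexadecimal characters; adjust the pattern if your output differs.

```python
import re

# Assumption: the API certificate thumbprint is 64 hex characters (SHA-256).
THUMBPRINT = re.compile(r"[0-9a-fA-F]{64}")


def build_push_command(manager, thumbprint):
    """Return the step 4 command for one NSX Manager address and thumbprint."""
    manager = manager.strip()
    thumbprint = thumbprint.strip()
    # Reject empty values and values with embedded whitespace.
    if not manager or any(ch.isspace() for ch in manager):
        raise ValueError("manager must be a single IP address or FQDN")
    # Catch truncated or mistyped thumbprints before they reach the host.
    if not THUMBPRINT.fullmatch(thumbprint):
        raise ValueError("thumbprint does not look like 64 hex characters")
    return (
        f"nsxcli -c push host-certificate {manager} "
        f"username admin thumbprint {thumbprint}"
    )
```

Paste the returned string into the SSH session on the ESXi host from step 3.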

Additional Information

Also see Loss of Controller Connectivity after Host Upgrade.

If you are contacting Broadcom support about this issue, please provide the following:

  • NSX Manager support bundles.
  • ESXi host support bundles for the hosts on which Traceflow fails to realize.
  • Text of any error messages seen in the NSX GUI or on the command line that are pertinent to the investigation.

Handling Log Bundles for offline review with Broadcom support: