Edge node fails to complete deployment in an NSX-T federation setup on NSX-T 3.2.0.x.

Article ID: 317785

Updated On:

Products

VMware NSX

Issue/Introduction


- The Edge node fails to complete deployment and reports "Registration Timedout".

- A NullPointerException ("msg=ChannelTask exception: java.lang.NullPointerException") is logged for the 'CcpRpcClient' in /var/log/proton/nsxapi.log on the NSX-T Manager:


2022-xx-xxT23:07:54.823Z INFO nsx-rpc:CcpRpcClient:user-executor-0 NsxRpcChannel 5284 - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] ServerChannel(vmware.nsx.lldp.LldpMpService, dabcdefg-abcd-efgh-ijkl-42932dxxxxxx).doClose(closeStream=true, status=Status(code=UNKNOWN, msg=ChannelTask exception: java.lang.NullPointerException))

2022-xx-xxT23:07:54.823Z INFO nsx-rpc:CcpRpcClient:user-executor-0 NsxRpcChannel 5284 - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] ServerChannel(vmware.nsx.lldp.LldpMpService, dabcdefg-abcd-efgh-ijkl-42932dxxxxxx) doCloseStream OutboundMessage(streamId=null payloadSize=0 streamControl=close_stream: true

close_status {
  code: UNKNOWN
  error_msg: "ChannelTask exception: java.lang.NullPointerException\n"
}
 frame=)
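
To check whether a manager shows this signature, the log can be filtered for lines that contain both the client name and the exception text. The following is a minimal sketch, not a supported tool: the helper names `find_npe_lines` and `scan_log` are hypothetical, and the two match strings are taken from the excerpt above.

```python
from typing import Iterable, List

# Match strings copied from the log excerpt in this article.
CLIENT_TAG = "CcpRpcClient"
NPE_MARKER = "ChannelTask exception: java.lang.NullPointerException"


def find_npe_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention both CcpRpcClient and the NPE marker."""
    return [
        line.rstrip("\n")
        for line in lines
        if CLIENT_TAG in line and NPE_MARKER in line
    ]


def scan_log(path: str = "/var/log/proton/nsxapi.log") -> List[str]:
    """Scan a log file on disk (hypothetical helper; default path from this article)."""
    with open(path, "r", errors="replace") as handle:
        return find_npe_lines(handle)
```

Calling `scan_log()` on the impacted NSX-T Manager returns the matching lines. An empty list only means the signature was not found in the current log file, since older entries may have been rotated out.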

 

Environment

VMware NSX-T Data Center

Cause

When a process (in this case 'Policy') needs to connect to the APH (Appliance Proxy Hub), it uses the UDS (Unix domain socket) endpoint unix:///var/run/vmware/appl-proxy/aph.sock.
A UDS limitation means that two RPC providers from the same process cannot both stay connected to the same UDS endpoint. If the same process opens multiple connections, the endpoint retains only the latest one.

In some cases the process has two RPC providers, such as messaging-lib and CcpRpcClient, where one provider (messaging-lib) registers for the services and the other (CcpRpcClient) does not. If messaging-lib connects first and CcpRpcClient connects last, the APH keeps only the second connection. Because that connection has no registered services, all services on the Policy process become unavailable.

In this case the EdgeMpService may be impacted, which will cause the Edge registration to fail.
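
The connection-ordering problem described above can be illustrated with a toy model. This is only a sketch of the "latest connection wins" behaviour, not the APH implementation: the class `ToyEndpoint` and its methods are hypothetical, and registering `EdgeMpService` through messaging-lib is an assumption made for illustration.

```python
class ToyEndpoint:
    """Toy stand-in for an endpoint that keeps one connection per process."""

    def __init__(self):
        # process name -> (provider name, services registered on that connection)
        self._active = {}

    def connect(self, process, provider, services=()):
        # A later connection from the same process replaces the earlier one.
        self._active[process] = (provider, frozenset(services))

    def active_provider(self, process):
        return self._active.get(process, (None, frozenset()))[0]

    def available_services(self, process):
        return self._active.get(process, (None, frozenset()))[1]


aph = ToyEndpoint()
# messaging-lib connects first and registers a service (assumed name).
aph.connect("policy", "messaging-lib", {"EdgeMpService"})
# CcpRpcClient connects last and registers nothing.
aph.connect("policy", "CcpRpcClient")
```

After the second call, `aph.active_provider("policy")` is `"CcpRpcClient"` and `aph.available_services("policy")` is empty, which mirrors the state in which Edge registration times out. Reversing the connection order leaves `EdgeMpService` available in this model.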

Note:
----
This issue impacts only Federation setups/Local Managers running NSX-T 3.2.0.x.

Resolution

Update to NSX-T 3.2.1 or newer.
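
When auditing several managers, the affected version range can be expressed as a simple check. This is a sketch under stated assumptions: the function name `is_affected_version` is hypothetical, the input is assumed to be a dotted numeric version string, and the check covers only the version, not whether the manager is part of a Federation setup.

```python
def is_affected_version(version: str) -> bool:
    """Return True if a dotted numeric version string falls in the 3.2.0.x range."""
    parts = tuple(int(piece) for piece in version.strip().split("."))
    # Pad short strings such as "3.2" so the first three fields always exist.
    parts = parts + (0,) * (3 - len(parts))
    return parts[:3] == (3, 2, 0)
```

For example, `is_affected_version("3.2.0.1")` is true, while `is_affected_version("3.2.1")` and `is_affected_version("3.1.3")` are false.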

Workaround:
- Restart the proton service (/etc/init.d/proton restart or service proton restart) on the impacted NSX-T Manager.

or

- Reboot the impacted NSX-T Manager.


Additional Information

Impact/Risks:
New NSX-T Edge Nodes cannot be created successfully.