NSX Manager reports Node Agent is down

Article ID: 380524

Updated On:

Products

VMware NSX

Issue/Introduction

NSX Manager triggers a Node Agent alarm.
The ESXi host where the Node Agent is reported down returns a non-healthy hyperbus status:

nsxcli -c get hyperbus connection info
VIFID     Connection     Status                 HostSwitchID
[UUID]    [IP]:[PORT]    HEALTHY                [ID]
[UUID]    [IP]:[PORT]    COMMUNICATION_ERROR    [ID]
[UUID]    [IP]:[PORT]    HEALTHY                [ID]
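
On a host with many connections, it can help to list only the non-healthy entries. The following is a minimal sketch, assuming the four-column layout shown above with the status in the third column. The function name unhealthy_hyperbus is hypothetical, and the filter needs a POSIX awk, so it may be easiest to run against saved output.

```shell
# Print only the hyperbus connections whose status is not HEALTHY.
# Assumption: rows have exactly four whitespace-separated fields
# (VIFID, Connection, Status, HostSwitchID), as in the output above.
unhealthy_hyperbus() {
  awk 'NF == 4 && $3 != "Status" && $3 != "HEALTHY" { print $1, $2, $3 }'
}

# Example (hyperbus.txt is a hypothetical file holding the saved output):
# unhealthy_hyperbus < hyperbus.txt
```

Each printed line gives the VIF ID, the connection endpoint, and the reported status.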

The Node Agent pod on the worker node may still show as Running.

kubectl get pods -n nsx-system -o wide | grep <worker node>
nsx-node-agent-[ID] 3/3 Running [RESTARTS] [IP] worker-[ID]
...

Node Agent events show the following error.

kubectl describe pod nsx-node-agent-[ID] -n nsx-system
...
Events:
Type Reason Age From Message
Warning Unhealthy [AGE] kubelet (combined from similar events): Liveness probe errored: rpc error: code = Unknown desc = command error: time=[TIMESTAMP] level=error msg="exec failed: unable to start container process: error starting setns process: fork/exec /proc/self/exe: no such file or directory"
, stdout: , stderr: , exit code -1
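
To check whether a pod's events match the error quoted above, the describe output can be searched for the error string. This is a minimal sketch; the function name has_setns_error and the POD variable are hypothetical.

```shell
# Succeed (exit 0) if the text on stdin contains the liveness-probe
# error string quoted in this article.
has_setns_error() {
  grep -q 'error starting setns process: fork/exec /proc/self/exe: no such file or directory'
}

# Example (POD is a placeholder for the nsx-node-agent pod name):
# kubectl describe pod "$POD" -n nsx-system | has_setns_error && echo "error signature found"
```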

Environment

VMware NSX

VMware NSX Container Plugin

Resolution

There's currently no resolution to this issue.

Workaround:

Redeploy the affected Node Agent pod by deleting it with the following command.

kubectl delete pod nsx-node-agent-[ID] -n nsx-system
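
If only the worker node name is known, the pod name can be looked up first. The following is a minimal sketch, assuming the pod names start with nsx-node-agent- as in the output above. The function name node_agent_pods and the NODE variable are hypothetical, and xargs -r is a GNU extension that may not be available on every system.

```shell
# Keep only nsx-node-agent pods from `kubectl get pods -o name` output,
# dropping any other pods in the namespace. Prints nothing if none match.
node_agent_pods() {
  grep '^pod/nsx-node-agent-' || true
}

# Example (NODE is a placeholder for the affected worker node name):
# kubectl get pods -n nsx-system --field-selector spec.nodeName="$NODE" -o name \
#   | node_agent_pods \
#   | xargs -r kubectl delete -n nsx-system
```

Running the first two stages of the pipeline on their own, without the xargs step, shows which pod would be deleted.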
