Title: Alarm for CNI health status
Event ID: cni_health.hyperbus_manager_connection_down
Added in release: 3.0.0
Alarm Description
Purpose: Detect and report when the container network infrastructure channel is unhealthy.
Impact: Containers may fail to reach the running state because the container network configuration cannot be pushed down or applied.
Environment
VMware NSX-T Data Center
Resolution
On the ESXi node where cfgAgent is running:
Run 'net-stats -l' to check whether Hyperbus port 4094 is present. If it is missing, restart nsx-cfgagent with the command '/etc/init.d/nsx-cfgagent restart'.
Run 'net-dvs -l' to check the container host VIF. If the container host VIF is blocked, check the connection with the Controller to make sure all configurations have been sent down.
For example, suppose 'net-stats -l' in the first step shows that the Hyperbus PortNum is 100663317. Run 'net-dvs -l' and find the entry for PortNum 100663317:
    port hb-f5######-####-####-####-#########024:
        com.vmware.common.port.volatile.status = inUse linkUp portID=100663317 propType = RUNTIME
        com.vmware.common.port.block = false
The com.vmware.common.port.block property shows whether the host VIF is blocked; 'false' means it is not blocked.
Run 'ps | grep nsx-cfgagent' to check whether the cfgAgent process is running. If nsx-cfgagent has stopped, restart it with the command '/etc/init.d/nsx-cfgagent restart'.
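The 'net-dvs -l' check above can be scripted against captured output. The following is a minimal sketch, not a VMware-provided tool: the function name host_vif_blocked and the sample text are hypothetical, and the parser assumes the output is laid out as in the excerpt above (a 'port <name>:' line followed by that port's properties). Real 'net-dvs -l' output may differ between releases, so verify it against your host before relying on it.

```python
import re


def host_vif_blocked(net_dvs_output, port_num):
    """Return True/False for com.vmware.common.port.block on the port whose
    portID equals port_num, or None if the port or property is not found.

    Assumes each port section starts with a line of the form 'port <name>:'.
    """
    sections = []
    for line in net_dvs_output.splitlines():
        stripped = line.strip()
        if re.match(r"port \S+:$", stripped):
            # Start of a new port section.
            sections.append([stripped])
        elif sections:
            # Property line belonging to the most recent port section.
            sections[-1].append(stripped)

    for section in sections:
        text = "\n".join(section)
        if re.search(r"portID=%d\b" % port_num, text):
            match = re.search(
                r"com\.vmware\.common\.port\.block\s*=\s*(true|false)", text
            )
            return None if match is None else match.group(1) == "true"
    return None


# Illustrative sample only; the port name is a placeholder, not real output.
SAMPLE = """
    port hb-example:
        com.vmware.common.port.volatile.status = inUse linkUp portID=100663317 propType = RUNTIME
        com.vmware.common.port.block = false
"""

if __name__ == "__main__":
    print(host_vif_blocked(SAMPLE, 100663317))
```

A result of True would correspond to the blocked case, where the connection with the Controller should be checked. None means the PortNum was not found in the captured text.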