Edge nodes dataplane service is down after exiting maintenance mode when using Service Insertion

Article ID: 322588

Products

VMware NSX

Issue/Introduction

Symptoms:

The environment is running NSX-T version 4.x.
North South service insertion is deployed and in use.
The edge node was placed into maintenance mode and then exited.
Management plane connectivity from the NSX-T manager to the edge node is not working.
Traffic flowing through logical routers on the edge node is impacted.
The 'dataplane' service on the edge node is down.

get service dataplane
Service name: dataplane
Service state: stopped
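
The check above can be scripted when several edge nodes need to be inspected. The following is a minimal sketch, not an official tool: it only parses the 'Service state' line shown above, and the function name and the host name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Reads the output of 'get service dataplane' on stdin and succeeds
# only when the reported service state is 'stopped'.
# The function name is illustrative and not part of the NSX CLI.
dataplane_is_stopped() {
    # Strip carriage returns in case the output came through a terminal session.
    tr -d '\r' | grep -Eq '^Service state:[[:space:]]*stopped[[:space:]]*$'
}

# Example usage (hypothetical host name; assumes SSH access as admin):
#   ssh admin@edge-01.example.com 'get service dataplane' | dataplane_is_stopped \
#       && echo "dataplane service is down"
```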

The NSX-T edge log '/var/log/syslog' contains the following ERROR entries:

2023-06-28T15:52:46.966Z nsxedge-21193684-1-2301281756346299142024 NSX 81301 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="dpc-pb" tname="dp-ipc23" level="ERROR" errorCode="EDG0400536"] Service port fcea8386-####-####-####-##########42 not found
2023-06-28T15:52:46.062Z nsxedge-21193684-1-2301281756346299142024 7bb730031118 79853 - - 2023-06-28T15:52:46Z datapathd 81301 dpc-pb tname="dp-ipc23" [ERROR] Service port fcea8386-####-####-####-##########42 not found errorCode="EDG0400536"
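
To look for these entries without reading the whole log, a plain grep on the error code is usually enough. This is a sketch that assumes root shell access on the edge node and the default log path named above; the function name is hypothetical.

```shell
#!/bin/sh
# Prints syslog lines that carry error code EDG0400536 together with the
# 'not found' message, matching the entries shown in this article.
# The log path defaults to /var/log/syslog and can be overridden.
find_service_port_errors() {
    log_file="${1:-/var/log/syslog}"
    grep -F 'errorCode="EDG0400536"' "$log_file" | grep -F 'not found'
}

# Example usage on the edge node as root:
#   find_service_port_errors
#   find_service_port_errors /var/log/syslog.1
```

The function returns a non-zero exit status when no matching line exists, so it can also be used in a conditional.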

Environment

VMware NSX-T Data Center
VMware NSX 4.0.x

Cause

This occurs when the CCP's (Central Control Plane) internal deletion of Service Insertion Redirection Policies does not work properly, leaving stale 'SiRedirectionPolicyMsg' objects on the Edge node.

Resolution

This issue has been resolved in NSX version 4.1.1 and newer versions.

Workaround:

Perform the following steps in the NSX Manager shell as the root user, one manager at a time:

Reboot all 3 NSX Managers:
- root@nsx-manager:~# reboot
Once the manager reboot has completed, run the following command as the root user to restart the mpa service:
- root@nsx-manager:~# /etc/init.d/nsx-mpa restart

Note: Before rebooting the NSX Manager appliances, ensure you have up-to-date backups in place.
Before rebooting each NSX-T manager appliance, ensure the cluster is healthy by running the following command:

get cluster status

If the cluster is up and healthy, proceed to the next NSX-T manager.
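
The health check between reboots can be turned into a simple gate. The sketch below assumes that the 'get cluster status' output contains a line reading 'Overall Status: STABLE' when the cluster is healthy; verify this against the output in your own environment before relying on it. The function name is hypothetical.

```shell
#!/bin/sh
# Reads the output of 'get cluster status' on stdin and succeeds only when
# an 'Overall Status: STABLE' line is present.
# Assumption: a healthy cluster reports its overall status in this form.
cluster_is_stable() {
    tr -d '\r' | grep -Eq '^Overall Status:[[:space:]]*STABLE[[:space:]]*$'
}

# Example usage from the root shell of an NSX Manager:
#   su admin -c 'get cluster status' | cluster_is_stable \
#       && echo "cluster healthy, safe to proceed to the next manager"
```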
