Stale NSX service instances after service deployment removal

Article ID: 372831

Products

VMware vDefend Firewall

Issue/Introduction

You are using NSX with Service Insertion (SI).

After removing SI and deploying a new service, alarms are present for "partner channel down" on some transport nodes.

In the UI under System > Service Deployments > Service Instances, some instances show a status of "Not Available".

Attempts to remove these instances through the API fail because child objects still reference them.

Environment

VMware NSX Data Center with Service Insertion.

Cause

After the service deployment is uninstalled, the following Corfu tables still contain entries:

corfu_tool_runner.py --tool corfu-browser -o showTable -n nsx -t ServiceInstance

corfu_tool_runner.py --tool corfu-browser -o showTable -n nsx -t InstanceRuntime

corfu_tool_runner.py --tool corfu-browser -o showTable -n nsx -t InstanceEndpoint

corfu_tool_runner.py --tool corfu-browser -o showTable -n nsx -t ServiceDeployment

The issue is caused by these stale references in the NSX database, which must be removed.

Resolution

To resolve the issue, use the following API calls to delete the instance runtimes left behind by the failed undeployment:

1: Run the following GET request to check the instance runtime status. In this example, the deployment status is "UNDEPLOYMENT_FAILED":

GET https://<Manager-IP>/api/v1/serviceinsertion/services/<Service-ID>/service-instances/<Instance-ID>/instance-runtimes

Example output:

{
  "results": [
    {
      "runtime_status": "NOT_AVAILABLE", <<<<<<<<<<<<<<<<<<<<
      "unhealthy_reason": "",
      "maintenance_mode": "OFF",
      "deployment_status": "UNDEPLOYMENT_FAILED", <<<<<<<<<<<<<<<<<<<
      "service_instance_id": "xxxxx-xxxx-xxx-xxxx-xxxxxx",
      "service_vm_id": "xxxxx-xxxx-xxxx-xxxx-xxxxxx:vm-xxxx123",
      <SNIP>
      "error_message": "Power off failed for vm vm-xxxx123 in vc xxx-xxxx-xxxx : VC operation failed : An error occurred while communicating with the remote host.",
      "resource_type": "InstanceRuntime",
      <SNIP>
    }
  ],
  "result_count": 1
}
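
When a service has many instances, the response can be filtered programmatically instead of read by eye. The following is a minimal Python sketch, not part of the original procedure: the function name find_stale_runtimes and the rule "treat UNDEPLOYMENT_FAILED as stale" are assumptions for illustration, and only the field names come from the example output above.

```python
import json

# Assumption: only the status shown in this article is treated as stale.
# Extend this set if your environment reports other failed states.
STALE_DEPLOYMENT_STATUSES = {"UNDEPLOYMENT_FAILED"}


def find_stale_runtimes(response_text):
    """Return the instance runtime entries whose deployment_status is stale.

    response_text is the raw JSON body returned by the instance-runtimes GET.
    """
    body = json.loads(response_text)
    return [
        runtime
        for runtime in body.get("results", [])
        if runtime.get("deployment_status") in STALE_DEPLOYMENT_STATUSES
    ]
```

Each returned entry carries the service_instance_id and error_message fields, which can be recorded before anything is deleted. The real response does not contain the "<<<<" highlight markers or <SNIP> placeholders used in the example output, so it parses as ordinary JSON.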

2: Delete the service instance runtimes that are not in a healthy state using the following API:

POST https://<Manager-IP>/api/v1/serviceinsertion/services/<Service-ID>/service-instances/<Instance-ID>/instance-runtimes?action=delete
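
The delete call can also be issued from a script. The sketch below uses only the Python standard library and assumes HTTP Basic authentication with an NSX administrative account; the function names, the ca_file parameter, and the empty request body are illustrative assumptions, and only the URL pattern is taken from this article.

```python
import base64
import ssl
import urllib.request
from urllib.parse import quote


def runtime_delete_url(manager, service_id, instance_id):
    """Build the instance-runtimes delete URL shown in step 2."""
    return (
        f"https://{manager}/api/v1/serviceinsertion/services/"
        f"{quote(service_id, safe='')}/service-instances/"
        f"{quote(instance_id, safe='')}/instance-runtimes?action=delete"
    )


def delete_instance_runtimes(manager, service_id, instance_id,
                             user, password, ca_file=None):
    """POST the delete action and return the HTTP status code.

    Assumptions: Basic authentication, an empty request body, and a CA bundle
    (ca_file) that can validate the NSX Manager certificate.
    urllib raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    url = runtime_delete_url(manager, service_id, instance_id)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

Keeping the URL construction in its own function makes it possible to print and review the exact request before sending it, which is worthwhile because the call removes objects.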

3: After the service instance runtimes are deleted, the UI should show no stale entries.

If the above steps do not resolve the issue, run the following commands as root on the NSX Manager:

corfu_tool_runner.py -n nsx -o showTable -t ServiceInstance > /image/ServiceInstance.txt

corfu_tool_runner.py -n nsx -o showTable -t InstanceEndpoint > /image/InstanceEndpoint.txt

corfu_tool_runner.py -n nsx -o showTable -t InstanceRuntime > /image/InstanceRuntime.txt

corfu_tool_runner.py -n nsx -o showTable -t ServiceDeployment > /image/ServiceDeployment.txt

Collect the output, along with a support bundle, and submit a support request. Since the cleanup process involves database modifications, ensure that you have up-to-date backups.
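
The four export commands above can be wrapped in a short loop so that no table is missed. This is a hedged sketch only: it assumes a Python 3 interpreter is available on the NSX Manager and that corfu_tool_runner.py is on the root user's PATH, as the bare command name in this article suggests. The function names are hypothetical; the command-line flags are copied from the commands above.

```python
import subprocess
from pathlib import Path

# The four Corfu tables named in this article.
TABLES = ["ServiceInstance", "InstanceEndpoint", "InstanceRuntime", "ServiceDeployment"]


def export_command(table):
    """Argument list matching the article's corfu_tool_runner.py invocation."""
    return ["corfu_tool_runner.py", "-n", "nsx", "-o", "showTable", "-t", table]


def export_tables(output_dir="/image"):
    """Run each export and write its stdout to <output_dir>/<table>.txt.

    Raises subprocess.CalledProcessError if any command exits non-zero.
    """
    written = []
    for table in TABLES:
        target = Path(output_dir) / f"{table}.txt"
        with open(target, "w") as handle:
            subprocess.run(export_command(table), stdout=handle, check=True)
        written.append(target)
    return written
```

Calling export_tables() as root produces the same four files under /image as the individual commands. The sketch only reads the tables and does not modify the database.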