Troubleshoot why creating service instance is failing with "Unknown CPI error"

Article ID: 294315

Products

VMware Tanzu Gemfire

Issue/Introduction

Creating a service instance with a plan that uses more than one VM, such as extra-small or small-footprint, fails with the following error:
Error: Unknown CPI error 'Unknown' with message 'VM vm-aaaaaaaaxxxxxx was expected in NSX-T but was not found' in 'set_vm_metadata' CPI method (CPI request ID: 'cpi-716502')
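
The fields in this error (the VM CID, the CPI method, and the CPI request ID) are useful search keys for the log checks later in this article. The following is a minimal Python sketch for extracting them. The regular expressions are modeled only on the single error line quoted above, and the function name parse_cpi_error is hypothetical. Other CPI errors may be worded differently and will not match.

```python
import re

# Patterns modeled on the one error line quoted in this article (assumption:
# other CPI errors may use different wording and will simply not match).
CPI_ERROR = re.compile(
    r"Unknown CPI error '(?P<error>[^']*)' with message '(?P<message>.*)' "
    r"in '(?P<method>[^']*)' CPI method \(CPI request ID: '(?P<request_id>[^']*)'\)"
)
VM_CID = re.compile(
    r"\bVM (?P<vm_cid>vm-[\w-]+) was expected in NSX-T but was not found"
)


def parse_cpi_error(line):
    """Return a dict of fields from a CPI error line, or None if it does not match."""
    match = CPI_ERROR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    vm = VM_CID.search(fields["message"])
    # vm_cid stays None when the message is not the "not found in NSX-T" variant.
    fields["vm_cid"] = vm.group("vm_cid") if vm else None
    return fields


if __name__ == "__main__":
    sample = (
        "Error: Unknown CPI error 'Unknown' with message "
        "'VM vm-aaaaaaaaxxxxxx was expected in NSX-T but was not found' "
        "in 'set_vm_metadata' CPI method (CPI request ID: 'cpi-716502')"
    )
    print(parse_cpi_error(sample))
```

The extracted request_id and vm_cid values can then be used to search the CPI logs and syslog for the matching entries.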


Environment

Product Version: 1.9

Resolution

An underlying NSX-T issue can cause this error.

Use the following steps to determine whether NSX-T is the cause:

1. List the VMs of the failing service instance and check whether they are distributed evenly across all Availability Zones (AZs):
bosh -d <service-instance> vms
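
For deployments with many VMs, the per-AZ count can be computed rather than read by eye. The following is a minimal Python sketch, assuming the output was captured with the BOSH CLI's --json flag (for example, bosh -d <service-instance> vms --json > vms.json) and that each row carries an "az" key, as in the CLI's table output. The function names vms_per_az and is_balanced, the tolerance parameter, and the sample instance names are hypothetical.

```python
import json
from collections import Counter


def vms_per_az(bosh_vms_json):
    """Count VMs per availability zone from `bosh vms --json` output text."""
    data = json.loads(bosh_vms_json)
    counts = Counter()
    # Assumption: rows live under Tables[].Rows[] and expose an "az" field.
    for table in data.get("Tables") or []:
        for row in table.get("Rows") or []:
            counts[row.get("az") or "(none)"] += 1
    return dict(counts)


def is_balanced(counts, tolerance=1):
    """True when the largest and smallest AZ counts differ by at most `tolerance`."""
    if not counts:
        return False
    return max(counts.values()) - min(counts.values()) <= tolerance


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = json.dumps({
        "Tables": [{
            "Rows": [
                {"instance": "locator/0", "az": "az1"},
                {"instance": "server/0", "az": "az2"},
                {"instance": "server/1", "az": "az3"},
            ]
        }]
    })
    counts = vms_per_az(sample)
    print(counts, "balanced:", is_balanced(counts))
```

An AZ that received no VMs at all does not appear in the counts, so compare the keys against the list of AZs configured for the plan as well.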

2. Check the TAS CPI logs for any mention of blocked ports or a missing VM. For example, the following stack trace shows the CPI retrying the NSX-T logical port lookup:
/var/vcap/data/packages/vsphere_cpi/50ca35ec1631bcfa54c171e78451b27a104b532d/lib/cloud/vsphere/nsxt_provider.rb:330:in `block in logical_ports'
/var/vcap/data/packages/vsphere_cpi/50ca35ec1631bcfa54c171e78451b27a104b532d/vendor/bundle/ruby/2.4.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:28:in `block in retryer'
/var/vcap/data/packages/vsphere_cpi/50ca35ec1631bcfa54c171e78451b27a104b532d/vendor/bundle/ruby/2.4.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:26:in `loop'
/var/vcap/data/packages/vsphere_cpi/50ca35ec1631bcfa54c171e78451b27a104b532d/vendor/bundle/ruby/2.4.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:26:in `retryer'
/var/vcap/data/packages/vsphere_cpi/50ca35ec1631bcfa54c171e78451b27a104b532d/lib/cloud/vsphere/nsxt_provider.rb:323:in `logical_ports'

3. Check the NSX-T syslog to see whether the logical port from the previous step is available:
syslog.4.gz:<182>1 2020-04-29T17:59:12.171Z ensxtcntrl2 NSX 5553 FABRIC [nsx@6876 comp="nsx-manager" subcomp="manager"] VifMsgHandler.BEGIN: Received VifMsg [460d5dd4-42c5-4c9c-847e-3f987083431b:2985]: "operation: ATTACH_VIF_TO_PORT#012type: REQUEST#012vif_attachment {#012 vif_uuid: "f73a97d3-2d9b-45d9-82f3-e3f156fdf0f1"#012 logical_switch_uuid: "b13a385a-3777-417a-953e-0754c393b8bb"#012 logical_port_uuid: ""#012 host_id: "071fc450-7802-4e2e-8d0c-fa1f8eff952e"#012 vmx_path: "/vmfs/volumes/vsan:525615850d14dcb0-309dfeb7ce2f66bd/63c0a95e-8e66-7bf4-6cf1-e4434b183440/vm-b702f94c-7690-456a-891e-523f7ef588a8.vmx"#012 host_operation_id: "ac64cbc-01-01-01-01-c6-9c3a-42-43"#012}#012"
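
The VifMsg entry is hard to read because syslog has escaped each embedded line feed as #012 (octal 12). The following is a minimal Python sketch that restores the line breaks and collects the quoted key/value fields, so that values such as logical_port_uuid (empty in the sample above) and vmx_path can be inspected directly. The function name parse_vif_attachment is hypothetical, and the pattern is modeled only on the sample entry in this article. Unquoted fields such as operation and type are not captured.

```python
import re

# Matches fields of the form  key: "value"  as they appear in the sample entry.
FIELD = re.compile(r'(?P<key>\w+): "(?P<value>[^"]*)"')


def parse_vif_attachment(syslog_line):
    """Extract the quoted key/value fields from a VifMsg syslog line."""
    # Restore the line feeds that syslog escaped as #012.
    text = syslog_line.replace("#012", "\n")
    return {m.group("key"): m.group("value") for m in FIELD.finditer(text)}


if __name__ == "__main__":
    # Shortened copy of the sample entry from this article.
    sample = (
        'VifMsgHandler.BEGIN: Received VifMsg '
        '[460d5dd4-42c5-4c9c-847e-3f987083431b:2985]: '
        '"operation: ATTACH_VIF_TO_PORT#012type: REQUEST#012vif_attachment {#012 '
        'vif_uuid: "f73a97d3-2d9b-45d9-82f3-e3f156fdf0f1"#012 '
        'logical_switch_uuid: "b13a385a-3777-417a-953e-0754c393b8bb"#012 '
        'logical_port_uuid: ""#012}#012"'
    )
    for key, value in parse_vif_attachment(sample).items():
        print(f"{key} = {value!r}")
```

The VM CID embedded in vmx_path can be compared with the VM named in the CPI error to confirm that the entry belongs to the failing VM.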

 

4. If the logs show the symptoms above, the failure is caused by NSX-T. Take the following steps to resolve it:
  • SSH in as root on the affected ESXi host and restart the nsx-opsagent service. For example:
    2020/4/29 13:13:40 MST (19:13:40 UTC); Restarted nsx-opsagent on host
    rr1esxpl373: /etc/init.d/nsx-opsagent restart
  • Check connectivity from the host to the NSX-T Managers and confirm that the connections on port 5671 are in the ESTABLISHED state:
[root@rri1esxpl373:/vmfs/volumes/5d750453-7702cbca-06eb-e4434b181320/log] esxcli network ip connection list | grep 5671
tcp    0   0 10.221.28.26:21600        10.221.28.43:5671  ESTABLISHED 2103844 newreno mpa
tcp    0   0 10.221.28.26:21599        10.221.28.43:5671  ESTABLISHED 2103844 newreno mpa
  • Finally, list the VMs of the service instance again and confirm that they are now distributed evenly across all Availability Zones (AZs):
bosh -d <service-instance> vms
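
The connectivity check in step 4 can also be scripted when several hosts need to be verified. The following is a minimal Python sketch that reads saved output of esxcli network ip connection list and reports whether every connection to foreign port 5671 is ESTABLISHED. It assumes the column order shown in the sample output above (protocol, receive queue, send queue, local address, foreign address, state). The function names nsx_manager_connections and all_established are hypothetical.

```python
def nsx_manager_connections(esxcli_output, port=5671):
    """Return (foreign_address, state) for each TCP row whose foreign port matches."""
    rows = []
    for line in esxcli_output.splitlines():
        cols = line.split()
        # Skip prompts, headers, and anything that is not a TCP connection row.
        if len(cols) < 6 or not cols[0].startswith("tcp"):
            continue
        foreign, state = cols[4], cols[5]
        if foreign.rsplit(":", 1)[-1] == str(port):
            rows.append((foreign, state))
    return rows


def all_established(rows):
    """True only when at least one connection exists and all are ESTABLISHED."""
    return bool(rows) and all(state == "ESTABLISHED" for _, state in rows)


if __name__ == "__main__":
    # Connection rows copied from the sample output in this article.
    sample = (
        "tcp    0   0 10.221.28.26:21600        10.221.28.43:5671  "
        "ESTABLISHED 2103844 newreno mpa\n"
        "tcp    0   0 10.221.28.26:21599        10.221.28.43:5671  "
        "ESTABLISHED 2103844 newreno mpa\n"
    )
    rows = nsx_manager_connections(sample)
    print(rows, "ok:", all_established(rows))
```

Unlike grep 5671, which matches the string anywhere on the line, this sketch matches only the foreign port, so a local ephemeral port that happens to contain 5671 is not counted.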