After NSX-T upgrade VMs are not added to NSGroups

Article ID: 317677

Products

VMware vDefend Firewall

Issue/Introduction

Symptoms:

  • You recently upgraded to NSX-T 3.2.1 or NSX-T 4.x.
  • After the upgrade, VMs are missing from Network Security Groups (NSGroups) or are not added to them.
  • The Distributed Firewall (DFW), which relies on NSGroup membership, incorrectly drops customer traffic as a result.
  • On the NSX-T Manager, the logs in /var/log/cloudnet/nsx-ccp* show the following exception when dynamic NSGroup membership is evaluated:
2022-08-11T13:57:48.652Z WARN DynamicGrouping1 CQEngineGroupEvaluatorImpl 15723 - [nsx@6876 comp="nsx-controller" level="WARNING" subcomp="DynamicGrouping"] Get exception null while updating inventory data.
java.lang.NullPointerException: null
        at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[?:1.8.0_332]

  • Alternatively, check whether any logical switch ports lack a logicalSwitchId.
Run the following command in a manager shell:
# /opt/vmware/bin/corfu_tool_runner.py -o showTable -n nsx -t LogicalSwitch > logicalSwitchDump1.txt
 
 
  • In the resulting dump (logicalSwitchDump1.txt), inspect the payload of each logical switch port and check whether it contains a logicalSwitchId.
The following logical switch port has a logicalSwitchId:
Payload:
{
  "managedResource": {
    "displayName": "ubuntu18-04.vmx@########-####-####-####-########3116",
    "intentPath": "/infra/segments/vlan-201/ports/default:########-####-####-####-########0a3c"
  },
  "logicalSwitchId": { <------ here
    "uuid": {
      "left": "############343753226",
      "right": "############278890176"
    }
  },

  "transportZoneId": {
    "uuid": {
      "left": "###########792687143",
      "right": "############78581726"
    }
  },
  "attachmentId": "########-####-####-####-########3116",
  "pendingConfigFromHostd": true,
  "ephemeral": true,
  "switchMode": "STANDARD",
  "logicalPortState": "LOGICAL_PORT_STATE_UP",
  "attachmentType": "ATTACHMENT_TYPE_VIF",
  "transportZoneType": "TRANSPORT_ZONE_TYPE_OVERLAY"
}
 
The following logical switch port has no logicalSwitchId:
Payload:
{
  "managedResource": {
    "displayName": "ubuntu18-04.vmx@########-####-####-####-########3116",
    "intentPath": "/infra/segments/vlan-201/ports/default:########-####-####-####-########0a3c"
  },
  "transportZoneId": {
    "uuid": {
      "left": "############792687143",
      "right": "############478581726"
    }
  },
  "attachmentId": "########-####-####-####-########3116",
  "pendingConfigFromHostd": true,
  "ephemeral": true,
  "switchMode": "STANDARD",
  "logicalPortState": "LOGICAL_PORT_STATE_UP",
  "attachmentType": "ATTACHMENT_TYPE_VIF",
  "transportZoneType": "TRANSPORT_ZONE_TYPE_OVERLAY"
}
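
Inspecting a large dump by hand is error-prone, so the check above can be scripted. The following is a minimal sketch, not a supported tool. It assumes that every record in logicalSwitchDump1.txt is introduced by a line reading "Payload:" followed by one plain JSON object, as in the excerpts above, and that the "<------ here" annotation is not part of the real output. The function names are hypothetical, and the actual dump layout may differ between NSX-T versions.

```python
import json
import re
import sys

# Assumption: each record starts with a "Payload:" line that is followed
# directly by a single JSON object, as in the excerpts in this article.
PAYLOAD_MARKER = re.compile(r"^[ \t]*Payload:\s*(?=\{)", re.MULTILINE)


def iter_payloads(dump_text):
    """Yield every JSON object that directly follows a 'Payload:' marker."""
    decoder = json.JSONDecoder()
    for match in PAYLOAD_MARKER.finditer(dump_text):
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever text comes after it.
            payload, _ = decoder.raw_decode(dump_text, match.end())
        except json.JSONDecodeError:
            continue  # skip records that are not plain JSON
        if isinstance(payload, dict):
            yield payload


def ports_missing_switch_id(dump_text):
    """Return display names of payloads with no (or a null) logicalSwitchId."""
    missing = []
    for payload in iter_payloads(dump_text):
        if not payload.get("logicalSwitchId"):
            resource = payload.get("managedResource") or {}
            missing.append(resource.get("displayName", "<unknown>"))
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name):
    #   python3 find_missing_switch_id.py logicalSwitchDump1.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        for name in ports_missing_switch_id(handle.read()):
            print(name)
```

Each printed name is the displayName of a port whose payload has no logicalSwitchId. Records that fail to parse as JSON are skipped silently, so an empty result should be confirmed by spot-checking the dump by hand.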



Environment

VMware NSX-T

Cause

In NSX-T 3.2.0, the logical switch ID (logicalSwitchId) of a logical switch port can be null.
After an upgrade to 3.2.1 or 4.0.0.1, logicalSwitchId can no longer be null.
As a result, VMs connected to logical switch ports whose logicalSwitchId is null are not added to the NSGroups that the DFW uses.

Resolution

This issue is resolved by upgrading to NSX-T 4.0.1.

Workaround:
There are three possible workarounds for this issue:
  1. Delete the problematic groups and add them back.
  2. Create a new group.
  3. Restart CCP on all managers.