When firewall policies and NSgroups are deleted from the Global Manager NSX-UI, they are not completely removed. After the delete attempt, the policy or group changes to a darker color, shows a small trashcan icon, and can no longer be modified.
Example
NSX 4.1.2.x Federation
Global Manager traceflow cleanup fails, which causes the Async Replicator (AR) channel to stop working.
The symptom can be confirmed by matching the following entries in the log:
/var/log/gmanager/gmanager.log
2024-10-23T20:15:56.595Z INFO ClientReceiverFlowHandler-1 AbstractSpanCalculationService 5789 POLICY [nsx@6876 comp="global-manager" level="INFO" subcomp="global-manager"] delete /global-infra/traceflow-observations/757f0382-19b0-8f9c-be48-56abb77f544e, span is not persisted, calculated span on the fly []
2024-10-23T20:15:56.596Z INFO ClientReceiverFlowHandler-1 AbstractSpanCalculationService 5789 POLICY [nsx@6876 comp="global-manager" level="INFO" subcomp="global-manager"] cleared all SpanPerThreadSpecifics
2024-10-23T20:15:56.596Z ERROR ClientReceiverFlowHandler-1 WorkQueueProducer 5789 POLICY [nsx@6876 comp="global-manager" errorCode="PM529001" level="ERROR" subcomp="global-manager"] Received exception in transaction interceptor
java.lang.NullPointerException: null
at com.vmware.nsx.management.policy.policyframework.service.ops.traceflow.GmTraceflowListener.changeToOldTraceflowPath(GmTraceflowListener.java:58) ~[libgm-framework-api.jar:?]
at com.vmware.nsx.management.policy.policyframework.service.span.SpanCalculationResultUtils.populateSpanCalculationResultForDeletedResource(SpanCalculationResultUtils.java:104) ~[libgm-common-framework.jar:?]
at com.vmware.nsx.management.policy.policyframework.service.span.AbstractSpanCalculationService.handleCalculateSpanForDelete(AbstractSpanCalculationService.java:219) ~[libgm-common-framework.jar:?]
at com.vmware.nsx.management.policy.policyframework.service.span.AbstractSpanCalculationService.calculateSpan_aroundBody4(AbstractSpanCalculationService.java:129) ~[libgm-common-framework.jar:?]
at com.vmware.nsx.management.policy.policyframework.service.span.AbstractSpanCalculationService$AjcClosure5.run(AbstractSpanCalculationService.java:1) ~[libgm-common-framework.jar:?]
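To check a copy of gmanager.log for this signature, the sketch below may help. It is only an illustration, not a VMware tool: the function name `find_traceflow_npe` and the five-line look-ahead window are assumptions, and the three marker strings are copied from the excerpt above.

```python
from typing import Iterable, List

# Marker strings copied from the log excerpt in this article.
ERROR_MARKER = 'errorCode="PM529001"'
NPE_MARKER = "java.lang.NullPointerException"
FRAME_MARKER = "GmTraceflowListener.changeToOldTraceflowPath"

# Assumption: the stack trace starts within a few lines of the ERROR entry.
LOOKAHEAD = 5


def find_traceflow_npe(lines: Iterable[str], lookahead: int = LOOKAHEAD) -> List[str]:
    """Return ERROR lines that are followed by the traceflow NullPointerException.

    An ERROR line counts as a match only if both the NullPointerException and
    the GmTraceflowListener frame appear within `lookahead` lines after it.
    """
    buffered = list(lines)
    hits = []
    for i, line in enumerate(buffered):
        if ERROR_MARKER not in line:
            continue
        window = buffered[i + 1 : i + 1 + lookahead]
        has_npe = any(NPE_MARKER in w for w in window)
        has_frame = any(FRAME_MARKER in w for w in window)
        if has_npe and has_frame:
            hits.append(line.rstrip("\n"))
    return hits
```

The function accepts any iterable of lines, so an open file handle for a copy of /var/log/gmanager/gmanager.log can be passed directly. An empty result only means this exact signature was not found in that file; it does not rule out the issue.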
Impact
Communication from the Global Manager (GM) to the Local Manager (LM) becomes stuck.
During traceflow cleanup on the GM, the segment port is identified incorrectly during validation, which causes this issue.
This issue is resolved in NSX 4.2.
If you see the NSX-UI symptom described above together with the NullPointerException ERROR for traceflow in the log, please open a support case with VMware by Broadcom and provide this KB article link.
Acronyms used in this KB
NSX-UI = Web Browser User Interface
AR = Async Replicator
GM = Global Manager
LM = Local Manager