Default Edge FW Inconsistency: Some API-Created T1 LRs Missing in Manager View

Overview of the Issue
When Tier-1 (T1) Logical Routers (LRs) are created in bulk through the API, some of the newly created T1 LRs are missing their default Edge Firewall (FW) section and default rule in the Manager View. The Policy View, by contrast, shows the default Edge FW sections and rules for all LRs, which points to a problem in communication or realization between the Policy Plane and the Management Plane (MP).

Symptoms
The discrepancy is visible only in the Manager View. The logs show the following sequence:
1. Policy Intent is Created:
The Policy log (/var/log/policy/) confirms that the policy intent for both LRs was created and a callback was received to handle the Edge FW changes for both LRs:
2020-07-15T15:09:30.025Z INFO providerTaskExecutor-36 EdgeFirewallProviderNsxT - POLICY [...] Received callback to handle Edge FW changes with pathMap: {CommunicationEntry=[.../2xxxxx-xxxx-xxxx-xxxx-xxxxxx-tier1-default_blacklist_rule, .../xxxxxx-xxx-xxxx-xxxx-xxxxxx-tier1-default_blacklist_rule]}...
(The log contains rules for both the FAILED LR xxxxx-xxxx-xxxx-xxxx-xxxxxx and the SUCCESSFUL LR xxxxx-xxxx-xxxx-xxxx-xxxxxx).
2. The EdgeFirewallHelper Skip:
Further inspection revealed that the EdgeFirewallHelper incorrectly collected only the configuration for the successful LR (xxxxxx-xxxx-xxxx-xxxx-xxxxxx), ignoring the defective one.
2020-07-15T15:09:30.106Z INFO providerTaskExecutor-36 AuditingServiceImpl - - [nsx@6876 audit="true" comp="nsx-manager" level="INFO" reqId="xxxxx-xxxx-xxxx-xxxx-xxxxxx" subcomp="policy"] UserName="system", ModuleName="PolicyEdgeFirewall", Operation="ViewTier1GatewayFirewall", Operation status="success", New value=["xxxxx-xxxx-xxxx-xxxx-xxxxxx"]
3. Management Plane (MP) Confirmation:
The Management Plane log (/var/log/proton/) confirms that it only received the configuration for the successful LR, proving the defective section was never sent.
2020-07-15T15:09:32.316Z INFO http-nio-127.0.0.1-7440-exec-4549 FirewallFacadePatchValidatorImpl - FIREWALL [...] Skipping mandatory section validation for LR section FirewallSectionWithRulesDto{rules='[FirewallRulePatchDto{ruleId='7262', ruleBody='FirewallRuleDto{...}', category='Default', super{FirewallSectionDto{...}}}}}
The appliedTos within the MP log explicitly lists the targetId of the successful LR only:
...appliedTos='[ResourceReference{targetId='xxxxx-xxxx-xxxx-xxxx-xxxxxx', targetDisplayName='null', targetType='LogicalRouter', isValid='null'}]'...
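The log check above can be sketched in a few lines of Python. This is only an illustration, assuming the MP log lines have already been read from /var/log/proton/ and that each appliedTos entry follows the targetId='...' pattern shown in the excerpt. The function names and the LR IDs in the usage note are hypothetical.

```python
import re

# Matches targetId='...' entries as they appear in the appliedTos excerpt.
# Assumption: other log lines use the same quoting; adjust if they differ.
TARGET_ID_RE = re.compile(r"targetId='([^']+)'")


def realized_target_ids(mp_log_lines):
    """Collect every targetId value found in the given MP log lines."""
    ids = set()
    for line in mp_log_lines:
        ids.update(TARGET_ID_RE.findall(line))
    return ids


def missing_lr_ids(expected_lr_ids, mp_log_lines):
    """Return the expected LR IDs that never appear as a targetId, sorted."""
    return sorted(set(expected_lr_ids) - realized_target_ids(mp_log_lines))
```

For example, calling missing_lr_ids(["lr-a", "lr-b"], lines) on log lines that only mention targetId='lr-a' returns ["lr-b"], which would be the LR whose default section never reached the MP.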
Affected versions: NSX 3.0.0 and 3.0.1
The root cause of the missing default Edge FW in the Manager View is a bug in the Policy Plane's EdgeFirewallHelper component.
When multiple Logical Router default firewall sections are batched for realization, the helper fails to collect all necessary LRs, causing the configuration for the skipped LR to be ignored and subsequently never sent to or realized on the Management Plane.
This results in the Manager View's data being incomplete, while the Policy View (which tracks the intent) remains accurate.
This issue is fixed in NSX 3.0.2.
As a workaround on affected versions, add a delay of 1-2 seconds between the T1 LR creation API calls.
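A minimal sketch of the delay workaround follows. It assumes the caller already has a function that creates a single T1 LR through the API; that function is passed in as create_one and is hypothetical here, as are the other names. The sketch only spaces the calls out and does not issue any request itself.

```python
import time


def create_t1_gateways(gateway_ids, create_one, delay_seconds=2.0, sleep=time.sleep):
    """Create T1 LRs one at a time, pausing between consecutive calls.

    create_one is a caller-supplied (hypothetical) function that issues the
    API call for a single LR ID and returns its result.
    """
    results = []
    for index, gateway_id in enumerate(gateway_ids):
        if index > 0:
            # Pause before every call except the first, so that the default
            # firewall sections are less likely to be batched together.
            sleep(delay_seconds)
        results.append(create_one(gateway_id))
    return results
```

The sleep parameter is injectable only so that the pacing can be exercised without real waiting; in normal use the default time.sleep applies. A delay_seconds value between 1 and 2 matches the range given above.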