Upgrading NSX from an earlier release to 4.1.2 (for example, 4.0.1 to 4.1.2 or 3.2.3 to 4.1.2) intermittently leaves a few transport nodes (TNs) with data collection disabled or, in some cases, hosts/clusters not appearing in the NSX UI.
Note: This issue is not seen on upgrades from 4.1.1 to 4.1.2.
This issue is primarily caused by two threads racing to modify the same database object. Sometimes the race resolves on a subsequent retry, but in most cases the retries are exhausted.
Fixed in 4.2.0. The fix ensures that the race condition is avoided entirely.
Workaround: Toggle data collection off and back on through the API.
Retrieve the current registration:
GET https://{{mgr_ip}}/policy/api/v1/infra/sites/napp/registration
Set is_intelligence_enabled to false:
PATCH https://{{mgr_ip}}/policy/api/v1/infra/sites/napp/registration/{{cluster-id}}
{
  "cluster_id": "exxxxxx2-exxe-yxd0-b5ad-cabxxb48erqwf8",
  "is_intelligence_enabled": false,
  "id": "exxxxxx2-exxe-yxd0-b5ad-cabxxb48erqwf8"
}
Then set is_intelligence_enabled back to true:
PATCH https://{{mgr_ip}}/policy/api/v1/infra/sites/napp/registration/{{cluster-id}}
{
  "cluster_id": "exxxxxx2-exxe-yxd0-b5ad-cabxxb48erqwf8",
  "is_intelligence_enabled": true,
  "id": "exxxxxx2-exxe-yxd0-b5ad-cabxxb48erqwf8"
}
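The calls above can also be scripted. The following is a minimal sketch, not an official tool: it builds the same GET and the two PATCH requests in order (false, then true) using only the Python standard library. The function names, the use of HTTP basic authentication, and reliance on default TLS certificate verification are assumptions made for illustration; the body fields and URL paths are taken from the sample requests above, including "id" carrying the same value as "cluster_id".

```python
import base64
import json
import urllib.request
from typing import List, Optional, Tuple

# Path taken from the requests shown above.
REGISTRATION_PATH = "/policy/api/v1/infra/sites/napp/registration"

# (HTTP method, URL, JSON body or None)
Request = Tuple[str, str, Optional[dict]]


def toggle_body(cluster_id: str, enabled: bool) -> dict:
    """Body shaped like the sample above; "id" mirrors "cluster_id" as in that sample."""
    return {
        "cluster_id": cluster_id,
        "is_intelligence_enabled": enabled,
        "id": cluster_id,
    }


def build_requests(mgr_ip: str, cluster_id: str) -> List[Request]:
    """Return the GET, then PATCH (false), then PATCH (true), in that order."""
    base = f"https://{mgr_ip}{REGISTRATION_PATH}"
    return [
        ("GET", base, None),
        ("PATCH", f"{base}/{cluster_id}", toggle_body(cluster_id, False)),
        ("PATCH", f"{base}/{cluster_id}", toggle_body(cluster_id, True)),
    ]


def send(request: Request, user: str, password: str) -> int:
    """Hypothetical sender. Basic auth and default TLS verification are assumptions."""
    method, url, body = request
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

To use it, call build_requests with the manager address and the cluster ID obtained from the GET response, then pass each returned request to send in order. Building the requests separately from sending them lets the URLs and bodies be reviewed before anything is changed on the manager.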
Impact/Risks: No flows are reported from a few hosts. Data collection cannot be toggled via the UI.