nsx-cfgagent.xml
, may result in cfgagent not starting properly. Modifying different sticky bit files leads to different outcomes. Below is one such instance, where cfgagent fails to start properly, resulting in one or all of the outcomes below.
[root@esxcli:/var/core] ls -lh
total 16M
-rwxrwxr-x 1 root sssd 14M Nov 16 04:33 nsx-cfgagent-zdump.000
Check nsx-syslog.log to see whether the apps enabled in nsx-cfgagent.xml started properly. When an app starts, log entries similar to the following appear.
2023-12-12T11:34:03.835Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] L2 application starts
2023-12-12T11:34:03.835Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] L3 application starts
2023-12-12T11:34:03.836Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] Config application starts
2023-12-12T11:34:03.839Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] Traceflow application starts
2023-12-12T11:34:03.839Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] BFD application starts
2023-12-12T11:34:05.070Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] DFW application starts
2023-12-12T11:34:05.070Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] LB application starts
2023-12-12T11:34:05.072Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] Service insertion application starts
2023-12-12T11:34:05.074Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] Intrusion Detection Service application starts
2023-12-12T11:34:05.079Z In(182) cfgAgent[681923]: NSX 681923 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="3C13A300" level="info"] Livetrace application starts
In this case only two apps (L2 and L3) started on each attempt, as shown below. As a result, cfgAgent did not start properly and a core dump was generated.
syslog.7.gz:2023-11-15T12:08:05.366Z NSX[2108361]: nsx-cfgagent service starts
nsx-syslog.log:2023-11-15T21:35:22.759Z cfgAgent[2449402]: NSX 2449402 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="ABEF3C80" level="info"] L2 application starts
nsx-syslog.log:2023-11-15T21:35:22.759Z cfgAgent[2449402]: NSX 2449402 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="ABEF3C80" level="info"] L3 application starts
nsx-syslog.0.gz:2023-11-15T12:08:05.981Z cfgAgent[2108360]: NSX 2108360 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="6389EC80" level="info"] L2 application starts
nsx-syslog.0.gz:2023-11-15T12:08:05.981Z cfgAgent[2108360]: NSX 2108360 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="6389EC80" level="info"] L3 application starts
nsx-syslog.0.gz:2023-11-15T12:08:21.545Z cfgAgent[2109879]: NSX 2109879 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="65326C80" level="info"] L2 application starts
nsx-syslog.0.gz:2023-11-15T12:08:21.545Z cfgAgent[2109879]: NSX 2109879 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="65326C80" level="info"] L3 application starts
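The comparison above can also be scripted. The following is a minimal sketch, not a supported tool: it groups "application starts" entries by cfgAgent process ID and reports which apps are missing. The function names and the EXPECTED_APPS list are assumptions made for illustration; the list is copied from the healthy startup sample in this article, and the apps actually expected on a host depend on what is enabled in its nsx-cfgagent.xml.

```python
import re
from collections import defaultdict

# Matches entries such as:
#   ... cfgAgent[681923]: NSX 681923 - [nsx@6876 ... level="info"] L2 application starts
START_RE = re.compile(r'cfgAgent\[(\d+)\]:.*\]\s+(.+?) application starts\s*$')

# Assumption: app names taken from the healthy startup sample in this article.
# Adjust to match the apps enabled in the host's nsx-cfgagent.xml.
EXPECTED_APPS = [
    "L2", "L3", "Config", "Traceflow", "BFD", "DFW", "LB",
    "Service insertion", "Intrusion Detection Service", "Livetrace",
]


def apps_started_by_pid(lines):
    """Return {pid: [app names in start order]} for each cfgAgent process."""
    started = defaultdict(list)
    for line in lines:
        match = START_RE.search(line)
        if match:
            pid, app = match.groups()
            started[pid].append(app)
    return dict(started)


def missing_apps(started_apps, expected=EXPECTED_APPS):
    """Return the expected apps that have no start entry, in expected order."""
    return [app for app in expected if app not in started_apps]


if __name__ == "__main__":
    # Two entries copied from the failing host's log excerpt in this article.
    sample = [
        'nsx-syslog.log:2023-11-15T21:35:22.759Z cfgAgent[2449402]: NSX 2449402 - '
        '[nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="ABEF3C80" '
        'level="info"] L2 application starts',
        'nsx-syslog.log:2023-11-15T21:35:22.759Z cfgAgent[2449402]: NSX 2449402 - '
        '[nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="ABEF3C80" '
        'level="info"] L3 application starts',
    ]
    for pid, apps in apps_started_by_pid(sample).items():
        print(f"cfgAgent PID {pid}: started {apps}, missing {missing_apps(apps)}")
```

Grouping by process ID matters here because the excerpt shows several cfgAgent restarts, each with its own PID; a process that logs only the L2 and L3 entries matches the failure pattern described above.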
/var/core/
VMware NSX
If a sticky bit file on the Host Transport Node has been modified, the Upgrade Coordinator (UC) will not upgrade that file when NSX on the Host Transport Node is updated.
This issue is resolved in VMware NSX 4.2.0.
Workaround:
If you have encountered this issue, please contact Broadcom Support.
Versions where this is a known issue: NSX 4.x
Version where this is fixed: NSX 4.2.0 and later.
Impact/Risks:
The upgraded host is shown as down in the NSX UI.
The upgrade of the other Host Transport Nodes in the cluster fails because VMs cannot be moved back to the upgraded host from the non-upgraded hosts.
The host is unusable because the cfgagent process is not available.