The following informational messages repeatedly flood the nsx-syslog.log file on the affected ESXi host (log directory: /var/run/log):
2024-06-11T18:13:35.638Z In(182) nsx-exporter[2103496]: NSX 2103496 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2103589" level="INFO"] Caught std exception basic_string::substr: __pos (which is 8) > this->size() (which is 0)
2024-06-11T18:13:35.638Z In(182) nsx-exporter[2103496]: NSX 2103496 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2103589" level="INFO"] Cleaned up active table for filter: nic-25635xxx-eth3123-vmware-sfw.1
2024-06-11T18:13:35.639Z In(182) nsx-exporter[2103496]: NSX 2103496 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2103589" level="INFO"] Caught std exception basic_string::substr: __pos (which is 8) > this->size() (which is 0)
2024-06-11T18:13:35.639Z In(182) nsx-exporter[2103496]: NSX 2103496 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2103589" level="INFO"] Cleaned up active table for filter: nic-25635xxx-eth3131-vmware-sfw.1
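To gauge how often the messages occur and which filters are involved, a copy of the log can be summarized with a short script. The following is a minimal sketch, not a supported tool: the function name summarize and the matching patterns are illustrative assumptions derived only from the sample lines above, and the default path simply combines the log directory and file name given in this article.

```python
import re
import sys
from collections import Counter

# Patterns taken from the sample log lines in this article (assumed stable).
EXCEPTION_MARKER = "Caught std exception basic_string::substr"
FILTER_RE = re.compile(r"Cleaned up active table for filter: (\S+)")


def summarize(lines):
    """Return (exception_count, Counter of filter names) for agg-service lines."""
    exceptions = 0
    filters = Counter()
    for line in lines:
        # Only consider messages logged by the agg-service subcomponent.
        if 'subcomp="agg-service"' not in line:
            continue
        if EXCEPTION_MARKER in line:
            exceptions += 1
        match = FILTER_RE.search(line)
        if match:
            filters[match.group(1)] += 1
    return exceptions, filters


if __name__ == "__main__":
    # Default path is an assumption: log directory plus file name from the article.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/run/log/nsx-syslog.log"
    with open(path, errors="replace") as handle:
        count, filters = summarize(handle)
    print(f"substr exceptions: {count}")
    for name, hits in filters.most_common():
        print(f"{hits:6d}  {name}")
```

A steadily growing exception count paired with many distinct filter names is consistent with the symptom described here.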
No impact on data services is observed.
ESXi version 8.0.2
NSX version 4.0.1.1
This exception is triggered by the way IP Reputation stats handle container interfaces in the DFW exporter. IP Reputation stats currently depend on the VM UUID to create records and export them to the MP. For container interfaces the VM UUID is empty, which leads to this exception.
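The wording of the logged message matches what a C++ standard library substr call reports when the requested start position lies past the end of the string. The following Python model is only an illustration of that bounds check: the helpers substr and record_key are hypothetical, and the offset 8 is taken from the log message, not from the exporter's source code, which is not shown in this article.

```python
def substr(text: str, pos: int) -> str:
    """Mimic C++ std::string::substr(pos): fail when pos is past the end."""
    if pos > len(text):
        raise IndexError(
            f"basic_string::substr: __pos (which is {pos}) > "
            f"this->size() (which is {len(text)})"
        )
    return text[pos:]


def record_key(vm_uuid: str) -> str:
    """Hypothetical stand-in for deriving a record key from the VM UUID."""
    # Offset 8 is an assumption based on the "__pos (which is 8)" log text.
    return substr(vm_uuid, 8)


if __name__ == "__main__":
    for uuid in ("01234567-89ab", ""):  # VM interface vs. container interface
        try:
            print(repr(uuid), "->", record_key(uuid))
        except IndexError as exc:
            print(repr(uuid), "-> Caught std exception", exc)
```

With a populated UUID the call succeeds; with the empty UUID of a container interface it raises an error whose text has the same shape as the logged message.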
This issue is resolved in NSX version 4.1.0.
Disable the IP Reputation feature. Two default rules are configured for this feature; the suggestion is to disable the two malicious rules (rule IDs 5 and 6). Doing so stops generation of the malicious flows that can trigger this issue in dfw-exporter for the container use case: