remote_logging_server_error alarm keeps toggling On and Off even though logs are redirected to the remote logging server as expected
Article ID: 423370

Updated On:

Products

VMware NSX

Issue/Introduction

An alarm with the event type "remote_logging_server_error" is observed in /var/log/syslog.

<DATE_TIME> <HOSTNAME> NSX 3313 - [nsx@6876 comp="nsx-edge" subcomp="nsx-sha" username="nsx-sha" level="CRITICAL" eventFeatureName="audit_log_health" eventType="remote_logging_server_error" eventSev="critical" eventState="On" entId="<UUID>"] Log messages to logging server <IP_ADDRESS>@<PORT>@UDP (<UUID>) cannot be delivered possibly due to an unresolvable FQDN, an invalid TLS certificate or missing NSX appliance iptables rule.
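
To see how often the alarm flips, the key="value" fields in lines like the one above can be extracted and the state changes counted. The following is a minimal sketch, not an NSX tool: the function names parse_alarm and count_transitions are hypothetical, and it assumes only that every alarm line carries eventType and eventState fields as shown in the sample.

```python
import re

# Matches key="value" pairs in the structured-data part of a syslog line.
_PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_alarm(line):
    """Return (eventType, eventState) for an alarm line, or None if absent."""
    fields = dict(_PAIR.findall(line))
    if "eventType" not in fields or "eventState" not in fields:
        return None
    return fields["eventType"], fields["eventState"]


def count_transitions(lines, event_type="remote_logging_server_error"):
    """Count how many times the alarm's eventState differs from the previous one."""
    previous = None
    changes = 0
    for line in lines:
        parsed = parse_alarm(line)
        if parsed is None or parsed[0] != event_type:
            continue  # not an alarm line, or a different alarm
        state = parsed[1]
        if previous is not None and state != previous:
            changes += 1
        previous = state
    return changes
```

For example, passing the open file object for /var/log/syslog to count_transitions would report how many times the alarm changed state within that log file.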

 

/var/log/syslog also shows TimeoutExpired and CalledProcessError exceptions raised while running the iptables command.

<DATE_TIME> <HOSTNAME> NSX 3340 - [nsx@6876 comp="nsx-edge" subcomp="nsx-sha" username="nsx-sha" level="INFO" s2comp="fork-executor-0"] Exception caught when running cmd '{'cmd': ['sudo', 'iptables', '-w', '5', '-C', 'OUTPUT', '-d', '<IP_ADDRESS>', '-p', 'udp', '-m', 'udp', '--dport', '<PORT>', '-m', 'owner', '--uid-owner', 'syslog', '-j', 'ACCEPT'], 'input': None, 'shell': False, 'timeout': 4, 'check_return': True, 'env': None, 'type': 0, 'proc_tree': False, 'text': True, 'timestamp': 2302528.71475349, 'seq': 713734}': {'seq': 713734, 'type': 0, 'executor': 0, 'timestamp': 2302528.714914041, 'execute_time': 4.027204934041947, 'exception': TimeoutExpired(['sudo', 'iptables', '-w', '5', '-C', 'OUTPUT', '-d', '<IP_ADDRESS>', '-p', 'udp', '-m', 'udp', '--dport', '<PORT>', '-m', 'owner', '--uid-owner', 'syslog', '-j', 'ACCEPT'], 4)}

<DATE_TIME> <HOSTNAME> NSX 3340 - [nsx@6876 comp="nsx-edge" subcomp="nsx-sha" username="nsx-sha" level="INFO" s2comp="fork-executor-1"] Exception caught when running cmd '{'cmd': ['sudo', 'iptables', '-C', 'OUTPUT', '-w', '5', '-o', 'eth0', '-p', 'udp', '-m', 'udp', '--dport', '<PORT>', '-m', 'owner', '--uid-owner', 'syslog', '-j', 'ACCEPT'], 'input': None, 'shell': False, 'timeout': 4, 'check_return': True, 'env': None, 'type': 0, 'proc_tree': False, 'text': True, 'timestamp': 2302532.751746835, 'seq': 713737}': {'seq': 713737, 'type': 0, 'executor': 1, 'timestamp': 2302532.751884697, 'exception': CalledProcessError(1, ['sudo', 'iptables', '-C', 'OUTPUT', '-w', '5', '-o', 'eth0', '-p', 'udp', '-m', 'udp', '--dport', '<PORT>', '-m', 'owner', '--uid-owner', 'syslog', '-j', 'ACCEPT'])}
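
The exception names and the 'timeout' and 'check_return' fields in these entries match the behavior of Python's subprocess module. The article does not describe how the NSX agent is implemented, so the sketch below is only an illustration under that assumption. It uses harmless stand-in commands instead of iptables, and run_check is a hypothetical helper name. It shows how one wrapper can end in either exception: a command that outlives its timeout raises TimeoutExpired, and a command that exits non-zero raises CalledProcessError (iptables -C exits non-zero when the rule being checked is not found).

```python
import subprocess
import sys


def run_check(cmd, timeout):
    """Run cmd and report how it ended: 'ok', 'timeout', or 'exit <code>'."""
    try:
        # check=True turns a non-zero exit status into CalledProcessError.
        subprocess.run(cmd, timeout=timeout, check=True,
                       capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        # The child was still running when the timeout elapsed and was killed.
        return "timeout"
    except subprocess.CalledProcessError as exc:
        return f"exit {exc.returncode}"
    return "ok"


# Stand-ins for a command that blocks and for a check that fails.
slow = [sys.executable, "-c", "import time; time.sleep(2)"]
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]

print(run_check(slow, timeout=0.2))
print(run_check(failing, timeout=10))
```

In the first log entry the command passes -w 5, which lets iptables wait up to 5 seconds for the xtables lock, while the logged 'timeout' is 4 and 'execute_time' is just over 4 seconds. The command was therefore stopped before iptables itself would have given up waiting. Whether this mismatch is part of the race condition is not stated in the article.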

 

However, the syslog server is configured properly, and verifying the iptables rules for all logging servers succeeds.

edge> get logging-servers
<IP_ADDRESS>:<PORT> proto udp level info exporter_name <UUID>

edge> verify logging-servers
Logging servers verified

 

Environment

VMware NSX

Cause

The alarm can be triggered by a race condition that occurs when a TimeoutExpired exception is raised.

Resolution

As a workaround:

  • Because the alarm is a false positive, consider acknowledging or even disabling it to stop remote_logging_server_error from toggling On and Off.

A fix is planned for a future release.