Repeated alarms about "Remote Logging Server Error" on NSX UI

Article ID: 402651

Updated On:

Products

VMware NSX

Issue/Introduction

  • Alarms appear in the NSX UI with Event type "Remote Logging Server Error" and the description "Log messages to logging server <FQDN>@<Port Number>@TCP cannot be delivered possibly due to an unresolvable FQDN, an invalid TLS certificate or missing NSX appliance iptables rule."
  • NSX version is 4.2.1.2 or 4.2.1.3
  • The alarms are repeatedly triggered and then resolved in the NSX Manager UI.
  • The alarms may appear to have started after an NSX upgrade. NSX 3.x and 4.1.x are not affected by this problem, so the issue can first surface only after upgrading to 4.2.1.x or a later version.
  • The error can also be seen in /var/log/syslog:
    • /var/log/syslog:2025-06-18 ######## MONITORING [nsx@### alarmState="OPEN" comp="nsx-manager" errorCode="MP701099" eventFeatureName="audit_log_health" eventSev="CRITICAL" eventState="On" eventType="remote_logging_server_error" level="FATAL" nodeId="#####" subcomp="monitoring"] Log messages to logging server #########@######@UDP cannot be delivered possibly due to an unresolvable FQDN, an invalid TLS certificate or missing NSX appliance iptables rule.
  • The clearing of the alarm can also be seen in the logs:
    • 2025-06-18 ######## [nsx@#### comp="nsx-manager" subcomp="nsx-sha" level="CRITICAL" eventFeatureName="audit_log_health" eventType="remote_logging_server_error" eventSev="critical" eventState="Off" Configuration for logging server #########@#####@UDP appear correct.
  • The FQDN of the logging-server resolves correctly with the getent, nslookup, ping, or dig commands, and the port is confirmed to be reachable from the NSX nodes, yet the alarms are still generated.
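To confirm the raise-and-clear pattern described above, the alarm's On and Off events can be counted in the syslog. The following is a minimal sketch, not an official tool: the function name count_alarm_flaps is hypothetical, and it assumes each log entry carries the eventType and eventState fields in the form shown in the sample lines above.

```shell
#!/bin/sh
# Count how often the remote logging alarm was raised (On) and cleared (Off).
# Reads syslog lines on stdin and prints "on=<n> off=<m>".
# count_alarm_flaps is a hypothetical helper name, not an NSX command.
count_alarm_flaps() {
    awk '
        /eventType="remote_logging_server_error"/ {
            if ($0 ~ /eventState="On"/)  on++
            if ($0 ~ /eventState="Off"/) off++
        }
        END { printf "on=%d off=%d\n", on, off }
    '
}

# Example, run as root on an NSX Manager node:
#   count_alarm_flaps < /var/log/syslog
```

Comparable, repeatedly growing On and Off counts would be consistent with the flapping behavior described in this article.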

Environment

VMware NSX 4.2.x

Cause

This is a known issue impacting VMware NSX. When the IP address behind the logging server's FQDN changes, the NSX nodes incorrectly keep using the previous IP address in their iptables rules.
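One way to check for this condition is to compare the address the FQDN currently resolves to with the destination address in the node's OUTPUT rules. The sketch below is illustrative only: the helper names rule_destinations and find_stale, the hostname logserver.example.com, and port 514 are all assumptions. It relies on iptables -n, which prints addresses and ports numerically instead of resolving them to names.

```shell
#!/bin/sh
# Print the destination address of every OUTPUT rule matching a destination port.
# Reads the output of `iptables -n -L OUTPUT` on stdin (destination is column 5).
# rule_destinations and find_stale are hypothetical helper names.
rule_destinations() {
    port="$1"
    awk -v p="dpt:$port" '
        { for (i = 1; i <= NF; i++) if ($i == p) { print $5; break } }
    '
}

# Print "stale <ip>" for each address on stdin that differs from the expected IP.
find_stale() {
    expected="$1"
    while read -r ip; do
        [ "$ip" = "$expected" ] || echo "stale $ip"
    done
}

# Example, run as root on an NSX Manager node (hostname and port are placeholders):
#   current=$(getent hosts logserver.example.com | awk '{print $1; exit}')
#   iptables -n -L OUTPUT | rule_destinations 514 | find_stale "$current"
```

Any "stale" line would indicate a rule that still points at an address other than the one the FQDN resolves to now.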

Resolution

Workaround

  1. Manually clean up the logging server:
    If the CLI was originally used to configure the logging-server, it is recommended to also use the CLI to remove it (as the admin user on the NSX Manager):

    del logging-server <hostname-or-ip-address[:port]>

    Verify that the logging server was removed with the command below:

    nsx> get logging-servers

    If a Node Profile (a.k.a. Central Node Config) was used to add the logging-server, remove it from the Node Profile in the NSX UI.
    Go to System > Fabric > Profiles > Node Profiles and remove the logging-server from the list. This needs to be done only once.

  2. SSH to the Manager node as the root user. Go to the /config/vmware/nsx-node-api/syslog/ directory and check whether a file named exporter_firewall_rules exists. If it does, delete it with the command:

    rm -rf /config/vmware/nsx-node-api/syslog/exporter_firewall_rules

    Repeat this step on all three Manager nodes.

  3. Confirm the port number that NSX Manager uses to connect to the logging-server. The port is also shown in the alarm text. For example, if the alarm reads "eventType="remote_logging_server_error" eventSev="critical" eventState="On" Log messages to logging server <FQDN>@99999@TCP cannot be delivered possibly due to an unresolvable FQDN, an invalid TLS certificate or missing NSX appliance iptables rule", then the port number is 99999. [Here 99999 is an arbitrary port number used as an example.]

  4. Perform the steps below on all three Manager nodes. Use the port number found in step 3 to manually remove any iptables rules for the logging server.

    Get the line number of the iptables rule:

    iptables -L OUTPUT --line-numbers | grep 99999 

    Here is an example output:
    1    ACCEPT     udp  --  anywhere             <fqdn>  udp dpt:99999 owner UID match syslog

    This means the line number of this iptables rule is 1.

    Delete it with the command below:

    iptables -D OUTPUT 1

    If there are multiple entries for port 99999, re-check the line numbers with "iptables -L OUTPUT --line-numbers | grep 99999" and delete the next entry.

    Caution: The line numbers can shift every time a rule is deleted, so deleting several rules in one go may remove the wrong rule. Always re-run "iptables -L OUTPUT --line-numbers | grep 99999" before each deletion.

  5. Add the logging-server again using the "set logging-server" command, following this official guide. Specify the logging server by IP address instead of by FQDN, because specifying an FQDN may retrigger the issue.
    A Node Profile in the NSX UI can also be used to add the logging-server.
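The delete-and-recheck loop from step 4 can be sketched as a small script. This is only an illustration under stated assumptions, not a supported procedure: the function names first_rule_line and delete_port_rules are hypothetical, and the script assumes every OUTPUT rule with the given destination port belongs to the logging server. It uses iptables -n so that ports are printed as numbers, and it re-reads the line numbers after every deletion, as the caution in step 4 requires.

```shell
#!/bin/sh
# Print the line number of the first OUTPUT rule matching a destination port.
# Reads the output of `iptables -n -L OUTPUT --line-numbers` on stdin.
# first_rule_line and delete_port_rules are hypothetical helper names.
first_rule_line() {
    port="$1"
    awk -v p="dpt:$port" '
        { for (i = 2; i <= NF; i++) if ($i == p) { print $1; exit } }
    '
}

# Delete matching rules one at a time, listing the chain again before each
# deletion because line numbers shift after every removed rule.
delete_port_rules() {
    port="$1"
    while :; do
        line=$(iptables -n -L OUTPUT --line-numbers | first_rule_line "$port")
        [ -n "$line" ] || break                     # no matching rule left
        iptables -D OUTPUT "$line" || return 1      # stop if a delete fails
    done
}

# Example, run as root on each Manager node with the port from step 3:
#   delete_port_rules <port>
```

Reviewing the output of "iptables -n -L OUTPUT --line-numbers" before running such a loop helps confirm that only the logging server's rules use that port.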