VIP/ILB showing as disconnected from UI on all nodes except primary in Aria Operations for Logs

Article ID: 405746

Products

VCF Operations

Issue/Introduction

  • The VIP/ILB shows as Unavailable in the UI on all nodes except the primary.
  • The System Monitor page does not load from any node except the primary.
  • In /storage/core/loginsight/var/runtime.log on the primary node, entries like the following appear:
    [####-##-## 08:10:16.648+0000] ["#######################-######-#"/###.###.###.### WARN] [com.vmware.loginsight.analytics.distributed.SearchResponseAggregator] [Worker failed to return result for query, worker: StrataNodeInfo(host:###.###.###.###, port:16520, token:########-####-####-####-## , status:CONNECTED), query: ]
    [####-##-## 08:10:16.648+0000] ["#######################-######-#"/###.###.###.### ERROR] [com.vmware.loginsight.analytics.distributed.AbstractSearchResponseAggregator] [Partial results are being returned as 1 out of 3 nodes failed to respond. Request token . Please contact an admin user for more information.]
    [####-##-## 08:10:16.650+0000] ["#######################-######-#"/###.###.###.### INFO] [com.vmware.loginsight.analytics.ErrorUtils] [mark error code=5 (Partial results are being returned as 1 out of 3 nodes failed to respond. Request token . Please contact an admin user for more information.)]

  • In /storage/core/loginsight/var/runtime.log on a worker node, entries like the following appear:
    [####-##-## 07:04:10.609+0000] ["#################################-######-#"/###.###.###.### WARN] [com.vmware.loginsight.loadbalancer.LoadBalancerEmbeddedService] [Error trying to read cluster node states [510 suppressed]]
    com.vmware.loginsight.cluster.ClusterStateReaderException: Not yet initialized or master unavailable
            at com.vmware.loginsight.cluster.DaemonClusterNodesMaintenanceStatusReader.getClusterNodesMaintenanceStatus(DaemonClusterNodesMaintenanceStatusReader.java:39) ~[#######-###.jar:?]
            at com.vmware.loginsight.loadbalancer.LoadBalancerEmbeddedService.updateLoadBalancerState(LoadBalancerEmbeddedService.java:579) [####-########-#######.jar:?]
            at com.vmware.loginsight.loadbalancer.LoadBalancerEmbeddedService$3.run(LoadBalancerEmbeddedService.java:318) [####-########-#######.jar:?]
            at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
            at java.util.concurrent.FutureTask.runAndReset(Unknown Source) [?:?]
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:?]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
            at java.lang.Thread.run(Unknown Source) [?:?]
  • The /storage/core/loginsight/config/loginsight-config.xml file on every node lists the primary node by FQDN, while the worker nodes are listed by IP address:
     <distributed #########-########="true">
        <daemon host="primary_node_fqdn" port="16520" token="########-####-####-####-############">
          <#######-##### name="standalone" />
        </daemon>
        <daemon host="###.###.###.###" port="16520" token="########-####-####-####-############">
          <#######-##### name="workernode" />
        </daemon>
        <daemon host="###.###.###.###" port="16520" token="########-####-####-####-############">
          <#######-##### name="workernode" />
        </daemon>
     </distributed>
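
The check described in the last bullet can be sketched as a small shell helper. This is only an illustration: the function name list_non_ip_daemon_hosts is hypothetical, and it assumes the daemon entries look like the excerpt above (a host attribute directly after the daemon tag name).

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): print every <daemon> host
# value in a config file that is NOT a dotted IPv4 address, i.e. the entries
# that depend on DNS resolution.
# Example: list_non_ip_daemon_hosts "/storage/core/loginsight/config/loginsight-config.xml"
list_non_ip_daemon_hosts() {
    # Extract the host attribute of each <daemon> element (either quote style),
    # strip the surrounding markup, then drop values that look like IPv4.
    grep -o "<daemon host=[\"'][^\"']*[\"']" "$1" \
        | sed -e "s/^<daemon host=[\"']//" -e "s/[\"']\$//" \
        | grep -Ev '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}
```

If the function prints a hostname for the primary node, the configuration matches the symptom described here. It prints nothing (and returns non-zero) when every daemon host is already an IP address.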

Environment

Aria Operations for Logs 8.x

Cause

The worker nodes are unable to make RPC calls to the primary node due to DNS resolution issues.
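
To confirm the cause from a worker node, a name lookup of the primary node's FQDN can be tried. The following is a minimal sketch, assuming the getent utility is available on the appliance (a tool such as nslookup could be substituted); the function name check_resolves is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a hostname resolves through the
# system resolver. Assumes getent is installed on the node.
# Example: check_resolves primary_node_fqdn
check_resolves() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "OK: $1 resolves"
    else
        echo "FAIL: $1 does not resolve" >&2
        return 1
    fi
}
```

A failure on a worker node for the primary's FQDN is consistent with the cause described above.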

Resolution

To mitigate the DNS resolution issues, you can update the /storage/core/loginsight/config/loginsight-config.xml file to use the IP address of the Primary node instead of its FQDN. Please follow the steps below carefully:

1. SSH into the Primary Node
Use the root account to log in and navigate to the configuration directory:

cd /storage/core/loginsight/config

2. Identify the Config File with the Highest Version Number

Run the following commands. List the directory, note the file with the highest <xx> value, then view that file:

ls -l
cat loginsight-config.xml#<xx>
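
Picking the highest version by eye is error-prone when the numbers have different digit counts. As a rough sketch, the selection could be scripted as below; the function name latest_config is hypothetical, and it assumes the versioned files are named exactly loginsight-config.xml#<xx> with a numeric <xx>.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the config file with the highest
# numeric suffix after '#'. Files such as *.backup are ignored.
# Example: latest_config /storage/core/loginsight/config
latest_config() {
    dir=${1:-/storage/core/loginsight/config}
    # Keep only names of the form loginsight-config.xml#<digits>,
    # sort numerically on the part after '#', and take the last one.
    ls "$dir" \
        | grep -E '^loginsight-config\.xml#[0-9]+$' \
        | sort -t '#' -k 2 -n \
        | tail -n 1
}
```

The numeric sort matters here: a plain alphabetical listing would place #9 after #10.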

3. Create a Backup of the Config File
It’s important to back up the original before making any changes:

cp loginsight-config.xml#<xx> loginsight-config.xml#<xx>.backup

4. Edit the Config File
Open the config file using vi and update the daemon host from FQDN to the IP address of the Primary node:

vi loginsight-config.xml#<xx>

Replace:
daemon host='FQDN'
With:
daemon host='IP_ADDRESS'
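
Before editing by hand, the intended change can be previewed with sed. This is an illustrative sketch only: the function name preview_daemon_host_change is hypothetical, it writes the result to standard output and does not modify the file, and it assumes the host attribute directly follows the daemon tag name as in the excerpt in the Issue section.

```shell
#!/bin/sh
# Hypothetical helper: print the config with one daemon host value replaced.
# The file itself is left untouched so the output can be reviewed first.
# Example: preview_daemon_host_change "loginsight-config.xml#<xx>" FQDN IP_ADDRESS
preview_daemon_host_change() {
    file=$1
    fqdn=$2
    ip=$3
    # Escape dots so the FQDN is matched literally rather than as a regex.
    pattern=$(printf '%s' "$fqdn" | sed 's/\./\\./g')
    # Keep the original quote characters; swap only the host value.
    sed "s/\(<daemon host=[\"']\)${pattern}\([\"']\)/\1${ip}\2/" "$file"
}
```

Only the daemon entry whose host equals the given FQDN changes; entries that already use IP addresses pass through unmodified.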

5. Save and Exit the File
In vi, press Esc and type:

:wq!

6. Restart the Log Insight Service
Apply the changes by restarting the service:

service loginsight restart

Note:
Ensure that the updated loginsight-config.xml file is correctly replicated on all worker nodes. If the changes are not reflected automatically, please repeat the above steps on each worker node.
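
One way to check replication on a worker node is to look for the primary's IP address among the daemon hosts in that node's config file. The sketch below is an assumption-laden illustration: the function name config_has_daemon_host is hypothetical, and it relies on the same attribute layout as the excerpt in the Issue section.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the config file contains a <daemon> entry
# whose host attribute is exactly the given value (either quote style).
# Example: config_has_daemon_host "loginsight-config.xml#<xx>" IP_ADDRESS
config_has_daemon_host() {
    # -F treats the patterns as fixed strings, so dots in the IP are literal.
    grep -qF -e "<daemon host=\"$2\"" -e "<daemon host='$2'" "$1"
}
```

A non-zero exit status on a worker node suggests the change has not reached that node yet.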

Additional Information

If the operation is successful, delete the .backup file created in step 3.