Error: "Error authenticating user" intermittent login failures in Aria Operations for Logs

Article ID: 432125

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

You encounter random authentication failures when logging in to Aria Operations for Logs. Both local and Active Directory (AD) users are affected. The primary error shown in the UI is:

Error authenticating user

Analysis of runtime.log and packet captures shows SSL handshake timeouts and Cassandra quorum failures:

  • io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms
  • com.datastax.oss.driver.api.core.servererrors.ReadTimeoutException: Cassandra timeout during read query at consistency QUORUM (2 responses were required but only 1 replica responded)

These symptoms often correlate with the Virtual IP (VIP) moving between different nodes in a cluster.
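
To gauge how often the two symptoms occur, the log can be scanned for both exception names. The following is a minimal sketch, not a vendor tool: the function name count_symptoms is hypothetical, and the log path in the example call is an assumption that should be verified on the appliance.

```shell
# Hypothetical helper: count log lines that mention each symptom.
# $1 = path to runtime.log (or a copy of it)
count_symptoms() {
  log_file="$1"
  # grep -c prints the number of matching lines (0 when none match).
  ssl_count=$(grep -c 'SslHandshakeTimeoutException' "$log_file")
  cql_count=$(grep -c 'ReadTimeoutException' "$log_file")
  printf 'ssl_handshake_timeouts=%s\n' "$ssl_count"
  printf 'cassandra_read_timeouts=%s\n' "$cql_count"
}

# Example call (the path is an assumption, adjust as needed):
# count_symptoms /storage/core/loginsight/var/runtime.log
```

Running it on each node and comparing the counts with the times of VIP moves may help confirm the correlation described above.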

Environment

  • Aria Operations for Logs 8.18.3
  • Cisco UCS Hardware / Cisco ACI

Cause

Excessive trace-level garbage collection (GC) logging, configured in the Cassandra Java Virtual Machine (JVM) options, causes substantial I/O and processing overhead. Specifically, the promotion*=trace selector floods gc.log with thousands of lines per second, which delays processing of internode ClientHello and ServerHello packets. This latency prevents the Cassandra database from reaching quorum during authentication requests.

Additionally, a truststore (cacerts) mismatch between nodes or incorrect Cisco ACI endpoint move detection settings can exacerbate these communication failures.

Resolution

Follow these steps to reduce Cassandra logging verbosity and ensure internode communication consistency:

1. Revert Cassandra Logging Level

Perform these steps on all nodes in the cluster to restore normal behavior:

  1. Log in to the Aria Operations for Logs node as root via SSH.

  2. Navigate to the Cassandra configuration directory:

    cd /storage/core/loginsight/cidata/cassandra/config/

  3. Edit the cassandra-env.sh file.

  4. Locate the JVM_OPTS section containing the -Xlog:gc configuration.

  5. Change the verbosity selectors from trace or debug to info (specifically removing promotion*=trace and heap*=trace).

  6. Restart the Log Insight services to apply the changes.
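
The selector change in step 5 can be previewed before the file is edited by hand. The following is a minimal sketch under stated assumptions: the function name relax_gc_logging is hypothetical, and the sample line is modeled on a typical Cassandra GC logging option, so the actual line in cassandra-env.sh may differ.

```shell
# Hypothetical helper: on lines that contain -Xlog:gc, rewrite every
# "=trace" or "=debug" selector level to "=info".
# All other lines pass through unchanged.
relax_gc_logging() {
  sed -E '/-Xlog:gc/ s/=(trace|debug)/=info/g'
}

# Sample input (an assumption, not copied from the appliance):
sample='JVM_OPTS="$JVM_OPTS -Xlog:gc=info,heap*=trace,age*=debug,safepoint=info,promotion*=trace:file=/var/log/cassandra/gc.log"'
printf '%s\n' "$sample" | relax_gc_logging
```

To review the effect on the real file, back it up first and write the filtered output to a separate file, for example relax_gc_logging < cassandra-env.sh > cassandra-env.sh.new, then compare the two with diff before replacing the original.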


2. Synchronize Truststores

If handshake errors persist, verify truststore consistency:

  1. Compare the truststore (cacerts) file located under /usr/lib/loginsight/application/etc across all nodes, for example by comparing checksums.

  2. If a mismatch is found, copy the cacerts file from a known healthy node to the impacted nodes.
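
One way to compare the truststores is to collect a checksum from each node and count the distinct values. The following is a minimal sketch: the function name count_distinct_checksums, the node names, and the TRUSTSTORE variable are all hypothetical placeholders, and the exact truststore path must be taken from the environment.

```shell
# Hypothetical helper: read sha256sum output (one line per node) on stdin
# and print how many distinct checksums were seen.
# A result of 1 means every node has an identical file.
count_distinct_checksums() {
  awk '!seen[$1]++ { n++ } END { print n + 0 }'
}

# Example (node names and TRUSTSTORE are placeholders):
# for node in node1 node2 node3; do
#   ssh "root@$node" sha256sum "$TRUSTSTORE"
# done | count_distinct_checksums
```

A result greater than 1 indicates that at least one node holds a different truststore and is a candidate for the copy in step 2.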


3. Validate Cisco ACI Configuration (If Applicable)

If operating in a Cisco ACI environment, ensure the following:

  1. Add the Aria Operations for Logs node MAC addresses to the Bridge Domain Rogue/Coop Exception List.

  2. Enable GARP-based detection for the Endpoint (EP) Move Detection Mode.

  3. Verify that ARP Flooding is enabled on the bridge domain.