VCF Operations for Logs UI is not available after upgrade to 9.0.2

Article ID: 428924

Updated On:

Products

VCF Operations

Issue/Introduction

After you upgrade to VCF Operations for Logs 9.0.2, the User Interface (UI) is inaccessible. The Cassandra service fails to start on one or more nodes, which leaves the cluster in a degraded state.

  • "Page Not Found" or "Service Unavailable" errors occur when you access the UI.
  • systemctl status reports a degraded state:
    State: degraded
  • nodetool-no-pass status shows Cassandra on one or more nodes is down (DN):
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address      Load       Tokens  Owns (effective)  Host ID  Rack
    UN  ##.##.##.##  18.32 MiB  256     100.0%            [UUID]   rack1
    DN  ##.##.##.##  ?          256     100.0%            [UUID]   rack1
    DN  ##.##.##.##  ?          256     100.0%            [UUID]   rack1
  • Inventory Sync for VCF Operations for Logs through Fleet Manager fails with the following error:

    Error Code: LCMVRLICONFIG40100

    Operations-logs host is unreachable. Either the host name is incorrect or the virtual machine is not reachable.
    Unable to connect to host. Check host details and retry.

  • Exceptions similar to the following appear in /storage/var/loginsight/cassandra.log:

     ERROR [Messaging-EventLoop-#-#] ####-##-##T##:##,OutboundConnectionInitiator.java:### - Failed to handshake with peer /<VCFOperationsForLogs_WorkerIp>:7000(/<VCFOperationsForLogs_WorkerIp>:7000)
    at io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown

    or 

    ERROR [Messaging-EventLoop-3-3] ####-##-##T##:##:##, InboundConnectionInitiator.java:### - Failed to properly handshake with peer /##.###.##.##:39412. Closing the channel.
    io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
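
The symptoms above can be checked from a shell on a node. The following is a minimal sketch, assuming the `nodetool-no-pass status` listing and the Cassandra log look like the samples shown above. The function names `down_nodes` and `count_handshake_errors` are hypothetical helpers for illustration and are not part of the product.

```shell
#!/usr/bin/env bash
# Hypothetical triage helpers; names and approach are illustrative only.

# Print the address of every node reported as DN (Down/Normal).
# Reads a `nodetool-no-pass status` listing from stdin and assumes the
# status code is the first column and the address is the second.
down_nodes() {
    awk '$1 == "DN" { print $2 }'
}

# Print the number of log lines that mention a TLS handshake failure.
# $1 is the path to a Cassandra log file.
count_handshake_errors() {
    grep -c 'SSLHandshakeException' "$1"
}

# Example usage on a node (not executed here):
#   nodetool-no-pass status | down_nodes
#   count_handshake_errors /storage/var/loginsight/cassandra.log
```

A non-empty list of down nodes together with a non-zero handshake error count is consistent with the issue described in this article, but it does not prove the cause on its own.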

Environment

VCF Operations for Logs 9.0.2

Cause

This issue occurs because the keystore and truststore on the primary node do not match those on the worker nodes, which prevents secure communication between the Cassandra instances.

 

Resolution

To resolve this issue, you must synchronize the certificates across the cluster nodes:

  1. Log in to the primary node via SSH as root.
  2. Run the following command on both the primary and worker nodes to read the keystore password into the shell variable pw:

    • pw=$(grep 'syslog-ssl-keystore-password' $(ls -1 /storage/core/loginsight/config/loginsight-config* | tail -n 1) | cut -d\" -f2)

  3. Run the following commands on each node to list the keystore and truststore contents, then compare the output between nodes to confirm the mismatch:

    • keytool -list -storetype bcfks -providerpath /usr/lib/loginsight/application/lib/lib/bc-fips-*.jar -provider org.bouncycastle.jcajce.provider.BouncyCastleFipsProvider -storepass "$pw" -keystore /usr/lib/loginsight/application/etc/3rd_config/keystore.bcfks
    • keytool -list -storetype bcfks -providerpath /usr/lib/loginsight/application/lib/lib/bc-fips-*.jar -provider org.bouncycastle.jcajce.provider.BouncyCastleFipsProvider -storepass "$pw" -keystore /usr/lib/loginsight/application/etc/truststore.bcfks

  4. Copy the following certificate files from the primary node to each worker node, replacing the existing files:

    • /usr/lib/loginsight/application/etc/3rd_config/keystore.bcfks
    • /usr/lib/loginsight/application/etc/truststore.bcfks
    • /storage/core/loginsight/cidata/cassandra/config/cacert.pem

  5. Restart the Log Insight service on all nodes:

    • systemctl restart loginsight

  6. Verify the UI is accessible and the cluster status shows as "Connected" or "Healthy."
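
The comparison in step 3 can be done by eye, but comparing fingerprints is less error-prone. The following is a minimal sketch, assuming the output of each `keytool -list` command has been saved to a text file per node and that keytool prints lines of the form `Certificate fingerprint (SHA-256): AA:BB:...`. The function names `fingerprints` and `stores_match` and the file names in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing two saved `keytool -list` listings.

# Print the sorted certificate fingerprints found in a saved listing ($1).
# Assumes each entry has a line like "Certificate fingerprint (...): AA:BB:...".
fingerprints() {
    grep 'Certificate fingerprint' "$1" | sed 's/^.*): *//' | sort
}

# Return 0 if both listings contain the same fingerprints, 1 otherwise.
# $1 and $2 are paths to saved listings from two different nodes.
stores_match() {
    [ "$(fingerprints "$1")" = "$(fingerprints "$2")" ]
}

# Example usage (file names are placeholders, not executed here):
#   stores_match keystore-primary.txt keystore-worker.txt \
#       && echo "keystores match" || echo "keystores differ"
```

Only the fingerprint lines are compared, so differences in entry dates or ordering between nodes do not cause a false mismatch. The same check can be repeated after step 5 to confirm that the worker nodes now carry the same certificates as the primary.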