QuorumPeerMain javaservice not running in platform node for VCF Operations for Networks

Article ID: 439406

Updated On:

Products

VCF Operations for Networks

Issue/Introduction

The QuorumPeerMain javaservice fails to start on Platform Node 3 in a multi-node cluster.

When you run the command ./run_all.sh df -hsudo /home/ubuntu/check-service-health.sh -p -d, the output shows that all other services are running and healthy, as in the example below:

--platform3--
ElasticSearch is running and healthy.
ElasticSearch statistics:
Uptime:5-21:26:25
HRegionServer is running and healthy.
Uptime:19:15
Kafka is running and healthy.
Kafka statistics:
Uptime:5-21:25:58
NodeManager is running and healthy.
Uptime:5-21:26:19
SaasListener is running
Uptime:22:30:40
Restapilayer is running and healthy.
Restapilayer statistics:
Uptime:22:30:50
TSDB is running
TSDB statistics:
Uptime:22:30:51
DataNode is running and healthy.
Uptime:5-21:26:22
Launcher is running
Uptime:22:30:34
VIPService is running and healthy.
VIPService statistics:
Uptime:5-21:24:09
DatabusGateway is running and healthy.
DatabusGateway statistics:
Uptime:22:28:31
FlinkContainer is running and healthy.
FlinkContainer statistics:
Uptime:04:47
04:46
HMaster is running and healthy.
Is Master:False
Uptime:18:54
Problem: QuorumPeerMain javaservice is not running.
JournalNode is running and healthy.
Uptime:5-21:26:49
Nginx is running and healthy.
Nginx statistics:
Uptime:5-21:27:11
ExpressJSApp is running
Uptime:5-21:27:11
NTPSEC is running and healthy.
Uptime:5-21:27:10
FoundationDB is running and healthy.
FoundationDB statistics:
Uptime:22:28:58
22:28:58
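
In a long health report the single failing line is easy to miss. The following is a minimal sketch for pulling out the "Problem:" lines per node. It assumes the report has been saved as text and follows the layout shown in the example above (a "--nodename--" header followed by status lines). The function name find_problem_services is hypothetical and is not part of the product.

```python
import re


def find_problem_services(report: str) -> dict[str, list[str]]:
    """Map each node header (e.g. 'platform3') to its 'Problem:' messages."""
    problems: dict[str, list[str]] = {}
    node = "unknown"  # used if a Problem line appears before any node header
    for raw in report.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"--(.+?)--", line)
        if header:
            node = header.group(1)
        elif line.startswith("Problem:"):
            message = line[len("Problem:"):].strip()
            problems.setdefault(node, []).append(message)
    return problems


# Sample text copied from the example output above (abbreviated).
sample = """--platform3--
HMaster is running and healthy.
Is Master:False
Uptime:18:54
Problem: QuorumPeerMain javaservice is not running.
JournalNode is running and healthy.
"""

print(find_problem_services(sample))
```

Nodes with no "Problem:" line do not appear in the result, so an empty dictionary means the report contained no flagged services.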

Reviewing zookeeper-platform3.log reveals the error "Unable to load database on disk". The log also shows an epoch mismatch: the node's current epoch is reported as older than its last zxid, as seen in the example below:

2026-04-30 11:44:00,253 [myid:3] - ERROR [main:QuorumPeer@940] - Unable to load database on disk
java.io.IOException: The current epoch, 3dd, is older than the last zxid, 4252017623322
at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:922)
at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:890)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
zookeeper-zookeeper-server-platform3.log
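
The two numbers in the exception can be compared directly. A ZooKeeper zxid is a 64-bit value whose upper 32 bits hold the epoch and whose lower 32 bits hold a counter. The sketch below assumes the current epoch in the message is printed in hexadecimal and the last zxid in decimal, which matches the values shown here but is worth confirming against your own log. The helper name split_zxid is illustrative only.

```python
def split_zxid(zxid: int) -> tuple[int, int]:
    """Split a 64-bit ZooKeeper zxid into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF


# Values taken from the log excerpt above.
current_epoch = int("3dd", 16)   # assumed hexadecimal
last_zxid = 4252017623322        # assumed decimal

zxid_epoch, counter = split_zxid(last_zxid)

print(current_epoch)               # 989
print(zxid_epoch, counter)         # 990 282
print(current_epoch < zxid_epoch)  # True
```

Under these assumptions the stored current epoch (989) is one behind the epoch embedded in the last zxid (990), which is the condition the exception reports.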

Environment

VCF Operations for Networks 6.14.0

Cause

ZooKeeper metadata and snapshot files in the /var/lib/zookeeper/version-2 directory on the affected node are corrupted. This is caused by either an improper manual cluster reboot performed in reverse order or an abrupt, ungraceful shutdown of the VM.
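
For illustration only, the state of such a directory can be summarized without modifying it. The sketch below is read-only and relies on standard ZooKeeper conventions rather than anything product-specific: the currentEpoch and acceptedEpoch files hold a decimal number, and snapshot.* and log.* file names end in a hexadecimal zxid. The function name summarize_version2 is hypothetical, and the demo runs against a temporary directory with made-up files, not a real node. It is not a repair procedure.

```python
import tempfile
from pathlib import Path


def summarize_version2(data_dir: str) -> dict:
    """Read-only summary of a ZooKeeper version-2 style directory."""
    root = Path(data_dir)
    summary = {}
    for name in ("currentEpoch", "acceptedEpoch"):
        f = root / name
        summary[name] = int(f.read_text().strip()) if f.is_file() else None
    zxids = []
    for p in root.iterdir():
        prefix, _, suffix = p.name.partition(".")
        if prefix in ("snapshot", "log"):
            try:
                zxids.append(int(suffix, 16))  # file name suffix is a hex zxid
            except ValueError:
                pass  # ignore names that do not end in a hex zxid
    # The epoch is the upper 32 bits of the zxid.
    summary["highest_file_epoch"] = (max(zxids) >> 32) if zxids else None
    return summary


# Demo on a throwaway directory with made-up file names and values.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "currentEpoch").write_text("989\n")
    Path(tmp, "acceptedEpoch").write_text("989\n")
    Path(tmp, "snapshot.3de00000100").write_bytes(b"")
    Path(tmp, "log.3de00000101").write_bytes(b"")
    demo = summarize_version2(tmp)

print(demo)
```

A log file name carries the first zxid written to that file, so highest_file_epoch is only a lower bound on the epoch of the last zxid. A currentEpoch value below it is consistent with the mismatch described in this article.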

Resolution

This is a known issue that requires intervention under the guidance of Broadcom Support.

If this issue is encountered, open a support case with Broadcom Support and refer to this KB article. For more information, see Creating and managing Broadcom support cases.