We have API Portal 4.4 deployed in Kubernetes (k8s) with Helm. After some days, the zookeeper-0 container fails (CrashLoopBackOff) and cannot start because of a data error. When ZooKeeper does not start, the middlemanager-0, kafka-0, coordinator-0 and broker-0 containers also fail. The zookeeper-0 log shows:
Memory limits: min=256m, max=512m
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
2020-03-23 12:17:40,829 [myid:] - INFO [main:[email protected]] - Reading configuration from: /opt/zookeeper/bin/../conf/zoo.cfg
2020-03-23 12:17:40,840 [myid:] - INFO [main:[email protected]] - Resolved hostname: zookeeper-0.zookeeper.dev-portal.svc.cluster.local to address: zookeeper-0.zookeeper.dev-portal.svc.cluster.local/10.60.2.12
2020-03-23 12:17:40,841 [myid:] - ERROR [main:[email protected]] - Invalid configuration, only one server specified (ignoring)
2020-03-23 12:17:40,842 [myid:] - INFO [main:[email protected]] - autopurge.snapRetainCount set to 3
2020-03-23 12:17:40,842 [myid:] - INFO [main:[email protected]] - autopurge.purgeInterval set to 0
2020-03-23 12:17:40,842 [myid:] - INFO [main:[email protected]] - Purge task is not scheduled.
2020-03-23 12:17:40,842 [myid:] - WARN [main:[email protected]] - Either no config or no quorum defined in config, running in standalone mode
2020-03-23 12:17:40,853 [myid:] - INFO [main:[email protected]] - Reading configuration from: /opt/zookeeper/bin/../conf/zoo.cfg
2020-03-23 12:17:40,854 [myid:] - INFO [main:[email protected]] - Resolved hostname: zookeeper-0.zookeeper.dev-portal.svc.cluster.local to address: zookeeper-0.zookeeper.dev-portal.svc.cluster.local/10.60.2.12
2020-03-23 12:17:40,854 [myid:] - ERROR [main:[email protected]] - Invalid configuration, only one server specified (ignoring)
2020-03-23 12:17:40,854 [myid:] - INFO [main:[email protected]] - Starting server
2020-03-23 12:17:40,859 [myid:] - INFO [main:[email protected]] - Server environment:zookeeper.version=3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf, built on 03/06/2019 16:18 GMT
2020-03-23 12:17:40,859 [myid:] - INFO [main:[email protected]] - Server environment:host.name=zookeeper-0.zookeeper.dev-portal.svc.cluster.local
2020-03-23 12:17:40,859 [myid:] - INFO [main:[email protected]] - Server environment:java.version=1.8.0_212
2020-03-23 12:17:40,859 [myid:] - INFO [main:[email protected]] - Server environment:java.vendor=IcedTea
2020-03-23 12:17:40,859 [myid:] - INFO [main:[email protected]] - Server environment:java.home=/usr/lib/jvm/java-1.8-openjdk/jre
2020-03-23 12:17:40,860 [myid:] - INFO [main:[email protected]] - Server environment:java.class.path=/opt/zookeeper/bin/../zookeeper-server/target/classes:/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../zookeeper-server/target/lib/*.jar:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bin/../lib/slf4j-log4j12-1.7.25.jar:/opt/zookeeper/bin/../lib/slf4j-api-1.7.25.jar:/opt/zookeeper/bin/../lib/netty-3.10.6.Final.jar:/opt/zookeeper/bin/../lib/log4j-1.2.17.jar:/opt/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/bin/../lib/audience-annotations-0.5.0.jar:/opt/zookeeper/bin/../zookeeper-3.4.14.jar:/opt/zookeeper/bin/../zookeeper-server/src/main/resources/lib/*.jar:/opt/zookeeper/bin/../conf:
2020-03-23 12:17:40,860 [myid:] - INFO [main:[email protected]] - Server environment:java.library.path=/usr/lib/jvm/java-1.8-openjdk/jre/lib/amd64/server:/usr/lib/jvm/java-1.8-openjdk/jre/lib/amd64:/usr/lib/jvm/java-1.8-openjdk/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:java.io.tmpdir=/tmp
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:java.compiler=<NA>
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:os.name=Linux
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:os.arch=amd64
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:os.version=4.14.165-133.209.amzn2.x86_64
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:user.name=1010
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:user.home=/home/1010
2020-03-23 12:17:40,861 [myid:] - INFO [main:[email protected]] - Server environment:user.dir=/opt/zookeeper
2020-03-23 12:17:40,869 [myid:] - INFO [main:[email protected]] - tickTime set to 2000
2020-03-23 12:17:40,869 [myid:] - INFO [main:[email protected]] - minSessionTimeout set to -1
2020-03-23 12:17:40,869 [myid:] - INFO [main:[email protected]] - maxSessionTimeout set to -1
2020-03-23 12:17:40,876 [myid:] - INFO [main:[email protected]] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2020-03-23 12:17:40,880 [myid:] - INFO [main:[email protected]] - binding to port 0.0.0.0/0.0.0.0:2181
2020-03-23 12:17:40,895 [myid:] - INFO [main:[email protected]] - Reading snapshot /opt/zookeeper/data/version-2/snapshot.2c4659
2020-03-23 12:17:41,674 [myid:] - ERROR [main:[email protected]] - Last transaction was partial.
2020-03-23 12:17:41,675 [myid:] - ERROR [main:[email protected]] - Unexpected exception, exiting abnormally
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:66)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:588)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:607)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:573)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:653)
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:219)
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:176)
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:217)
at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:284)
at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:407)
at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:118)
at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:122)
at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:89)
at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:55)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:119)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
We temporarily solved the problem by adding the command rm -rf /opt/zookeeper/data/version-2 to the container for its first execution, but after some time the container fails again.
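The stack trace above ends in FileHeader.deserialize, reached from FileTxnIterator.goToNextLog, which suggests (but does not prove) that ZooKeeper hit a transaction log file too short to contain its header. Before deleting the whole version-2 directory, it may help to identify only those files. The following is a minimal sketch, not a verified fix: it assumes that transaction logs are named log.<zxid> and that the file header is 16 bytes (4-byte magic, 4-byte version, 8-byte dbid). The function name find_truncated_txn_logs is hypothetical.

```python
import os

# Assumption: a ZooKeeper 3.4.x transaction log begins with a 16-byte header
# (4-byte magic, 4-byte version, 8-byte dbid). A file shorter than this
# cannot be deserialized and would raise EOFException while being read.
TXN_LOG_HEADER_BYTES = 16


def find_truncated_txn_logs(data_dir):
    """Return sorted paths of log.* files too short to hold a header.

    Only reports candidates; it never deletes or modifies anything.
    """
    suspects = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        # Snapshots (snapshot.*) and other files are ignored on purpose.
        if name.startswith("log.") and os.path.isfile(path):
            if os.path.getsize(path) < TXN_LOG_HEADER_BYTES:
                suspects.append(path)
    return suspects
```

Calling find_truncated_txn_logs("/opt/zookeeper/data/version-2") against a copy of the data volume would list the candidate files. Whether removing only those files is safe for this deployment is not something we have confirmed, so back up the directory first.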
The docker-secret.yaml we are using comes from the CA API Developer Portal Solutions & Patches site:
https://ftp.broadcom.com/user/downloads/pub/API_Management/API_Developer_Portal_Enhanced_Experience/CR/helm/docker-secret.yaml
Release : 4.4
Component : API PORTAL