Broadcom API Gateway: After upgrade Gateway won't start, Hazelcast fault stack


Article ID: 186405



Products

CA API Gateway, API SECURITY, CA API Gateway Enterprise Service Manager (Layer 7), STARTER PACK-7, CA Microgateway

Issue/Introduction

After patching or upgrading, the Gateway fails to start and logs the following error:

 

**** Unable to start the server: Error starting server : Lifecycle error: Could not initialize Hazelcast cluster

 

2020-03-05T23:21:45.196-0800 SEVERE  99 com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl: [RedactedIP]:8777 [gateway] [3.10.2] Problem while reading DataSerializable, namespace: 0, ID: 0, class: 'com.hazelcast.cluster.impl.operations.JoinCheckOperation', exception: com.hazelcast.cluster.impl.operations.JoinCheckOperation

com.hazelcast.nio.serialization.HazelcastSerializationException: Problem while reading DataSerializable, namespace: 0, ID: 0, class: 'com.hazelcast.cluster.impl.operations.JoinCheckOperation', exception: com.hazelcast.cluster.impl.operations.JoinCheckOperation

        at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.rethrowReadException(DataSerializableSerializer.java:180)

        at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:161)

        at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:105)

        at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:50)

        at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:48)

        at com.hazelcast.internal.serialization.impl.AbstractSerializationService.toObject(AbstractSerializationService.java:187)

        at com.hazelcast.spi.impl.NodeEngineImpl.toObject(NodeEngineImpl.java:322)

        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:390)

        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:115)

        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.run(OperationThread.java:100)

Caused by: java.lang.ClassNotFoundException: com.hazelcast.cluster.impl.operations.JoinCheckOperation

        at com.l7tech.server.policy.module.j.findClass(Unknown Source)

        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

        at com.l7tech.server.policy.module.j.loadClass(Unknown Source)

        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

        at com.hazelcast.nio.ClassLoaderUtil.tryLoadClass(ClassLoaderUtil.java:173)

        at com.hazelcast.nio.ClassLoaderUtil.loadClass(ClassLoaderUtil.java:147)

        at com.hazelcast.nio.ClassLoaderUtil.newInstance(ClassLoaderUtil.java:101)

        at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:150)

        ... 8 more

Caused by: java.security.PrivilegedActionException: java.lang.ClassNotFoundException: com.hazelcast.cluster.impl.operations.JoinCheckOperation

        at java.security.AccessController.doPrivileged(Native Method)

        ... 16 more

 

Environment

Release : 11.0+

Component : API GATEWAY

Cause

A mismatch in Hazelcast versions between nodes causes startup failures when the ssg process is active on multiple nodes that share a database.

Resolution

The Gateway uses an embedded Hazelcast library that monitors the status and availability of all nodes in the same cluster.

When a CR includes an update of the Hazelcast library, the updated node runs a library version that does not match the version still running on the remaining nodes. When that happens, Hazelcast communication between the nodes fails, which produces the ClassNotFoundException shown in the stack trace above.
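To confirm a mismatch, it can help to compare the Hazelcast version reported in each node's log. The following is a minimal sketch, assuming log lines carry the "[address]:port [cluster] [version]" prefix seen in the SEVERE line above; `hazelcast_versions` is a hypothetical helper name, and the location of the Gateway log file is not given in this article, so the log text is read from standard input.

```shell
#!/bin/sh
# Sketch: list the Hazelcast version(s) that appear in Gateway log text.
# Assumption: lines look like "... [address]:8777 [gateway] [3.10.2] ...",
# as in the SEVERE line quoted in this article.
# hazelcast_versions is a hypothetical helper, not a Gateway tool.

hazelcast_versions() {
    # Read log text on stdin, extract the third bracketed field
    # (the version), and print each distinct value once.
    sed -n 's/.*\]:[0-9]* \[[^]]*\] \[\([0-9][0-9.]*\)\].*/\1/p' | sort -u
}

# Example using a shortened copy of the log line from this article.
echo 'OperationRunnerImpl: [RedactedIP]:8777 [gateway] [3.10.2] Problem while reading DataSerializable' \
    | hazelcast_versions
```

Run the function against the log of each node in the cluster. If the nodes report different versions, the cluster is in the mismatched state described above.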

There are two ways to work around this:

1) Upgrade all nodes together rather than one at a time. With the secondary node (and any other processing nodes) stopped, the upgraded host should start normally.
2) Although not recommended, you can usually use iptables to block Hazelcast traffic on port 8777 from all remote IPs. Do not block port 8777 entirely, because the host must still be able to communicate with itself on that port.
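The iptables approach in option 2 could look roughly like the sketch below. It only prints the commands so they can be reviewed before anything is applied. The function name `print_hazelcast_block_rules`, the example address 192.0.2.10, and the decision to filter both the INPUT and OUTPUT chains are assumptions for illustration; adapt them to the existing firewall configuration on the host.

```shell
#!/bin/sh
# Sketch: print (do not apply) iptables rules that keep Hazelcast port 8777
# reachable from the node itself while dropping traffic to and from remote hosts.
# print_hazelcast_block_rules is a hypothetical helper; the node IP is an argument.

HZ_PORT=8777

print_hazelcast_block_rules() {
    node_ip="$1"
    # Inbound: allow loopback and the node's own address, drop everything else.
    echo "iptables -I INPUT 1 -p tcp --dport $HZ_PORT -i lo -j ACCEPT"
    echo "iptables -I INPUT 2 -p tcp --dport $HZ_PORT -s $node_ip -j ACCEPT"
    echo "iptables -I INPUT 3 -p tcp --dport $HZ_PORT -j DROP"
    # Outbound: same idea, so this node does not contact remote members.
    echo "iptables -I OUTPUT 1 -p tcp --dport $HZ_PORT -o lo -j ACCEPT"
    echo "iptables -I OUTPUT 2 -p tcp --dport $HZ_PORT -d $node_ip -j ACCEPT"
    echo "iptables -I OUTPUT 3 -p tcp --dport $HZ_PORT -j DROP"
}

# Example: 192.0.2.10 stands in for this node's own IP address.
print_hazelcast_block_rules 192.0.2.10
```

The ACCEPT rules are inserted ahead of the DROP rule in each chain, so the node can still reach itself on 8777. Remove the rules once all nodes run the same Hazelcast version, so the cluster can form again.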