Hazelcast and other pods are crashing, resulting in authentication and token generation failures.
After approximately 20 minutes, the pods recover and return to a running state.
Since the upgrade from version 3.2.2 to 3.4.2, we have observed a gradual increase in memory and CPU usage on the Hazelcast pods.
VIP Authentication Hub 3.4.2
The problem was introduced by the Spring framework upgrade in VIP Authentication Hub.
Engineering has identified the following issue:
When session management is disabled, VIP Authentication Hub continues to create unnecessary set objects in Hazelcast. These objects accumulate over time and generate a large volume of metrics data, causing CPU and memory usage to increase steadily. The excessive Hazelcast metrics data is the most likely reason the pod failed to start during the scaling event.
The fix is included in the VIP Authentication Hub 3.4.8 release. To resolve this issue permanently, upgrade to VIP Authentication Hub 3.4.8 or later.
Workaround:
Setting the following JVM option on the Hazelcast pods mitigates the issue:
-Dmanagement.metrics.enable.cache=false
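One possible way to pass this option is through the standard JAVA_TOOL_OPTIONS environment variable, which HotSpot JVMs read at startup. The fragment below is only a sketch: the container name is a placeholder, and it assumes the Hazelcast workload allows environment variables to be set on its container. If your deployment exposes its own setting for JVM options, use that instead.

```yaml
# Sketch only: the container name below is a hypothetical placeholder,
# not a value taken from the VIP Authentication Hub deployment.
spec:
  template:
    spec:
      containers:
        - name: hazelcast               # assumed container name
          env:
            - name: JAVA_TOOL_OPTIONS   # standard JVM variable, read at startup
              value: "-Dmanagement.metrics.enable.cache=false"
```

If JAVA_TOOL_OPTIONS is already set on the container, append the option to the existing value rather than replacing it. The pods must be restarted for the change to take effect.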
This issue is observed only when session management is disabled and there is a high volume of session generation in VIP Authentication Hub.