Many users report that PAM is very sluggish. PAM administrators notice that every page takes several seconds to load, including pages that previously loaded quickly. This behavior has been observed since the upgrade to 4.1.1.
Another symptom is very high CPU usage on the PAM servers, and user activities such as logging in or launching access sessions start to fail.
The problem disappears after a reboot, but returns within a few days on busy PAM servers and later on less frequently used servers. Steady programmatic access to PAM servers, such as health checks from load balancers, REST API calls, or frequent A2A client activity, also accelerates the performance degradation.
If the node is not rebooted, it can become completely unresponsive.
Release: 4.1.1, 4.1.2
A problem was introduced with the new feature Enable or Disable TLS Ciphers; see the documentation page New Features and Enhancements in 4.1.1. The feature introduced a memory leak that can cause increasingly long delays during the handshake for incoming connections. Especially on PAM servers with a memory allocation below the recommended 64 GB, the memory leak can prevent the daemon listening on HTTPS port 443 from forking new child processes, which makes PAM completely unresponsive. If this happens on a primary site node, the node may still show as active and in sync on the Clustering status page of other nodes, because database synchronization does not depend on HTTPS connections.
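Because the leak shows up as growing handshake delays on port 443, the degradation can be spotted from outside the appliance before a node stops responding. The following is a minimal sketch, not a vendor tool: the host name, the 1-second threshold, and the function names are all assumptions made for illustration. It times only the TLS handshake and excludes the TCP connect. Since steady programmatic access speeds up the degradation, a probe like this should run sparingly.

```python
import socket
import ssl
import statistics
import time
from typing import Sequence


def build_context(verify: bool = True) -> ssl.SSLContext:
    """Create a client TLS context. Verification can be switched off for
    appliances that present a self-signed certificate (assumption)."""
    ctx = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode is relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def handshake_seconds(host: str, port: int = 443, timeout: float = 10.0,
                      verify: bool = True) -> float:
    """Return the time spent in the TLS handshake (TCP connect excluded)."""
    ctx = build_context(verify)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        start = time.perf_counter()
        # wrap_socket performs the handshake on an already connected socket.
        with ctx.wrap_socket(raw, server_hostname=host):
            return time.perf_counter() - start


def is_degraded(samples: Sequence[float], threshold: float = 1.0) -> bool:
    """Flag a node when the median handshake time exceeds the threshold.
    The 1.0 s default is an arbitrary illustrative value."""
    return bool(samples) and statistics.median(samples) > threshold


# Example with a hypothetical host name:
#   samples = [handshake_seconds("pam.example.com", verify=False)
#              for _ in range(3)]
#   print(is_degraded(samples))
```

Using the median rather than a single sample keeps one slow handshake from raising a false alarm; a suitable threshold depends on the network path and should be calibrated against a freshly rebooted node.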
For PAM 4.1.1 apply hotfix 4.1.1.08. For PAM 4.1.2 apply hotfix 4.1.2.01.
As a workaround, the high CPU/low performance problem can be avoided by setting the option "TLS v1.0/1.1 Connection Allowed" to Disabled on the Configuration > Security > Access page. This setting also affects the LDAP browser, which will then no longer be able to connect to an LDAP domain controller using TLS 1.0 or 1.1. If your LDAP integrations work with TLS 1.2, it should be safe to change the setting.
Note that even with TLS 1.0/1.1 disabled, a slow memory leak remains, so the hotfix should still be applied at the next opportunity. If the memory leak is left unchecked, the node will eventually become inaccessible (within a few months).
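Before disabling TLS 1.0/1.1 as a workaround, it may help to confirm that each LDAP domain controller accepts a TLS 1.2 handshake. The sketch below is one way to check this from any machine with Python 3.7 or later. It assumes the domain controller offers LDAPS on port 636 (it does not test StartTLS on port 389), and the host name shown in the comment is hypothetical. Certificate validation is deliberately skipped because only protocol support is being probed.

```python
import socket
import ssl


def tls12_only_context() -> ssl.SSLContext:
    """Build a client context that offers TLS 1.2 and nothing else."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    # Only protocol support is probed here, not certificate trust.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def accepts_tls12(host: str, port: int = 636, timeout: float = 10.0) -> bool:
    """Return True if the server completes a TLS 1.2 handshake.

    A False result also covers unreachable hosts and closed ports, so a
    failure should be followed up with a basic connectivity check.
    """
    ctx = tls12_only_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version() == "TLSv1.2"
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False


# Example with a hypothetical domain controller name:
#   print(accepts_tls12("dc01.example.com"))
```

The context is pinned to TLS 1.2 because that is the version named in the workaround; a domain controller that fails this check would likely stop working with the LDAP browser once the option is set to Disabled.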