We see a flood of PAM-CM-4096 messages in the session logs of the PAM servers at our primary cluster site. The flood appears to have started after we changed the Local PAM Docker CIDR on all cluster nodes, following the instructions on the documentation page "Configure Docker Network Settings". A few days after the change, our Primary Master became inaccessible: it showed only a "CA PAM server is starting up" message at logon, and credential requests were failing. We had to reboot the server to recover. The reboot stopped the message flood on this node.
Release: 4.1.1-4.1.3
PAM did not restart all affected services after the configuration change. As a result, some internal workflows failed, which produced the PAM-CM-4096 messages and could also cause synchronization failures in a cluster.
PAM Engineering will implement a requirement to reboot the PAM server after making a Local PAM Docker CIDR change. This is expected to be included in PAM releases 4.1.4+.
For now, we recommend rebooting each PAM server after making the CIDR change, using the Configuration > Power page. In an active cluster this can be done one node at a time. After each reboot, wait for the node to be back in sync: once it is accessible again, check its cluster status on its own Configuration > Clustering page. For secondary site nodes, also confirm the status on one of the primary site nodes, which show the synchronization status of all cluster nodes. Then move on to the next node. Reboot the group replication leader last to avoid multiple switches of this role.
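The ordering described above can be sketched in a few lines of Python. This is only an illustration of the sequence (one node at a time, wait for sync, leader last), not a PAM tool: the reboot itself is done on the Configuration > Power page and the sync check on the Configuration > Clustering page. The `reboot` and `in_sync` callables, the node names, and the polling and timeout values are all hypothetical placeholders; they could, for example, prompt an operator to perform and confirm each step in the UI.

```python
import time
from typing import Callable, Iterable, List


def reboot_order(nodes: Iterable[str], leader: str) -> List[str]:
    """Return the nodes in reboot order, with the group replication leader last."""
    node_list = list(nodes)
    if leader not in node_list:
        raise ValueError(f"leader {leader!r} is not in the node list")
    return [n for n in node_list if n != leader] + [leader]


def rolling_reboot(
    nodes: Iterable[str],
    leader: str,
    reboot: Callable[[str], None],    # hypothetical: triggers or requests the reboot
    in_sync: Callable[[str], bool],   # hypothetical: True once the node is back in sync
    poll_seconds: float = 30.0,       # assumed polling interval
    timeout_seconds: float = 3600.0,  # assumed upper bound per node
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Reboot one node at a time and wait for sync before moving to the next."""
    completed: List[str] = []
    for node in reboot_order(nodes, leader):
        reboot(node)
        deadline = time.monotonic() + timeout_seconds
        # Do not touch the next node until this one reports it is in sync.
        while not in_sync(node):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{node} did not return to sync in time")
            sleep(poll_seconds)
        completed.append(node)
    return completed
```

For example, `reboot_order(["node1", "node2", "node3"], leader="node1")` returns `["node2", "node3", "node1"]`, so the leader role only has to move once.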
Another potential problem observed after a Local PAM Docker CIDR change without a reboot is discussed in KB 265399.