It has been observed, especially after upgrading CA PAM to versions 4.0.X and 4.1.X, that sessions are frequently disconnected.
The session logs often contain lines like the following:
PAM-UPD-0014: Primary network storage for session recording is down
followed a few seconds later by a message indicating that the storage is back up.
In the system logs, kernel.log (accessible by Broadcom Support only) shows frequent error messages like:
Jun 18 08:17:43 XXXXX kernel: nfs: server 1.2.3.4 not responding, timed out
However, no error is apparent on the NFS server side.
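To gauge how often these timeouts occur, the kernel messages can be counted per NFS server. The following is a minimal sketch, assuming a copy of the relevant log lines has been obtained (for example from Broadcom Support). The function name and all sample lines other than the one quoted above are illustrative and not taken from a real system.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Jun 18 08:17:43 XXXXX kernel: nfs: server 1.2.3.4 not responding, timed out
NFS_TIMEOUT = re.compile(r"kernel: nfs: server (\S+) not responding, timed out")


def count_nfs_timeouts(lines):
    """Return a Counter mapping each NFS server to its number of timeout messages."""
    counts = Counter()
    for line in lines:
        match = NFS_TIMEOUT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample; only the first line appears in this article.
sample = [
    "Jun 18 08:17:43 XXXXX kernel: nfs: server 1.2.3.4 not responding, timed out",
    "Jun 18 08:17:49 XXXXX kernel: nfs: server 1.2.3.4 OK",
    "Jun 18 08:21:10 XXXXX kernel: nfs: server 1.2.3.4 not responding, timed out",
]

for server, hits in count_nfs_timeouts(sample).items():
    print(f"{server}: {hits} timeout message(s)")
```

A high count over a short period, combined with a healthy NFS server, suggests looking at the client-side mount settings rather than at the server itself.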
Releases 4.0.X and 4.1.X
The root cause of this problem is unclear and is likely environmental. However, when establishing the NFS mount, PAM negotiates which NFS version to use.
If it can, PAM will attempt to mount the remote share using NFS version 4, which works differently from lower versions.
Please see
https://datatracker.ietf.org/doc/html/rfc7530
for the version 4 specification, and
https://datatracker.ietf.org/doc/html/rfc1813
for the version 3 specification.
Using NFS version 4 sometimes causes stability problems (for example, a stale client ID) or other issues linked to the NFS version.
Another thing to consider is that, by default, the NFS timeout in versions 4.0.X and 4.1.X is set to 60 tenths of a second (6 seconds), which is far too low and very different from the default of 600 tenths of a second (60 seconds).
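On a Linux NFS client, the negotiated version and the timeout appear among the mount options (`vers=` and `timeo=`, the latter in tenths of a second) in `/proc/mounts`-style output. The sketch below shows how such a line could be read and the timeout converted to seconds. It is an illustration only: the function name, export path, and mount point are hypothetical, and on the PAM appliance this information is available only to Broadcom Support.

```python
def parse_nfs_options(mounts_line):
    """Extract the NFS version and timeout from one /proc/mounts-style line.

    Returns None if the line does not describe an NFS mount.
    """
    fields = mounts_line.split()
    # Fields: device, mount point, filesystem type, options, dump, pass
    if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
        return None

    options = {}
    for item in fields[3].split(","):
        key, _, value = item.partition("=")
        options[key] = value

    result = {"vers": options.get("vers"), "timeo_seconds": None}
    if "timeo" in options:
        # timeo is expressed in tenths of a second
        result["timeo_seconds"] = int(options["timeo"]) / 10
    return result


# Hypothetical mount entry, for illustration only.
line = "1.2.3.4:/recordings /mnt/recordings nfs4 rw,relatime,vers=4.1,timeo=60,retrans=2 0 0"
info = parse_nfs_options(line)
print(f"NFS version: {info['vers']}, timeout: {info['timeo_seconds']} s")
```

With `timeo=60` the client waits 6 seconds before treating a request as timed out, compared with 60 seconds for `timeo=600`.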
Broadcom Support can help test whether the problem is on the NFS side by manually mounting the NFS folder or tuning its mount options inside the appliances. This requires SSH access and opening a support ticket.
Other than that, the simplest way to test this is to ask the NFS server administrator to limit the maximum NFS version supported by the server to version 3. That way, when PAM attempts to mount the remote share, it will be forced to use version 3.
When switching to version 3, the NFS timeout should also be modified, since the value of 60 tenths of a second noted above is too low.