Admin Roles are out of sync and not updating correctly between two JBoss Cluster Nodes.
The root of the problem is a known issue in the way JBoss handles JGroups caching in a two-node cluster.
If any of the cluster nodes goes out of sync as a result of a network connectivity problem, the JBoss nodes fail to register with each other and log the following warnings:
WARN [org.jgroups.protocols.UDP] (JBoss SystemThreads(1)-3) failed to join /on net5:
java.net.SocketException: Unrecognized Windows Sockets error: 0: no Inet4Address associated with interface
WARN [org.jgroups.protocols.TCP] (main) failed to join /on net5: java.net.SocketException: Unrecognized Windows Sockets error: 0: no Inet4Address associated with interface
This blocks the delivery of JMS messages, so Roles no longer reflect updates made on another cluster node.
The issue affects more than just Admin Roles: because the underlying JMS messaging system is no longer communicating between the affected cluster nodes, other run-time caches (e.g. all Role types) can no longer be synchronized properly.
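The "no Inet4Address associated with interface" message in the log excerpt above suggests that the interface JGroups tried to join on had no IPv4 address at that moment. As a rough diagnostic aid, the following sketch lists the interfaces on a node that are up but have no IPv4 address bound. It uses only the standard java.net API; the class name InterfaceCheck and its method names are hypothetical and are not part of JBoss, JGroups or CA IdentityMinder. It is an assumption that running it on the affected node during the outage would show the same interface named in the warning.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper; not part of any product mentioned in this article.
public class InterfaceCheck {

    /** Returns true if at least one of the given addresses is an IPv4 address. */
    public static boolean hasIpv4(List<InetAddress> addresses) {
        for (InetAddress address : addresses) {
            if (address instanceof Inet4Address) {
                return true;
            }
        }
        return false;
    }

    /** Returns the names of interfaces that are up but have no IPv4 address bound. */
    public static List<String> interfacesWithoutIpv4() throws SocketException {
        List<String> result = new ArrayList<String>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            // Some JVMs return null when no interfaces are found.
            return result;
        }
        for (NetworkInterface nic : Collections.list(interfaces)) {
            if (nic.isUp() && !hasIpv4(Collections.list(nic.getInetAddresses()))) {
                result.add(nic.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        for (String name : interfacesWithoutIpv4()) {
            System.out.println("No IPv4 address on interface: " + name);
        }
    }
}
```

An empty result only shows that every active interface currently has an IPv4 address; it does not rule out a transient loss of connectivity earlier.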
The issue is seen in JBoss 5.0/5.1 even with large timeouts set in the <FD> tag, and with all possible combinations of the <FD>, <FD_SOCK> and shun configuration parameters.
This issue may be resolved with the combination of JGroups 2.8 (based on https://issues.jboss.org/browse/JGRP-937) and JBoss Messaging 1.4.8.SP8 (which is not compatible with JBoss 5.0/5.1).
Neither of these component versions is available in public or commercial releases of JBoss (per http://www.redhat.com/resourcelibrary/articles/jboss-enterprise-application-platform-components), so Red Hat would need to be engaged to confirm and back-port the required fixes to a JBoss version supported by IdentityMinder.
As a result, our official recommendation is to engage Red Hat Support directly to obtain a code fix for JBoss 5.0/5.1.
An alternative is to migrate to another application server that CA IdentityMinder supports (i.e. WebSphere or WebLogic) and that can handle the transient network conditions.