Sometimes authentication suddenly fails for all users. Pushing policy again or restarting the BCAAA agent resolves the problem. It occurs when the ProxySG policy uses group-based conditions for authentication. If you check the authentication statistics in the ProxySG Management Console, you can see that group information is missing for the affected users. You may also see the following messages in the Windows Application/System event logs when this problem occurs:
[5664:2316] AcceptSecurityContext failure, ContextLink=0xE46E78 count=17, detail=1311(There are currently no logon servers available to service the logon request.
); status=-2146893039:0x80090311:No authority could be contacted for authentication.
The session setup to the Windows NT or Windows 2000 Domain Controller \\DCName.domainxxx.com for the domain xxx is not responsive. The current RPC call from Netlogon on \\DCName-xxx to \\DCName-yyy.domainxxx.com has been cancelled.
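
When reviewing exported event log text, it can help to search for the identifying parts of the messages above. The following is a minimal sketch, assuming the log entries have already been exported to plain text. The function name `match_signatures` and the signature labels are hypothetical and are not part of BCAAA or of any Windows tool.

```python
import re

# Signatures taken from the event log messages quoted above.
# The dictionary keys are illustrative labels, not official names.
SIGNATURES = {
    "no_logon_servers": re.compile(r"detail=1311\b"),
    "sspi_no_authority": re.compile(r"0x80090311", re.IGNORECASE),
    "netlogon_rpc_cancelled": re.compile(
        r"RPC call from Netlogon .* has been cancelled", re.IGNORECASE
    ),
}


def match_signatures(text: str) -> list[str]:
    """Return the sorted labels of all signatures found in the exported log text."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern.search(text))
```

An empty result does not rule the problem out; it only means that none of these specific messages appear in the text that was searched.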
This problem is caused by a DC availability or connectivity issue on the machine running BCAAA. It is triggered when that machine loses connectivity to the DC and the ProxySG connects to BCAAA during the outage.
When the ProxySG connects to BCAAA, BCAAA contacts a DC to look up the SID (Security Identifier) for each group of interest configured in the ProxySG's policy. For scalability and performance reasons, these lookups (RPC calls) are done only when the ProxySG initially connects to BCAAA. If BCAAA cannot look up a group's SID, it skips that group and moves on to the next one. BCAAA does this because most customers are likely to have at least one group in their ProxySG policy that doesn't actually exist in AD.
If the ProxySG connects to BCAAA during a DC outage, all of the group lookups fail. BCAAA then reports no group memberships for users, so every "group=" policy condition for the realm is a miss during policy evaluation (authorization fails). Because BCAAA does these group lookups only when the ProxySG initially connects, this behavior continues until the ProxySG reconnects at a time when BCAAA can contact a DC.
In other words, even after the DC comes back online, users can still authenticate, because BCAAA can now verify their credentials with the DC, but no group information is associated with them, because the earlier group lookup calls did not go through (no mutex was created for any group of interest with its corresponding SID). At this point, if policy is pushed or the BCAAA agent is restarted (so that the ProxySG reconnects to BCAAA), BCAAA makes the RPC calls for the group lookups again, and both authentication and authorization start working again.
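
The connect-time lookup behavior described above can be illustrated with a toy model. This is a sketch under stated assumptions, not BCAAA's actual implementation: the class `AgentSession`, the exception `DCUnavailable`, the lookup function, and the SID values are all hypothetical and exist only to show why group information stays missing until the ProxySG reconnects.

```python
from typing import Callable, Dict, Iterable, Optional, Set


class DCUnavailable(Exception):
    """Raised by the hypothetical lookup function when no DC can be reached."""


class AgentSession:
    """Toy model of one ProxySG-to-BCAAA connection (illustrative only)."""

    def __init__(self, groups_of_interest: Iterable[str],
                 lookup_sid: Callable[[str], Optional[str]]) -> None:
        # Group SIDs are resolved once, at connect time, and never refreshed.
        self.group_sids: Dict[str, str] = {}
        for group in groups_of_interest:
            try:
                sid = lookup_sid(group)
            except DCUnavailable:
                continue  # lookup failed: skip this group and move on
            if sid is not None:  # None models a group that is not in AD
                self.group_sids[group] = sid

    def groups_for(self, user_sids: Set[str]) -> Set[str]:
        """Report which groups of interest the user's SIDs match."""
        return {g for g, sid in self.group_sids.items() if sid in user_sids}


# Made-up directory contents and SIDs for the demonstration.
DIRECTORY = {"Sales": "S-1-5-21-1-1-1-1001", "Admins": "S-1-5-21-1-1-1-1002"}
dc_state = {"up": False}


def lookup(group: str) -> Optional[str]:
    if not dc_state["up"]:
        raise DCUnavailable(group)
    return DIRECTORY.get(group)


policy_groups = ["Sales", "Admins", "Ghost"]  # "Ghost" does not exist in AD
user_sids = {"S-1-5-21-1-1-1-1001"}           # the user is a member of Sales

during_outage = AgentSession(policy_groups, lookup)  # connects while the DC is down
dc_state["up"] = True                                # the DC comes back online
print(during_outage.groups_for(user_sids))           # set(): still no group info

after_reconnect = AgentSession(policy_groups, lookup)  # policy push or agent restart
print(after_reconnect.groups_for(user_sids))           # {'Sales'}
```

In this model the session created during the outage keeps an empty group table even after the DC recovers, which mirrors why a policy push or a BCAAA restart is needed to trigger a fresh set of lookups.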
The solution is to find and fix the root cause of the DC connectivity issue in the network. In one customer case, a daily automated MS backup job running on all of the DCs in the network was triggering periodic DC connectivity issues.