The uploaded event log shows the following entry:
2022-06-18 11:33:12-00:00UTC "Group cache: Lookup failed for group xxxxxxxxx. The group may not exist in AD, or the domain may be offline. Error 0x25020A(2425354) (symbol: '<null>')" 0 250035:96 lw_group_cache.cpp:464
The error shown above, "The group may not exist in AD, or the domain may be offline. Error 0x25020A(2425354) (symbol: '<null>')" (lw_group_cache.cpp:464), is the primary authentication error in the logs and the core trigger for the internet service outage.
It was caused by CNAME changes made in the customer's environment, which in turn broke the trust relationship with the proxy.
There is a known issue with some Kerberos clients (including Internet Explorer 7 and 8) that attempt to authenticate to Kerberos-enabled services whose names resolve through DNS CNAMEs instead of A records. The root of the problem is that the client forms the SPN in the TGS request incorrectly: it builds the SPN from the hostname (A record) instead of the alias name (CNAME).
Example:
A Record: xxxx.<my_y_domain>
CNAME: uuuuu.<my_y_domain> (aliases xxxx.<my_y_domain>)
If the client attempts to authenticate with http://uuuuu.<my_y_domain>, it forms the SPN incorrectly and requests a Kerberos ticket for http/xxxx.<my_y_domain> instead of http/uuuuu.<my_y_domain>.
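The SPN mismatch described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory CNAME table and placeholder host names (alias.example.test, proxy.example.test); the function names are illustrative and do not correspond to any client or ProxySG API. With canonicalize=True the sketch models the behaviour described above, where the SPN is built from the A record name; with canonicalize=False the SPN is built from the alias the user typed.

```python
from urllib.parse import urlsplit

# Hypothetical DNS data: alias (CNAME) -> canonical host name (A record).
# These example.test names are placeholders, not values from the case.
CNAME_MAP = {"alias.example.test": "proxy.example.test"}


def canonical_name(host: str) -> str:
    """Follow CNAME entries until a name with no alias mapping is reached."""
    seen = set()
    while host in CNAME_MAP and host not in seen:
        seen.add(host)  # guard against alias loops in the table
        host = CNAME_MAP[host]
    return host


def spn_for_url(url: str, canonicalize: bool) -> str:
    """Return the SPN a client would request a service ticket for."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host name in URL: {url!r}")
    if canonicalize:
        # Models the client behaviour described above: A record name is used.
        host = canonical_name(host)
    return f"http/{host}"


if __name__ == "__main__":
    url = "http://alias.example.test/"
    print("SPN from A record name:", spn_for_url(url, canonicalize=True))
    print("SPN from alias name:   ", spn_for_url(url, canonicalize=False))
```

If the service account only has one of the two SPNs registered, the ticket request for the other name cannot be satisfied, which is the kind of mismatch a CNAME change can introduce.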
A further consequence of the CNAME change was the error "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" (lw_schannel.cpp:609), as seen in the log excerpts below. This was the root cause of the break in the trust relationship with the ProxySG.
2022-06-18 11:33:16-00:00UTC "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" 0 250017:96 lw_schannel.cpp:609
2022-06-18 11:33:16-00:00UTC "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" 0 250017:96 lw_schannel.cpp:609
2022-06-18 11:33:16-00:00UTC "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" 0 250017:96 lw_schannel.cpp:609
2022-06-18 11:33:17-00:00UTC "Authentication failed from xxxxxxxxx: user '\' (realm xxxxxxxxx)" 0 250017:96 authutility.cpp:113
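When an event log contains many such entries, the failures can be tallied per error code. The following is a minimal sketch, assuming the line layout seen in the excerpts above; the regular expression and the function name tally_auth_failures are illustrative and not part of any ProxySG tooling. It also cross-checks that the decimal and hexadecimal forms of the code agree.

```python
import re
from collections import Counter

# Matches: Authentication failed with 41887 (0x0000A39F) (symbol: '...')
AUTH_FAIL = re.compile(
    r"Authentication failed with (?P<dec>\d+) \((?P<hex>0x[0-9A-Fa-f]+)\) "
    r"\(symbol: '(?P<symbol>[^']*)'\)"
)

# Sample lines copied from the excerpts above (already anonymized).
SAMPLE_LOG = r"""
2022-06-18 11:33:16-00:00UTC "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" 0 250017:96 lw_schannel.cpp:609
2022-06-18 11:33:16-00:00UTC "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" 0 250017:96 lw_schannel.cpp:609
2022-06-18 11:33:16-00:00UTC "Authentication failed with 41887 (0x0000A39F) (symbol: 'Cannot find KDC for requested realm'): user 'xxxxxxxxx' (domain xxxxxxxxx) - user considered 'unknown'" 0 250017:96 lw_schannel.cpp:609
2022-06-18 11:33:17-00:00UTC "Authentication failed from xxxxxxxxx: user '\' (realm xxxxxxxxx)" 0 250017:96 authutility.cpp:113
"""


def tally_auth_failures(lines):
    """Count authentication failures keyed by (decimal code, symbol text)."""
    counts = Counter()
    for line in lines:
        match = AUTH_FAIL.search(line)
        if match is None:
            continue  # line does not use the 'failed with <code>' layout
        code = int(match["dec"])
        if int(match["hex"], 16) != code:
            continue  # decimal and hex forms disagree; skip the line
        counts[(code, match["symbol"])] += 1
    return counts


if __name__ == "__main__":
    for (code, symbol), n in tally_auth_failures(SAMPLE_LOG.splitlines()).items():
        print(f"{n} x error {code} (0x{code:08X}): {symbol}")
```

Lines that do not carry an error code, such as the authutility.cpp entry, are skipped by this sketch rather than counted.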
Note: The KDC for a domain is located on a domain controller, as is the Active Directory for the domain. Both services are started automatically by the domain controller's Local Security Authority (LSA) and run as part of the LSA's process. The Kerberos Key Distribution Center (KDC) is integrated with other Windows Server security services that run on the domain controller. The KDC uses the domain's Active Directory Domain Services database as its security account database. Active Directory Domain Services is required for default Kerberos implementations within the domain or forest.
Because the trust relationship had already been broken by the CNAME change, and the underlying issues had since been resolved in the customer's environment, the ProxySG had to be manually rejoined to the domain. To do this successfully, we needed to complete the following tasks:
Ref. doc.:
How do I check for cached object and delete them selectively?
To rejoin the ProxySG to the Windows domain, we followed the guidance in the technical article referenced below.
Steps to join a Windows Domain
After the appliance was rejoined to the domain, the web (internet) service was restored, fully resolving the reported issue.