Users getting default block page accessing allowed resources after admin pushes WSS policy change via UPE

Article ID: 249276

Products

Cloud Secure Web Gateway - Cloud SWG

Issue/Introduction

Users access internet sites through WSS using the WSS Agent.

WSS policy controls access based on user groups, and the user group information is obtained through the Auth Connector integration with AD (no SAML).

Users report access denied messages, shown on the default WSS block page, when accessing resources they should normally be able to reach.

The proxy sees no group information: the forensic reports and HTTP access logs show the correct user without any corresponding group.

Environment

WSS administered using UPE.

Auth Connector.

WSS Agents.

Cause

Applying a group update to the WSS policy via UPE triggered an avalanche of requests to the Auth Connector, making it unresponsive.

Resolution

Fixed with the WSS update of August 22, 2022.

Additional Information

None of the access denied entries contained group information (the field following the user name in the example below):

2022-07-23 09:48:16 "DP1-GINDE11_proxysg3" 212 113.14.61.18 AD\User - policy_denied DENIED "Uncategorized" http://malahide.net/ 403 TCP_DENIED GET image/vnd.microsoft.icon http malahide.net 80 /favicon.ico - ico "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36" 192.168.1.86 991 367 - cas_group - "{ %22expect_sandbox%22: false }" no - - - 0 "client" client_connector "none" "none" 15.18.130.131 "Ireland" - - - - - none - - - - none - - ICAP_NOT_SCANNED - ICAP_NO_MODIFICATION - 15.18.130.131 "Ireland" - "India" 5 - wss-agent architecture=x86_64%20name=Windows%2010%20Enterprise%20version=10.0.18363 7.5.1.16390 113.14.61.18 6b20f234-9e85-4f5d-964e-48c467b1c6ea AINLHCND74806GV - - - - - - - - - - 84845f36694cd07f-000000004a1be1ad-00000000628b5860 168.149.184.45 168.149.184.45 "IN" "India"
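To confirm the pattern across a larger access log, the check can be scripted. The following is a minimal sketch, assuming the field positions match the sample entry above (user name as the sixth whitespace-separated field, group as the seventh, with "-" meaning empty). The function name and the field positions are illustrative assumptions, not a documented log schema.

```python
import re

# Split on whitespace but keep double-quoted fields (which may contain spaces) whole.
FIELD = re.compile(r'"[^"]*"|\S+')

# Assumed positions, read off the sample entry above rather than a schema.
USER_IDX = 5
GROUP_IDX = 6


def missing_group(line):
    """Return True when the entry names a user but carries no group."""
    fields = FIELD.findall(line)
    if len(fields) <= GROUP_IDX:
        return False  # too short to be a full access log entry
    return fields[USER_IDX] != "-" and fields[GROUP_IDX] == "-"


# Shortened copy of the sample entry, for illustration only.
sample = (r'2022-07-23 09:48:16 "DP1-GINDE11_proxysg3" 212 113.14.61.18 '
          r'AD\User - policy_denied DENIED "Uncategorized" http://malahide.net/ 403')
print(missing_group(sample))  # prints True
```

Applying missing_group to every policy_denied entry in the affected time window would show whether all of them lack the group field.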

The start of the access denied messages was traced to a configuration change to the tenant policy that was specific to the GOIs (Groups of Interest, the groups that WSS proxy rules are based on).

The Auth Connector debug logs gathered at the time showed thousands of INSUFFICIENT_MEMORY errors, along with thread creation errors:

2022/07/28 11:47:57.102 [7204] [4912:7204] Thread creation failed.; status=8:0x8:Not enough storage is available to process this command.

Tracking a specific user who had logged in at the time and was trying to retrieve group information shows that the threads performing S4U authentication seem to have stopped processing. Below is an example of such a thread checking the user's group membership. It shows a 6.5 minute delay simply reading the group information from a token, an operation that usually completes in under 10 ms. CPU usage on the Auth Connector was not at 100% when the issue happened, although it was much higher than before.

AWS_09_35_am_utc_bcca-19684-220523052908.log 531632 2022/07/23 09:35:18.263 [8868] GroupName='RG-Browsing-Allow-IPTEAM-Exception'
AWS_09_35_am_utc_bcca-19684-220523052908.log 531633 2022/07/23 09:35:18.263 [8868] GroupName='RG-Browsing-Allow-Online Storage and Backup'
AWS_09_35_am_utc_bcca-19684-220523052908.log 531634 2022/07/23 09:35:18.263 [8868] GroupName='RG-Browsing-Allow-Online Storage and Backup-RO'
AWS_09_35_am_utc_bcca-19684-220523052908.log 531635 2022/07/23 09:35:18.263 [8868] GroupName='RG-Browsing-Allow-Proxy Avoidance'
AWS_09_35_am_utc_bcca-19684-220523052908.log 531641 2022/07/23 09:35:18.263 [8868] GroupName='RG-Browsing-Allow-Weapons'
AWS_09_35_am_utc_bcca-19684-220523052908.log 532088 2022/07/23 09:41:40.675 [8868] GroupName='RG-Browsing-Allow-Web-based Chat'
AWS_09_35_am_utc_bcca-19684-220523052908.log 532569 2022/07/23 09:41:40.810 [8868] GroupName='RG-Browsing-Allow-Web-based Email'
AWS_09_35_am_utc_bcca-19684-220523052908.log 533058 2022/07/23 09:42:29.747 [8868] GroupName='RG-Browsing-Deny-Instant Messaging'
AWS_09_35_am_utc_bcca-19684-220523052908.log 533206 2022/07/23 09:42:29.779 [8868] GroupName='RG-Browsing-Deny-Social Networking'
AWS_09_35_am_utc_bcca-19684-220523052908.log 533276 2022/07/23 09:42:29.810 [8868] GroupName='RG-Browsing-Deny-Streaming Media'
AWS_09_35_am_utc_bcca-19684-220523052908.log 533277 2022/07/23 09:42:29.810 [8868] GroupName='Web & Internet Security'
AWS_09_35_am_utc_bcca-19684-220523052908.log 533278 2022/07/23 09:42:29.810 [8868] Group Membership:
AWS_09_35_am_utc_bcca-19684-220523052908.log 533279 2022/07/23 09:42:29.810 [8868] Group no: 0, member: no, invalid group name: '(File'
AWS_09_35_am_utc_bcca-19684-220523052908.log 533280 2022/07/23 09:42:29.810 [8868] Group no: 1, member: no, valid group name: 'AG-Aon-APAC-Townsend Compl'

This thread simply did not run for 6-7 minutes, possibly indicating that another thread was locked at the time.
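A stall like this is easier to spot when the gaps between consecutive entries of the same thread are computed. The following is a minimal sketch, assuming each debug log line contains a timestamp in the form shown in the excerpt above, followed by the thread ID in square brackets. The name find_stalls and the one-second threshold are illustrative choices, not part of any Auth Connector tooling.

```python
import re
from datetime import datetime, timedelta

# Timestamp followed by the thread ID, e.g. "2022/07/23 09:35:18.263 [8868]".
STAMP = re.compile(r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\]")


def find_stalls(lines, threshold=timedelta(seconds=1)):
    """Return (thread_id, previous_time, current_time, gap) for every gap
    between consecutive entries of one thread that exceeds the threshold."""
    last_seen = {}
    stalls = []
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue  # line does not carry a timestamp and thread ID
        when = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S.%f")
        thread_id = match.group(2)
        previous = last_seen.get(thread_id)
        if previous is not None and when - previous > threshold:
            stalls.append((thread_id, previous, when, when - previous))
        last_seen[thread_id] = when
    return stalls
```

Run against the excerpt above, this reports a gap of roughly 382 seconds for thread 8868 between the entries logged at 09:35:18.263 and 09:41:40.675.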

Tracking the number of login requests at the time shows an increase from approximately 200-250 user logins per minute to about 15,000 when the change was applied. When a group was added to or removed from the WSS policy, the WSS cache of group information for the tenant was removed, and all subsequent requests into WSS had to undergo a login process via the Auth Connector to retrieve the groups.
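The effect of removing the whole cache at once can be illustrated with a toy model. This is only a sketch of the failure mode described above, not the WSS implementation: the GroupCache class, the fake_auth_connector function and the user count of 1000 are all hypothetical.

```python
class GroupCache:
    """Toy per-tenant cache mapping a user to a list of groups."""

    def __init__(self, lookup):
        self._lookup = lookup      # stand-in for a login via the Auth Connector
        self._groups = {}
        self.backend_calls = 0     # how often the stand-in was invoked

    def groups_for(self, user):
        if user not in self._groups:
            # Cache miss: the user must go through a full login.
            self.backend_calls += 1
            self._groups[user] = self._lookup(user)
        return self._groups[user]

    def flush(self):
        """Drop every cached entry, as a tenant-wide invalidation would."""
        self._groups.clear()


def fake_auth_connector(user):
    # Hypothetical lookup; a real one would query AD.
    return ["RG-Browsing-Allow-Web-based Email"]


cache = GroupCache(fake_auth_connector)
users = [f"AD\\user{i}" for i in range(1000)]

for user in users:                 # first requests populate the cache
    cache.groups_for(user)
warm_calls = cache.backend_calls

for user in users:                 # steady state: served from the cache
    cache.groups_for(user)
steady_calls = cache.backend_calls - warm_calls

cache.flush()                      # policy change removes the tenant cache
for user in users:                 # every active user now needs a new login
    cache.groups_for(user)
after_flush_calls = cache.backend_calls - warm_calls - steady_calls

print(steady_calls, after_flush_calls)  # prints 0 1000
```

In this model the lookup rate after a flush scales with the number of active users rather than with the number of new logins, which matches the jump in login requests observed when the change was applied.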

The fix was to change the caching mechanism on WSS so that the Auth Connector is not overloaded when a group change is made in the policy.