VIP AuthenticationHub 2.2.0 - OPA service fails to start

Products

Symantec Identity Security Platform - IDSP (formerly VIP Authentication Hub)

Issue/Introduction

The OPA service fails to start in one environment after deploying with the setting "--set ssp.featureFlags.accesscontrol.enabled=true". The same version (2.2.0.1466) previously worked in that region and currently works in other regions, and a different development region works with this setting enabled. The issue persists after reverting or removing the setting.

OPA pods fail with the following events:

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 28m (x126 over 13h) kubelet Pulling image "hubble.example.net:5000/example/linux/ae/vendor/opa:2.2.0.1466"
Warning BackOff 8m2s (x1486 over 13h) kubelet Back-off restarting failed container
Warning Unhealthy 3m3s (x3164 over 13h) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500

Logs print a 'rego_unsafe_var_error':

Defaulted container "ssp-opa" out of: ssp-opa, appd-injector (init)
{"level":"error","msg":"Bundle activation failed: 1 error occurred: ./x6f9bc63dxad6fx419ex9e61xf97c96e61a72auth.rego:231: rego_unsafe_var_error: var resN is unsafe","name":"authz","plugin":"bundle","time":"2024-03-19T14:19:15Z"}
{"level":"error","msg":"Bundle activation failed: 1 error occurred: ./x6f9bc63dxad6fx419ex9e61xf97c96e61a72auth.rego:231: rego_unsafe_var_error: var resN is unsafe","name":"authz","plugin":"bundle","time":"2024-03-19T14:20:15Z"}
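When the same error repeats on every bundle poll, it can help to reduce the log to the distinct file, line, and variable combinations. The following is a minimal sketch, assuming the log lines have been saved as text and follow the JSON format shown above; the function name find_unsafe_vars is hypothetical and not part of the product.

```python
import json
import re

# Matches the message format seen in the log lines above, for example:
#   ./<name>.rego:231: rego_unsafe_var_error: var resN is unsafe
UNSAFE_VAR = re.compile(
    r"(?P<file>\S+\.rego):(?P<line>\d+): "
    r"rego_unsafe_var_error: var (?P<var>\S+) is unsafe"
)


def find_unsafe_vars(log_lines):
    """Return sorted, de-duplicated (file, line, var) tuples from OPA log lines."""
    found = set()
    for raw in log_lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as the "Defaulted container" notice
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        match = UNSAFE_VAR.search(str(entry.get("msg", "")))
        if match:
            found.add((match["file"], int(match["line"]), match["var"]))
    return sorted(found)
```

Passing the two error lines above to find_unsafe_vars would yield a single entry, which confirms that one rule at one location in the generated rego file is responsible for every failure.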

Does enabling accesscontrol require other changes to policies?

Also, is it possible that a bad policy was created before the redeployment and now blocks service startup? The "rego_unsafe_var_error" started appearing intermittently a few hours before the redeployment, mixed in with mostly successful log entries. Now, however, the service logs show only rego failures.

Environment

VIP AuthHub 2.2

Cause

The policy rule contains special characters

Resolution

The error “rego_unsafe_var_error: var resN is unsafe” is reported only when there is a problem with a policy expression. In the 2.2.0 release, the problem can be either of the following:

A user expression written in an improper format. (This issue is addressed in the 3.1 release.)
The not equal (ne) operator used with a group parameter. (Example: (Clearing) - This issue was addressed in the 2.2.2 release.

Steps to solve the issue

If you have a tenant admin access token, do the following:

Get all policies and look for those that have expressions in the principal.user condition. Check each expression and fix any that are not in a valid format.
If you cannot identify any issue with the expressions, inactivate the policies that have expressions, then activate them one by one and check whether OPA reports an error while downloading the rego bundle.
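The one-by-one activation described above can be sketched as a small loop. This is an illustration only: set_active and bundle_activates are hypothetical callbacks that you would implement yourself (for example, by calling the admin API or running the SQL update, and by checking the OPA logs), and are not functions provided by the product.

```python
from typing import Callable, Iterable, List


def find_failing_policies(
    policy_ids: Iterable[str],
    set_active: Callable[[str, bool], None],
    bundle_activates: Callable[[], bool],
) -> List[str]:
    """Activate suspect policies one at a time and collect those that break OPA.

    set_active(policy_id, active) -- hypothetical: switches a policy on or off.
    bundle_activates()            -- hypothetical: returns True if OPA activated
                                     the next bundle without a rego error.
    """
    ids = list(policy_ids)

    # Start from a known-good state: every suspect policy inactive.
    for pid in ids:
        set_active(pid, False)

    failing = []
    for pid in ids:
        set_active(pid, True)
        if bundle_activates():
            continue  # OPA accepted the bundle, so this policy stays active
        failing.append(pid)
        set_active(pid, False)  # leave the offending policy inactive
    return failing
```

The error timestamps in the logs above are one minute apart, which suggests that bundle_activates would need to wait for at least one bundle download before it reports a result. Policies returned by the function remain inactive so that OPA can start while their expressions are corrected.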

If you do not have the tenant admin access token, do the following:

Get the policy rules that have expressions by running the following DB query:

SELECT * FROM iamauth.T_RULE_DETAILS where RULE LIKE 'user == %(%';
You would see the response like below

Check whether any of the expressions has an invalid format and, if so, fix it.
If you cannot identify any issue with the expressions, inactivate the policies that have expressions using the following SQL query, then activate them one by one and check whether OPA reports an error while downloading the rego bundle.

UPDATE T_POLICY SET STATUS=2 WHERE POLICY_ID = '<policy_id>';
To get the POLICY_ID associated with a rule in T_RULE_DETAILS, use the following query:

SELECT POLICY_ID FROM T_POLICY_RULE WHERE RULE_ID = '<RULE ID FROM T_RULE_DETAILS TABLE>';
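The two lookups above could also be combined into a single join, so that each rule containing an expression is listed next to its POLICY_ID. The sketch below is an assumption-laden illustration: it presumes that T_RULE_DETAILS has a RULE_ID column matching T_POLICY_RULE.RULE_ID, and it runs against an in-memory SQLite database with made-up rows rather than the real iamauth schema. The rule text in the sample rows is a placeholder, not a real expression format.

```python
import sqlite3

# Join of the two queries above. RULE_ID in T_RULE_DETAILS is an assumption;
# verify the column names against your schema before running this elsewhere.
SUSPECT_POLICIES_SQL = """
SELECT pr.POLICY_ID, rd.RULE_ID, rd.RULE
FROM T_RULE_DETAILS rd
JOIN T_POLICY_RULE pr ON pr.RULE_ID = rd.RULE_ID
WHERE rd.RULE LIKE 'user == %(%'
ORDER BY pr.POLICY_ID
"""


def demo():
    """Run the join against throwaway tables holding placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE T_RULE_DETAILS (RULE_ID TEXT, RULE TEXT);
        CREATE TABLE T_POLICY_RULE (POLICY_ID TEXT, RULE_ID TEXT);
        -- Placeholder data only; real rule text will look different.
        INSERT INTO T_RULE_DETAILS VALUES
            ('r1', 'user == placeholder(expr)'),
            ('r2', 'something else');
        INSERT INTO T_POLICY_RULE VALUES ('p1', 'r1'), ('p2', 'r2');
        """
    )
    rows = conn.execute(SUSPECT_POLICIES_SQL).fetchall()
    conn.close()
    return rows
```

Only the rule whose text matches the LIKE pattern is returned together with its policy, which gives the list of POLICY_ID values to inactivate in one pass.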