NOTE: Do not perform these operations against VMware Aria Automation (formerly vRealize Automation) refresh tokens.
Symptoms:
1. SSH into all the vIDM Appliances and check for available free space in /db partition using command "df -h”. If Used % shows value indicating the DB partition is reaching nearly out of space proceed with following steps.
2. To find the pgpool master, run the below command on any of the vIDM nodes
su root -c "echo -e 'password'|/usr/local/bin/pcp_watchdog_info -p 9898 -h localhost -U pgpool"
Note: If the file /usr/local/etc/pgpool.pwd is present on any vIDM node, then while executing all the below steps replace the password with the one present in the file.
3. Access the postgres DB using below command on the pgpool master vIDM node
export PGPASSWORD=`cat /usr/local/horizon/conf/db.pwd`
/opt/vmware/vpostgres/current/bin/psql -U postgres saas
4. Find out which OAuth2Client records are used to create Refresh Tokens , using below command
//Find out which OAuth2Clients are used to create refresh tokens, sorting by the number of RefreshToken using descending order
SELECT "idClient", count("idClient") AS counter FROM "OAuth2RefreshToken" GROUP BY "idClient" ORDER BY counter DESC;
Example: With the below output shows idClient listed on left row has generated respective number of OAuth2RefreshToken.
515 | 1079100
564 | 1035615
565 | 896414
621 | 669656
5. To review "authGrantTypes" fetch OAuth2Client.authGrantTypes detail info for each idClient use following commands
//Get OAuth2Client detail info, mainly to find out the "OAuth2Client.authGrandTypes" value of each OAuth2Client record
SELECT DISTINCT(ORT."idClient"), OC.* FROM "OAuth2RefreshToken" ORT, "OAuth2Client" OC WHERE ORT."idClient" = OC."id" ORDER BY ORT."idClient";
Example: Output will be as following showing for idClient 515 displaying "authGrantTypes"
idClient | 515
id | 515
idOrganization | 2
idIdentityProvider |
clientId | {nsx local manager for management domain fqdn FQDN.DOMAIN.LOCAL}.nsx_mgr_oauth_client.XXXXXXXXX
secrt | {secrt *XXXXXXX*}
scope | user
authGrantTypes | authorization_code password refresh_token
authorities |
redirectUri | https://{nsx local manager for management domain fqdn FQDN.DOMAIN.LOCAL}/vidm-oauth2-login
idEncryptionMethod | 2
tokenType | Bearer
length | 32
accessTokenTTL | 15
refreshTokenTTL | 43200
displayUserGrant | f
createdDate | 2023-XX-XX 00:00:58.199
timestamp | XXXXXXXXXXXXXX
idUser |
rememberAs |
strData |
idDevice |
resourceUuid |
templateAppProductId |
internalSystemClient | f
userDeviceId |
inheritanceAllowed | f
refreshTokenIdleTTL |
strConfiguration | {
+
6. Further, decide on which OAuth2Client records need to be updated by removing "refesh_token" value from OAuth2Client.authGrandTypes column.
Note: Not all OAuth2Client records should be updated. Only OAuth2Client records are responsible to create excessive amount of RefreshTokens (like millions of them) need to be updated.
7. To check the authGrantTypes of OAuth2Client records that will be updated:
SELECT id, "authGrantTypes" FROM "OAuth2Client" WHERE id IN (idClientN1, idClientN2) ORDER BY "authGrantTypes" DESC;
SELECT id, "authGrantTypes" FROM "OAuth2Client" WHERE id IN (564,515,565,621) ORDER BY "authGrantTypes" DESC;
id | authGrantTypes
-----+------------------------------------------
564 | refresh_token client_credentials
515 | refresh_token client_credentials
565 | refresh_token client_credentials
621 | authorization_code password refresh_token
8. Update authGrantTypes of OAuth2Client records by removing "refesh_token" value from OAuth2Client.authGrandTypes column.
UPDATE "OAuth2Client" SET "authGrantTypes"='client_credentials' WHERE id IN (idClientN1, idClientN2);
In following example, if the original value of "OAuth2Client.authGrantTypes" is 'authorization_code password refresh_token', then the updated value should be 'authorization_code password'.
Similarly, if the original value of "OAuth2Client.authGrantTypes" is refresh_token client_credentials ', then the updated value should be 'client credentials’
UPDATE "OAuth2Client" SET "authGrantTypes"='client_credentials' WHERE id IN (564,515,565);
UPDATE "OAuth2Client" SET "authGrantTypes"='authorization_code password' WHERE id IN (621);
9. After update queries are executed, run the SELECT query again in step 7 and the result should be like the following:
SELECT id, "authGrantTypes" FROM "OAuth2Client" WHERE id IN (idClientN1, idClientN2) ORDER BY "authGrantTypes" DESC;
SELECT id, "authGrantTypes" FROM "OAuth2Client" WHERE id IN (564,515,565,621) ORDER BY "authGrantTypes" DESC;
id | authGrantTypes
-----+----------------------------
564 | client_credentials
515 | client_credentials
565 | client_credentials
621 | authorization_code password
10. Following the update, the OAuth2RefreshToken table should stop quickly increasing the disk space size.
11. Then truncate the OAuth2RefreshToken again by following steps in https://knowledge.broadcom.com/external/article?legacyId=87609, and monitor the database disk consuming speed and it should NOT grow fast.
Note: For Database related changes, please take same time snapshot without memory for vIDM nodes in the cluster. In case of a situation requiring rollback to Snapshot, please engage VMware Support
1. Upgrade vIDM to 3.3.7
2. Post upgrading to 3.3.7 Refresh Token can be disabled
3.Need to check if the refresh token is disable for all vRops and NsXT.
To uncheck Refresh Token by navigating to vIDM Admin UI --> Catalog --> Settings --> Remote App Access. Also, prior any new integration, it is advised to check the list of existing clients as the naming convention is not helping to easily identify or associate the oauth2client with integrated product.
Note: The Refresh Token in the OAuth Client can be disable. When Session Cookie is not used, Application may generate excessive unnecessary login requests to vIDM and cause OAuth2RefreshToken table grow fast and consume huge disk space. Therefore, we recommend disabling refresh token and meanwhile to extend access token TTL from 15 mins to 10 hours.
1. Change the monitoring account for vROps, NSX-T from Domain admin user to Local admin user.
after that customer need to run service horizon-workspace restart on all IDM nodes appliance and can check if the disk space start to reduce.
https://knowledge.broadcom.com/external/article?legacyId=87609
Impact/Risks:
If the /db partition is full, all system operations will halt.