When one of the masters is down, minion grain items observed in RAAS may become inaccurate

Article ID: 321932

Updated On:

Products

VMware Aria Suite

Issue/Introduction

Symptoms:

The RAAS UI or API displays incorrect or obsolete minion grains data.
For example, in the following scenario:

a) The grain "atest" initially has the value "day1".
b) One of the masters, "master-2-new", is stopped.
c) The grain is updated to the value "day2".
d) The RAAS UI shows the "atest" grain as "day2".

day2
 

e) After the UI is refreshed, the "atest" grain is shown as "day1" again, which is incorrect.

day1
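
To confirm which value is actually current, the grain can be read directly from the minion instead of from RAAS. The following is a minimal sketch, assuming the Salt CLI is available on a master that is still online. The function name `live_grain` and the minion ID `web01` are hypothetical and do not come from this article.

```shell
# Hypothetical helper: read one grain straight from a minion via the Salt CLI.
# Assumption: run on a master that is still online and can reach the minion.
live_grain() {
    local minion="$1" grain="$2"
    # grains.get is the Salt execution function that returns a single grain value
    salt "$minion" grains.get "$grain"
}
```

For example, `live_grain web01 atest` queries the minion itself. If the minion reports "day2" while RAAS shows "day1", RAAS is displaying stale data.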


Environment

VMware Aria Automation Config 8.12.x

Resolution

This is a known issue and will be fixed in 8.11.2.


Workaround:

There are two available workarounds for this issue:

1. Bring the offline master back online. The grains will then be synced automatically.
    systemctl start salt-master 

2. Delete the cached data that corresponds to the offline master directly from the RAAS database.
a) Stop the RAAS service:
  systemctl stop raas

b) Log in to the RAAS database using psql.

c) Find the UUID of the offline master.
    Example: 

  SELECT uuid,master_id,last_seen FROM masters;
                 uuid                 |           master_id            |         last_seen
  --------------------------------------+--------------------------------+----------------------------
   d7b71040-3c16-4b66-819b-41f7311ac9bb | master22-new                   | 2023-01-16 11:09:21.017248
   25fe05bb-e3dd-404f-b7be-30ac2978c7c3 | saltstack_enterprise_installer | 2023-02-16 11:09:33.650634
  (2 rows)

  Note the UUID of the master that is down, in this case: d7b71040-3c16-4b66-819b-41f7311ac9bb

d) Delete the cached data:

  begin;
  DELETE FROM minion_grains WHERE master_uuid = 'd7b71040-3c16-4b66-819b-41f7311ac9bb';
  DELETE FROM minion_cache WHERE master_uuid = 'd7b71040-3c16-4b66-819b-41f7311ac9bb';

  -- If no errors are reported, commit. If psql reports an error, run "rollback;" instead.
  commit;

e) Start the RAAS service:

 systemctl start raas
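
Before running the DELETE statements in step d, it can help to see how many rows belong to the offline master. The following is a sketch only. `preview_master_rows_sql` is a hypothetical helper, and the table and column names are taken from the DELETE statements above. It only prints read-only SELECT statements. How you connect with psql depends on your installation and is not shown.

```shell
# Hypothetical helper: print read-only SQL that counts the rows tied to one master UUID.
# Table and column names (minion_grains, minion_cache, master_uuid) come from the
# DELETE statements in this article.
preview_master_rows_sql() {
    local uuid="$1"
    # Accept only a canonical UUID so a malformed value never ends up in the SQL text.
    if ! printf '%s' "$uuid" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
        echo "not a valid UUID: $uuid" >&2
        return 1
    fi
    printf "SELECT count(*) FROM minion_grains WHERE master_uuid = '%s';\n" "$uuid"
    printf "SELECT count(*) FROM minion_cache WHERE master_uuid = '%s';\n" "$uuid"
}
```

The printed statements can be pasted into the psql session from step b, or piped to psql on standard input. Comparing the counts with the row counts reported by the DELETE statements gives a quick sanity check before committing.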

Additional Information

Impact/Risks:

This issue affects multi-master (hot HA) setups (https://docs.saltproject.io/en/latest/topics/tutorials/multimaster.html), and only when one of the masters is down.
It is seen only when grains data has been updated since the master went down.