[Internal] Salt minion cache cannot be saved properly after upgrading Aria Automation Config to 8.17
Article ID: 373037
Updated On:
Products
VMware Aria Suite
Issue/Introduction
The environment includes:
One salt master node installed on RHEL 8.9
One raas node installed on RHEL 8.9
One postgres node installed on RHEL 8.9
One redis node installed on RHEL 8.9
More than 400 Linux/Windows virtual machines running the salt minion
The issue occurs after upgrading Aria Automation Config from 8.16.2 to 8.17.
Salt minion grains are saved to the PostgreSQL database whenever a salt-minion service restarts, and this save operation triggers the problem.
When saving the minion grains fails, other Salt features malfunction. For example, deploying a virtual machine from Aria Automation fails because Salt does not respond to the request. This happens because raas gets stuck repeatedly trying to save the minion grain cache and therefore cannot respond to other requests.
When the issue occurs, the /var/log/raas/raas log on the raas node contains a "Failed to save updated minion cache from master" error, as shown below:
2024-06-18 20:55:05,749 [raas.utils.rpc ][ERROR :216 ][Webserver:150466] Failed to save updated minion cache from master <master node FQDN>
Traceback (most recent call last):
  File "sqlalchemy/engine/base.py", line 1256, in _execute_context
    self.dialect.do_executemany(
  File "sqlalchemy/dialects/postgresql/psycopg2.py", line 912, in do_executemany
    cursor.executemany(statement, parameters)
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type bytea
CONTEXT: PL/pgSQL function minion_cache_grains_sign() line 3 at assignment
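To check quickly whether a raas node is affected, the log can be searched for the error text quoted above. The following is a minimal sketch only: the function name count_cache_save_errors is hypothetical, and the default path is the /var/log/raas/raas log file mentioned in this article.

```shell
#!/bin/sh
# Sketch: count how many times the minion cache save error appears in a raas log.
# The function name is illustrative; the default path is the log named in this article.
count_cache_save_errors() {
    log="${1:-/var/log/raas/raas}"
    # -F treats the pattern as a fixed string, -c prints the number of matching lines.
    grep -cF "Failed to save updated minion cache from master" "$log"
}

# Example usage on the raas node:
#   count_cache_save_errors
#   count_cache_save_errors /path/to/copied/raas.log
```

A result greater than zero, together with the "invalid input syntax for type bytea" line in the traceback, suggests the node is hitting the issue described here.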
Environment
Aria Automation Config 8.17.x
Cause
This issue occurs when the environment contains a large number of salt minions.
After upgrading to Aria Automation Config 8.17.x, saving the grain cache of a large number of salt minions to PostgreSQL can keep the raas service so busy that it fails to respond to other requests.
The exact number of salt minions needed to trigger the problem is currently unknown.
Resolution
Adjusting "sseapi_max_minion_grains_payload" in /etc/salt/master.d/raas.conf on the salt master node can resolve the problem.
The procedure is:
Log in to the salt master node over SSH.
Back up the config file with the command:
cp /etc/salt/master.d/raas.conf /tmp/raas.conf
Edit the file /etc/salt/master.d/raas.conf:
Remove the "#" before "sseapi_max_minion_grains_payload", if present, so that the setting is not commented out.
Change the value of "sseapi_max_minion_grains_payload" to 100.
Save and quit.
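The backup and edit steps above could also be scripted on the salt master node. The following is a sketch under stated assumptions, not a supported tool: the function name set_grains_payload is hypothetical, and it assumes the setting appears in raas.conf as a YAML "key: value" line, optionally prefixed with "#". It copies the file to the backup path first, then rewrites the setting from that copy.

```shell
#!/bin/sh
# Sketch: back up a Salt master config file, then uncomment and set
# sseapi_max_minion_grains_payload. Assumes a YAML "key: value" line.
set_grains_payload() {
    conf="$1"     # e.g. /etc/salt/master.d/raas.conf
    backup="$2"   # e.g. /tmp/raas.conf
    value="$3"    # e.g. 100
    key="sseapi_max_minion_grains_payload"

    # Stop if the setting is not in the file at all (commented out or not).
    if ! grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*:" "$conf"; then
        echo "${key} not found in ${conf}" >&2
        return 1
    fi

    # Back up the original file before changing anything.
    cp "$conf" "$backup" || return 1

    # Read from the backup and write the edited result over the original:
    # drop a leading "#" if present and replace the value.
    sed -E "s/^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*:.*/${key}: ${value}/" \
        "$backup" > "$conf"
}

# Example usage, matching the manual steps in this article:
#   set_grains_payload /etc/salt/master.d/raas.conf /tmp/raas.conf 100
```

The service restarts that follow are still required for the new value to take effect.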
On the raas node, run the command below to restart the raas service:
systemctl restart raas
On the salt master node, run the command below to restart the salt-master service: