During application monitoring operations such as bootstrapping, activating application plugins, and updating content in the Cloud Proxy, the Salt process may become unresponsive.
This issue results in longer processing times and, ultimately, operational failures for end-users.
Common symptoms include error messages in the Salt master log.
Error Message 1
Cloud Proxy VM: Path of the log file: /data1/ucp-salt/master
2024-02-26 17:08:14,632 [salt.client :1903][ERROR ][161279] Message timed out
2024-02-26 17:12:18,144 [salt.master :1642][ERROR ][130] Received minion error from [<UUID>_<MOID>]: The minion function caused an exception
2024-02-26 17:12:18,148 [salt.master :1642][ERROR ][133] Received minion error from [<UUID>_<MOID>]: The minion function caused an exception.
Error Message 2
2024-05-29 05:05:07,026 [salt.transport.zeromq:328 ][ERROR ][171547] ReqServer clients tcp://0.0.0.0:4506
2024-05-29 05:05:07,043 [salt.transport.zeromq:330 ][ERROR ][171547] ReqServer workers ipc:///var/run/salt/master/workers.ipc
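To check a log for the symptoms above, the matching lines can be counted with a short shell sketch. The function name `count_salt_errors` is hypothetical, and the log file is passed as an argument because the exact file name under /data1/ucp-salt/master may differ in your environment.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a Salt master log that match
# the error signatures quoted in this article.
count_salt_errors() {
    log_file="$1"
    # -c prints the number of matching lines; -E enables alternation.
    grep -cE 'Message timed out|The minion function caused an exception|\[salt\.transport\.zeromq.*\[ERROR' "$log_file"
}

# Usage (the argument is an assumption; pass your actual log file):
# count_salt_errors /path/to/salt-master-log
```

A non-zero count only indicates that the signatures are present; it does not by itself confirm that the Salt process is unresponsive.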
Aria Operations 8.18.x
Solution 1: Restart the Salt Master container
To resolve this issue, stop and restart the `ucp-controlplane-saltmaster` container using the following commands:
SSH to the Cloud Proxy virtual machine.
docker stop ucp-controlplane-saltmaster
docker start ucp-controlplane-saltmaster
docker ps -a --filter "name=ucp-controlplane-saltmaster"
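The restart steps above can be wrapped in a small function that also confirms the container came back up. This is a minimal sketch, not a supported script: the function name `restart_saltmaster` is hypothetical, and it assumes the `docker` CLI is available to the logged-in user on the Cloud Proxy.

```shell
#!/bin/sh
# Sketch: restart the Salt master container and confirm that it is running.
CONTAINER="ucp-controlplane-saltmaster"

restart_saltmaster() {
    docker stop "$CONTAINER" || return 1
    docker start "$CONTAINER" || return 1
    # docker inspect prints "true" when the container is running.
    state="$(docker inspect -f '{{.State.Running}}' "$CONTAINER")"
    [ "$state" = "true" ]
}

# Usage:
# restart_saltmaster && echo "Salt master container is running"
```

The function returns a non-zero status if the container is not running after the restart, which is the cue to try Solution 2.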
Solution 2: Recreate the Salt Master container
If Solution 1 does not work in your environment, recreate the container using the following steps:
SSH to the Cloud Proxy virtual machine.
./ucp/ucp-config-scripts/ucp-firstboot.sh -a cleanup_dockers
./ucp/ucp-config-scripts/ucp-firstboot.sh
docker ps -a --filter "name=ucp-controlplane-saltmaster"
Recommendation:
After restarting the `ucp-controlplane-saltmaster` container, if you need to perform ARC operations in bulk, run them in batches of 50 to prevent further issues.
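The batching could be sketched as follows. The sketch reads identifiers, one per line, from standard input and hands them over in groups of 50. `run_batch` is a hypothetical placeholder, because this article does not name a command for the bulk operation; replace its body with whatever performs the operation in your environment.

```shell
#!/bin/sh
# Sketch: process identifiers (one per line on stdin) in batches of 50.
BATCH_SIZE=50

# Hypothetical placeholder: replace the body with the actual bulk operation.
run_batch() {
    echo "batch of $# item(s)"
}

process_in_batches() {
    set --    # reuse the positional parameters to hold the current batch
    while IFS= read -r item; do
        [ -n "$item" ] || continue    # skip empty lines
        set -- "$@" "$item"
        if [ "$#" -ge "$BATCH_SIZE" ]; then
            run_batch "$@"
            set --
        fi
    done
    # Flush the final, possibly smaller batch.
    [ "$#" -eq 0 ] || run_batch "$@"
}

# Usage (targets.txt is an assumed file name):
# process_in_batches < targets.txt
```

With 120 input lines, `run_batch` is called three times, with 50, 50, and 20 items.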