Successfully installed Telegraf Agents randomly go into an unhealthy status across random VMs for 5-10 minute periods

Article ID: 388113

Updated On:

Products

VMware Aria Suite

Issue/Introduction

Successfully installed Telegraf Agents randomly go into an unhealthy status on random VMs for 5-10 minute periods, then return to a healthy status.

Environment

Aria Operations 8.16 and above

Cause

The ucp-minion collection cycle might not complete within 5 minutes.

Resolution

Workaround:

1. SSH to the Cloud Proxy as root.

2. Change to the /ucp/downloads/salt directory, then run ll to list its contents.

    cd /ucp/downloads/salt

    ll

3. Note the file permissions and owner of ucp-minion.zip. For example, the file permissions may be -rw-r--r-- and the owner and group admin admin.

4. Back up the existing ucp-minion.zip by renaming it.

    mv ucp-minion.zip ucp-minion.zip_bkp

5. Download the attached ucp-minion.zip, then use WinSCP or another file transfer utility to copy it to /ucp/downloads/salt on the Cloud Proxy.

6. Make sure the permissions and owner of the new zip file are the same as noted in step 3. If not, run the following commands to change the file owner and permissions.

    chown admin:admin ucp-minion.zip

    chmod 644 ucp-minion.zip

7. Go to Managed Telegraf Agents in the Aria Operations UI. Select all the Windows Servers on that Cloud Proxy and perform the Agent Action "Update".

8. Wait for the action to complete and confirm that every Windows Server on which the content upgrade was performed shows a "Last Action" status of "Content upgrade success". Then wait 10-15 minutes and check the agent status, which should change from "Agent Unhealthy" to "Agent Running".
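
The permission check in steps 3 and 6 can also be scripted rather than compared by eye. The following is only a minimal sketch, not part of the official workaround: the helper names file_meta and check_meta are hypothetical, and it assumes GNU stat (with the -c option) is available on the Cloud Proxy.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): print the octal mode and
# the owner:group of a file, e.g. "644 admin:admin". Assumes GNU stat.
file_meta() {
    stat -c '%a %U:%G' "$1"
}

# Hypothetical helper: succeed only if the file's current mode and
# owner:group match the expected string previously printed by file_meta.
check_meta() {
    # $1 = file to check, $2 = expected "mode owner:group"
    [ "$(file_meta "$1")" = "$2" ]
}

# Possible usage on the Cloud Proxy (commented out so that sourcing this
# file changes nothing):
#   cd /ucp/downloads/salt
#   before=$(file_meta ucp-minion.zip)        # step 3: record mode and owner
#   mv ucp-minion.zip ucp-minion.zip_bkp      # step 4: back up the old file
#   ... transfer the new ucp-minion.zip ...   # step 5
#   check_meta ucp-minion.zip "$before" || {  # step 6: fix only if different
#       chown admin:admin ucp-minion.zip
#       chmod 644 ucp-minion.zip
#   }
```

Recording the values before the rename means the comparison in step 6 does not rely on memory or on a copied screenshot of the directory listing.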

Attachments

ucp-minion.zip