Telegraf Agent Stop Failed and Custom Script Output Not Displayed in Aria Operations

Article ID: 422533

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  1. Stop operations on the Telegraf agent fail.

  2. Upgrading the agent fails with a "Content upgrade failed" error. The reported status is misleading: the agent status still shows "running" and both the collection status and cycle show green (successful), but the script status shows "not collecting".

Environment

8.18.x

Cause

  1. The Telegraf agent becomes unresponsive because the Salt master service on the cloud proxy is not running properly, which disrupts agent communication and custom script execution. The Salt master log contains errors similar to the following:

    cd /arc/Logs/salt/saltMaster.tar_extracted/ucp-salt/master
    
    ####-##-## ##:##:##,### [salt.transport.zeromq:#### ][ERROR   ][###] ReqServer clients tcp://#.#.#.#:####
    ####-##-## ##:##:##,### [salt.transport.zeromq:### ][ERROR   ][###] ReqServer workers ipc:///var/run/salt/master/workers.ipc
    ####-##-## ##:##:##,### [salt.master:####][ERROR][###] Requested method not exposed: _prep_auth_info
    ####-##-## ##:##:##,### [salt.master:####][ERROR][###] Requested method not exposed: _prep_auth_info
  2. Start, stop, and update actions on existing product-managed Telegraf agents fail with the following warning message:

    Agent Operation Failed: Please check the health of the Cloud Proxy and the Salt service. Retry the action if components are healthy. Agent operation failure, Control channel does not exist or could not be established. Check if VM null with UCPFQDN property CP-IP does not have a collector.
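
The log signatures listed under cause 1 can be checked with a short script. The following is a minimal sketch, not a supported tool: the function name `count_salt_master_errors` is hypothetical, and the log file path is an assumption that must be replaced with the actual Salt master log file in the extracted bundle directory.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that match the Salt master error
# signatures quoted in this article. The log file path is passed by the
# caller; this article does not name the exact log file.
count_salt_master_errors() {
    log_file="$1"
    # -c prints the number of matching lines; -E enables the alternation.
    grep -c -E 'Requested method not exposed: _prep_auth_info|ReqServer (clients|workers)' "$log_file"
}
```

Call it as `count_salt_master_errors <path-to-master-log>`. A non-zero count suggests the log shows the same signatures, but it does not by itself confirm the root cause.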

Resolution

  1. Restart the Salt master service on the cloud proxy by running the ucp-firstboot script:

    sh /ucp/ucp-config-scripts/ucp-firstboot.sh

  2. Once the Salt master is restored, reinstall or upgrade the Telegraf agent so that it is running and communicating correctly with Aria Operations.

  3. Reactivate a single custom script, then verify that all related scripts are enabled and that search results correctly display custom script output with the custom script <vm name>.
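
The restart command above can be wrapped so that a missing script or a failed run is easier to spot. This is a sketch under stated assumptions: the function name `run_firstboot` is hypothetical, and treating a non-zero exit status as failure is an assumption that this article does not confirm for `ucp-firstboot.sh`.

```shell
#!/bin/sh
# Sketch only: run the ucp-firstboot script named in this article and
# report its exit status. The default path comes from the article; the
# meaning of the exit status is an assumption.
run_firstboot() {
    script="${1:-/ucp/ucp-config-scripts/ucp-firstboot.sh}"
    if [ ! -f "$script" ]; then
        echo "firstboot script not found: $script" >&2
        return 2
    fi
    status=0
    sh "$script" || status=$?
    echo "firstboot exit status: $status"
    return "$status"
}
```

Run `run_firstboot` with no arguments on the cloud proxy to use the default path, then continue with the agent reinstall or upgrade once the Salt master is healthy.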