Wavefront Nozzle's BOSH health metric exporter bosh-health-tas2to fails almost consistently

Article ID: 426309

Products

VMware Tanzu Application Service

Issue/Introduction

The Tanzu Observability by Wavefront Nozzle's BOSH Health Metric Exporter deployment "bosh-health-tas2to", which should be created and destroyed every 10 minutes, often fails in the Deployment Create phase while trying to create the "bosh-health-check" VM. The VM fails to come online and is left in an 'unresponsive agent' state.

When this occurs, Wavefront does not trigger any alert, both the create and delete cycles fail, and all subsequent create/delete cycles fail as well.

Deleting the unresponsive "bosh-health-check" VM sometimes succeeds, but the deployment create/delete cycle that follows does not reliably work. When the deletion does help, it typically allows one or two create/delete cycles to complete before the deployment fails again, and it never self-corrects.

Attempts to SSH into the failing "bosh-health-check" VM or to collect its logs with BOSH fail because the VM instance is in the 'unresponsive agent' state.

Environment

VMware Tanzu Application Service

Tanzu Observability by Wavefront Nozzle

Cause

The bosh-health-check VM has a duplicate IP address. This can happen when an earlier bosh-health-check VM was not deleted properly and remained assigned to its IP address, so the next bosh-health-check VM was created with an IP address that was already in use.

Resolution

Identify all the VMs that are using the same IP address as the bosh-health-check VM, and delete those VMs accordingly. Once no other VM holds that IP address, the next automatic deployment should complete successfully.
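
The identification step can be sketched as a small script that groups VMs by IP address and reports any address held by more than one VM. This is a minimal sketch, not a supported tool. It assumes the VM inventory is available as JSON shaped like the output of `bosh vms --json` (a top-level "Tables" list whose "Rows" carry "instance" and "ips" keys); verify that shape against your own BOSH CLI output before relying on it. The function names and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def vm_ip_pairs(bosh_vms_json):
    """Yield (instance, ip) pairs from JSON text.

    Assumption: the text has a top-level "Tables" list whose "Rows"
    entries carry "instance" and "ips" keys, as `bosh vms --json` is
    expected to produce. Check this against your CLI version.
    """
    doc = json.loads(bosh_vms_json)
    for table in doc.get("Tables") or []:
        for row in table.get("Rows") or []:
            # A VM with several addresses lists them separated by whitespace.
            for ip in row.get("ips", "").split():
                yield row.get("instance", "unknown"), ip


def find_duplicate_ips(pairs):
    """Return {ip: sorted instance names} for every IP held by more than one VM."""
    owners = defaultdict(set)
    for instance, ip in pairs:
        owners[ip].add(instance)
    return {ip: sorted(names) for ip, names in owners.items() if len(names) > 1}


# Hypothetical sample data: the instance names and addresses are made up.
SAMPLE = json.dumps({
    "Tables": [
        {"Rows": [
            {"instance": "bosh-health-check/aaaa", "ips": "10.0.4.20"},
            {"instance": "bosh-health-check/bbbb", "ips": "10.0.4.20"},
            {"instance": "router/cccc", "ips": "10.0.4.21"},
        ]}
    ]
})

if __name__ == "__main__":
    for ip, names in find_duplicate_ips(vm_ip_pairs(SAMPLE)).items():
        print(ip, "is claimed by:", ", ".join(names))
```

A VM that was orphaned at the IaaS level may no longer be known to BOSH and so may not appear in BOSH output at all. In that case the same grouping can be applied to (name, IP) pairs exported from the IaaS inventory by passing them directly to `find_duplicate_ips`.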