Note: This issue is observed with Metric Registrar release v1.0.4. As a result, it can affect VMware Tanzu Application Service (TAS) for VMs versions v2.4.x, v2.5.x, v2.6.x, and v2.7.0.
Pre-checks
- Metric Registrar is enabled (Ops Manager UI > PAS tile > Metric Registrar)
- Metric Registrar Smoke Test Errand ran successfully in a previous or the current "Apply Changes"
- Apps Manager shows that the metric-registrar-monitor app is deployed under the system org and the metric-registrar-monitor space
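The last pre-check can also be confirmed from the cf CLI. The following is a minimal sketch, assuming a logged-in cf CLI session with access to the system org. The helper name app_listed is hypothetical (it is not part of the cf CLI) and simply looks for the app name in the first column of the cf apps output.

```shell
# Hypothetical helper: succeeds if the app named in $1 appears in the
# first column of `cf apps` output supplied on stdin.
app_listed() {
  awk -v app="$1" '$1 == app { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage (assumes you are logged in and can target the system org):
#   cf target -o system -s metric-registrar-monitor
#   cf apps | app_listed metric-registrar-monitor && echo "monitor app is deployed"
```

If the helper succeeds, the app is present in the space; the cf apps output also shows whether the app is currently started.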
Symptoms
The following symptoms have been observed when this issue happens:
- Intermittent failures when running cf login. Example failure seen when performing a cf login:
API endpoint: api.<system_domain>
FAILED
Server error, status code: 404, error code: 0, message
- Intermittent failures when running cf push. For example:
Error unmarshalling the following into a cloud controller error: 404 Not Found: Requested route ('api.<system_domain>') does not exist.
- Significant slowness or delay when running generic cf CLI commands, for example cf logs and cf app.
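Because these failures are intermittent, it can help to repeat a command several times and count how often it fails. The following is a minimal sketch, assuming a POSIX shell and a logged-in cf CLI. count_failures is a hypothetical helper, not part of the cf CLI, and the attempt count is an arbitrary choice.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
# $1 is the number of attempts; the remaining arguments are the command.
count_failures() {
  attempts=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}

# Usage (assumes a logged-in cf CLI):
#   count_failures 20 cf orgs    # prints the number of failed runs out of 20
#   time cf apps                 # rough check for the slowness symptom
```

A non-zero count on an otherwise healthy foundation is consistent with the intermittent failures described above, although it does not by itself confirm this issue as the cause.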
Cause
The following sequence of events leads to the symptoms described above:
- When Metric Registrar is enabled in VMware Tanzu Application Service (TAS) for VMs and the Metric Registrar Smoke Test errand runs successfully, the metric-registrar-monitor app is deployed under the system org, in the metric-registrar-monitor space. Because of a bug in the smoke test implementation, the metric-registrar-monitor app is not deleted after the errand completes successfully, so the app is left in a running state.
- The metric-registrar-monitor app leaks goroutines and ends up sending a large number of frequent requests to the Cloud Controller API (CAPI).