Apps Manager may appear slow or laggy when a single Apps Manager instance enters a bad DNS resolution state. In this condition, the affected instance logs repeated nginx name resolution timeout errors for internal foundation endpoints, while other Apps Manager instances may remain healthy.
The issue may be isolated to one instance, for example APP/REV/1/PROC/WEB/5, and log entries on the affected instance show errors similar to:
`[error] 143#0: *123456 <hostname> could not be resolved (110: Operation timed out)`

where `<hostname>` is an internal foundation endpoint such as:

- `api.<system-domain>`
- `log-cache.<system-domain>`
- `app-usage.<system-domain>`

In some cases, manual connectivity tests such as `curl` from within the affected container may still succeed, and elevated CPU usage may also be observed on the affected instance.
VMware Tanzu Application Service
The issue is likely caused by the internal Nginx resolver within a specific Apps Manager instance becoming unresponsive or hanging. This results in the Nginx process failing to resolve the system domains required to proxy requests, leading to 30-second timeouts (110: Operation timed out) and high CPU as the process attempts to handle the backlog of stalled requests.
To restore healthy behavior, the affected Apps Manager instance must be restarted. This clears the Nginx runtime state and re-initializes the internal resolver.
1. Review the Apps Manager logs to identify which specific instance index is reporting the `110: Operation timed out` errors.
2. Note the instance index (e.g., `5`).
3. Restart the affected instance: `cf restart-app-instance apps-manager-js-blue <instance_index>`

*Note: Replace `apps-manager-js-blue` (or `apps-manager-js-green`) with the correct app name and `<instance_index>` with the index identified in the logs.*
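Steps 1 and 2 can be sketched as a small shell helper that reads Apps Manager log output on standard input and counts `110: Operation timed out` errors per instance index. This is a minimal sketch, not part of the product: the function name `count_resolver_timeouts` is hypothetical, and it assumes each log line carries a source tag ending in `PROC/WEB/<index>`, as in the `APP/REV/1/PROC/WEB/5` example above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads log lines on stdin and prints
# "<error_count> <instance_index>" per instance, highest count first.
# Assumes each line contains a source tag ending in PROC/WEB/<index>.
count_resolver_timeouts() {
  grep -F '110: Operation timed out' \
    | sed -nE 's#.*PROC/WEB/([0-9]+).*#\1#p' \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{print $1, $2}'
}
```

One possible use is `cf logs apps-manager-js-blue --recent | count_resolver_timeouts`. An instance index with a much higher count than the others is the likely candidate for `cf restart-app-instance`.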
If the issue recurs, collect troubleshooting data before restarting the instance:
- bosh-dns logs
- rep logs
- garden logs