Applications Return HTTP 503 Certificate Verification Errors During Rolling Restarts and Deployments

Article ID: 386660

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

During an operation that triggers a container restart (an app crash, a rolling app deployment or restart, etc.), some requests may fail with an HTTP 503 certificate verification error:

Example log message:
x_cf_routererror:"endpoint_failure (tls: failed to verify certificate: x509: certificate is not valid for any names, but wanted to match <GUID>)"
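To check whether a given access log line shows this specific failure, the field above can be matched with a small script. The following is a minimal sketch in Python; the function name `is_cert_name_mismatch` and the regular expression are illustrative assumptions based only on the example message shown here, not part of any Cloud Foundry tooling.

```python
import re

# Matches the x_cf_routererror field as it appears in the example above.
# The pattern is an assumption derived from that single example line.
ROUTER_ERROR = re.compile(r'x_cf_routererror:"(?P<reason>[^"]*)"')


def is_cert_name_mismatch(log_line: str) -> bool:
    """Return True if the line carries the certificate name mismatch error."""
    match = ROUTER_ERROR.search(log_line)
    if match is None:
        return False
    reason = match.group("reason")
    return (
        reason.startswith("endpoint_failure")
        and "x509: certificate is not valid for any names" in reason
    )


sample = (
    'x_cf_routererror:"endpoint_failure (tls: failed to verify certificate: '
    'x509: certificate is not valid for any names, but wanted to match <GUID>)"'
)
print(is_cert_name_mismatch(sample))
```

Counting matching lines over the window of a deployment gives a rough idea of how often requests are affected.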

Cause

This is caused by the current gorouter retry logic for apps with 1 or 2 instances.

If a request fails to connect to an app instance, gorouter will retry the request, making up to 3 attempts by default. However, if there are only 2 app instances, it will make only 2 attempts, and if there is only 1 app instance, it will make only 1 attempt. As a result, while an instance is restarting, an app with 1 or 2 instances has fewer chances to reach a healthy instance before the error is returned to the client.
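The relationship between instance count and the number of attempts can be illustrated with a short model. This is only a sketch of the behavior as described in this article, written in Python for readability; it is not gorouter's actual code, and the names `DEFAULT_MAX_ATTEMPTS` and `attempts_for` are hypothetical.

```python
# Illustrative model of the behavior described above, not gorouter source.
DEFAULT_MAX_ATTEMPTS = 3  # default number of attempts stated in this article


def attempts_for(instance_count: int, max_attempts: int = DEFAULT_MAX_ATTEMPTS) -> int:
    """Attempts made for one request, capped by the number of app instances."""
    if instance_count < 1:
        return 0  # no instances to route to
    return min(max_attempts, instance_count)


for instances in (1, 2, 3, 5):
    print(f"{instances} instance(s) -> {attempts_for(instances)} attempt(s)")
```

Under this model, an app needs at least 3 instances before a request gets the full default number of attempts.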

Resolution

Routing release version 0.336.0 contains a fix that improves the gorouter retry logic. For details, see:

https://github.com/cloudfoundry/routing-release/releases/tag/v0.336.0 

This issue should occur only rarely for stable apps that do not crash often.

If the issue occurs frequently and is causing problems, increasing the number of app instances to 3 or more will greatly reduce the chance of hitting it.

TAS/TPCF versions 6.x and later contain the fix for this issue, so upgrading to one of these versions resolves it.