I have an on-premise API Gateway which is integrated with SaaS API Management ("SaaS Portal"). I am suddenly receiving certificate errors.
I am seeing certificate CN values returned in the error message that do not match what I am expecting.
An example of such an error is as follows:
There are issues with the site's certificate chain (net::ERR_CERT_COMMON_NAME_INVALID).
2019-03-18T17:33:00.035-0400 WARNING 148 com.l7tech.server.policy.assertion.ServerHttpRoutingAssertion: 4042: Problem routing to https://sync-apim-usstg2-useast1.app.prod.e1.dev.ca.com:443/sync/v1/applications?identifier=<UUID>&nodeId=<nodeID>-<UUID>&start=1552745740981. Error msg: Unable to obtain HTTP response from https://sync-apim-usstg2-useast1.app.prod.e1.dev.ca.com:443/sync/v1/applications?identifier=<UUID>&nodeId=<nodeID>-<UUID>&start=1552745740981: java.security.cert.CertificateException: Certificate [cn=<domain_name>] path validation and/or revocation checking failed. Caused by: Certificate [cn=<domain_name>] path validation and/or revocation checking failed
SaaS Portal is hosted on AWS infrastructure, where the IP addresses behind AWS hostnames change frequently. By default, the API Gateway currently keeps its DNS cache indefinitely, until the SSG service is restarted. When an IP address changes in the AWS infrastructure, the API Gateway therefore keeps talking to the now out-of-date IP address for the hostname used in the routing assertion, and the host at that address returns an untrusted SSL certificate.
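The effect of an indefinite cache can be illustrated with a toy model. This is only a sketch and not the Gateway's actual resolver code: the class `DnsCacheSketch`, the hostname `portal.example.com`, and the IP addresses are all hypothetical, and a plain map stands in for real DNS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy DNS cache (hypothetical). A negative TTL means "cache forever". */
class DnsCacheSketch {
    /** One cached answer plus the time it was stored. */
    private static final class Entry {
        final String ip;
        final long storedAtSeconds;
        Entry(String ip, long storedAtSeconds) {
            this.ip = ip;
            this.storedAtSeconds = storedAtSeconds;
        }
    }

    private final long ttlSeconds;
    private final Function<String, String> resolver; // stands in for a real DNS lookup
    private final Map<String, Entry> cache = new HashMap<>();

    DnsCacheSketch(long ttlSeconds, Function<String, String> resolver) {
        this.ttlSeconds = ttlSeconds;
        this.resolver = resolver;
    }

    /** Returns the cached IP while the entry is fresh, otherwise resolves again. */
    String lookup(String host, long nowSeconds) {
        Entry e = cache.get(host);
        boolean fresh = e != null
                && (ttlSeconds < 0 || nowSeconds - e.storedAtSeconds < ttlSeconds);
        if (fresh) {
            return e.ip;
        }
        String ip = resolver.apply(host);
        cache.put(host, new Entry(ip, nowSeconds));
        return ip;
    }

    public static void main(String[] args) {
        String host = "portal.example.com"; // hypothetical hostname
        Map<String, String> dns = new HashMap<>();
        dns.put(host, "10.0.0.1");

        DnsCacheSketch forever = new DnsCacheSketch(-1, dns::get);
        DnsCacheSketch shortTtl = new DnsCacheSketch(30, dns::get);
        forever.lookup(host, 0);
        shortTtl.lookup(host, 0);

        dns.put(host, "10.0.0.2"); // the provider moves the hostname to a new IP

        System.out.println(forever.lookup(host, 60));  // prints 10.0.0.1 (stale)
        System.out.println(shortTtl.lookup(host, 60)); // prints 10.0.0.2 (refreshed)
    }
}
```

The cache with no expiry keeps returning the old address after the change, which corresponds to the stale-IP situation described above. The cache with a 30-second TTL picks up the new address on the next lookup after expiry.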
On-premise API Gateway integrated with SaaS Portal. This is also known as a "Hybrid Deployment".
When running an API Gateway with SaaS Portal, it is recommended to set the DNS cache lifetime (TTL) to a low value rather than leaving it at the default, which caches entries indefinitely.
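For background, a generic Java application can control the lifetime of successful DNS lookups through the standard `networkaddress.cache.ttl` security property. The sketch below shows that general JVM mechanism only; it is not the documented procedure for the API Gateway, which is described in the KB article for the product. The class name `DnsTtlSketch` and the value of 30 seconds are illustrative assumptions.

```java
import java.security.Security;

/** Generic JVM illustration (hypothetical class), not the Gateway's own configuration step. */
class DnsTtlSketch {
    /**
     * Sets how long, in seconds, the JVM caches successful name lookups.
     * This should run before the application performs its first lookup,
     * because the JVM reads the cache policy when networking is initialized.
     */
    static void setDnsCacheTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) {
        setDnsCacheTtl(30); // 30 is an example value, not a product recommendation
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

A low value trades a few extra DNS queries for faster recovery when a hostname moves to a new IP address.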
In the short term, restarting the Gateway service resolves the issue, because the restart flushes the DNS cache.
For a long-term fix, follow KB000012118, which describes how to change the DNS TTL in the API Gateway, and apply the change to all nodes in the cluster.