This KB article covers a scenario in which the following error appears when pushing an application or binding an application to a service:
Unable to interpolate credhub references: Post "https://credhub.service.cf.internal:8844/api/v1/interpolate": remote error: tls: unknown certificate
Here is an example of the error message as it may appear in the application logs (retrieved with the command cf logs $APP_NAME --recent):
2025-02-04T14:16:21.730-06:00 [STG/0] [ERR] Unable to interpolate credhub refs: Unable to interpolate credhub references: Post "https://credhub.service.cf.internal:8844/api/v1/interpolate": remote error: tls: unknown certificate
Tanzu Application Service
This issue can have several causes. The checks below can help narrow down which one applies:
Checking for Issues with the certificate for https://credhub.service.cf.internal
1. We can SSH into a Diego Cell VM and take a look at the Credhub certificate:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh diego_cell/0
2. Next, we can run openssl to view the certificate:
openssl s_client -connect credhub.service.cf.internal:8844 -showcerts << EOF
POST /api/v1/interpolate HTTP/1.1
Host: credhub.service.cf.internal
Content-Type: application/json
Connection: close
Content-Length: LENGTH_OF_BODY

{
"credentials": "testing"
}
EOF
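A successful result should include the certificate chain presented by Credhub, with each certificate enclosed between -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines.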
3. Next, we can copy the certificate and decode it using https://www.sslshopper.com/certificate-decoder.html
If the certificate is expired, then we will need to replace it via a leaf certificate rotation per this documentation
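As an alternative to pasting the certificate into a web decoder, the expiry can usually be checked locally with openssl. The following is a minimal sketch, assuming openssl is installed on the VM. The function name cert_expiry_status and the file path /tmp/credhub.crt are hypothetical and not part of the original procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a PEM certificate file is still valid.
# Only the first certificate in the file is inspected.
cert_expiry_status() {
  local cert_file="$1"
  # First make sure the file parses as a certificate at all.
  if ! openssl x509 -in "$cert_file" -noout >/dev/null 2>&1; then
    echo "unreadable"
    return 2
  fi
  # -checkend 0 exits 0 if the certificate has not expired as of now.
  if openssl x509 -in "$cert_file" -noout -checkend 0 >/dev/null 2>&1; then
    echo "valid"
  else
    echo "expired"
    return 1
  fi
}

# Example usage (assumed path; save the certificate copied in step 3 there first):
#   cert_expiry_status /tmp/credhub.crt
#   openssl x509 -in /tmp/credhub.crt -noout -subject -enddate
```

The second usage line prints the subject and expiry date, which is the same information the web decoder would show.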
To check for issues with Credhub accepting the application container certificate, we can start by pushing a Credhub test application as a process that will continuously send POST requests to the credhub.service.cf.internal endpoint.
1. We start by cloning the Credhub test application from https://github.com/tsagai/credhub-test-app:
git clone https://github.com/tsagai/credhub-test-app.git
2. Change into the project directory:
cd credhub-test-app
3. Push the application as a process:
cf push -u process
4. After the application has pushed successfully, we can check the application logs
cf logs credhub-test-app
We may see SSL-related errors like the following when the test application sends a test POST to the Credhub https://credhub.service.cf.internal endpoint:
2025-02-14T10:29:14.05-0600 [APP/PROC/WEB/0] ERR * OpenSSL SSL_read: error:0A000416:SSL routines::sslv3 alert certificate unknown, errno 0
2025-02-14T10:29:14.05-0600 [APP/PROC/WEB/0] ERR 100 20 0 0 100 20 0 256 --:--:-- --:--:-- --:--:-- 259
2025-02-14T10:29:14.05-0600 [APP/PROC/WEB/0] ERR * Closing connection 0
2025-02-14T10:29:14.05-0600 [APP/PROC/WEB/0] ERR curl: (56) OpenSSL SSL_read: error:0A000416:SSL routines::sslv3 alert certificate unknown, errno 0
5. Next, we can grab the application instance certificate (instance.crt) from the credhub-test-app container and save it to a local file named instances.crt:
cf ssh credhub-test-app -c 'cat /etc/cf-instance-credentials/instance.crt' > instances.crt
6. Transfer the instances.crt file to one of the Credhub VMs. If you generated instances.crt on a machine other than the Ops Manager VM (e.g. your local machine), you will first need to transfer the file to the Ops Manager VM. If you are already logged into your Ops Manager VM, you can skip to step 6c.
a. Copy instances.crt to the Ops Manager VM, where <YOUR OPSMAN PRIVATE KEY> is your private SSH key for the Ops Manager and $IP is the IP address of your Ops Manager VM:
scp -i ~/.ssh/<YOUR OPSMAN PRIVATE KEY> /path/to/instances.crt ubuntu@$IP:/home/ubuntu/instances.crt
b. SSH into the Ops Manager VM:
ssh -i ~/.ssh/<YOUR OPSMAN PRIVATE KEY> ubuntu@$IP
c. Copy instances.crt to the Credhub VM:
bosh -d $(bosh ds --column=name | grep ^cf-) scp ./instances.crt credhub/0:/tmp/instances.crt
7. SSH into the Credhub VM:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh credhub/0
8. Become root:
sudo su -
9. Source the var-store:
source /var/vcap/jobs/credhub/tmp/var-store
10. Export the Diego CA certificate from the Credhub mTLS trust store:
/var/vcap/data/packages/openjdk_17.0/*/jre/bin/keytool -list -rfc -keystore /var/vcap/data/jobs/credhub/*/config/mtls_trust_store.jks -alias mtls_ca-0-0 -storepass $MTLS_TRUST_STORE_PASSWORD >> /tmp/diegoCA.crt
We need to clean up the Diego CA certificate file, as keytool writes extra text (such as the alias name and entry metadata) around the certificate. Edit the diegoCA.crt file, for example in vim, so that it contains only the certificate, from the -----BEGIN CERTIFICATE----- line through the -----END CERTIFICATE----- line, and nothing else.
11. Browse to https://www.sslshopper.com/certificate-decoder.html and decode each of the certs inside the instances.crt file. Identify the "Diego Instance Identity Intermediate CA" certificate among them.
12. Copy the Diego Instance Identity Intermediate CA certificate and paste it into the diegoCA.crt file underneath the first certificate. diegoCA.crt should now contain 2 certificates.
13. Run the following command to validate the application instance certificate against the Diego certs:
openssl verify -CAfile /tmp/diegoCA.crt /tmp/instances.crt
If successful, we should see the following output:
/tmp/instances.crt: OK
14. If you get any other output, it indicates that the Diego intermediate CAs may be expired. We can confirm by decoding the certs in diegoCA.crt, and if they are expired, a full root CA rotation would need to be done per this documentation.
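The manual clean-up in step 10 and the decoding in step 11 can also be approximated on the command line. The sketch below is one possible approach and not the procedure from the original article. The function names extract_pem_blocks and split_pem_bundle and the output file names are hypothetical, and it assumes awk is available on the VM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the PEM certificate blocks from a file,
# dropping surrounding text (for example keytool's alias and entry lines).
extract_pem_blocks() {
  awk '/-----BEGIN CERTIFICATE-----/ { inside = 1 }
       inside { print }
       /-----END CERTIFICATE-----/ { inside = 0 }' "$1"
}

# Hypothetical helper: write each certificate of a PEM bundle to its own file
# named <prefix>-1.pem, <prefix>-2.pem, ... and print how many were written.
split_pem_bundle() {
  local bundle="$1" prefix="$2"
  awk -v prefix="$prefix" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = prefix "-" n ".pem"; inside = 1 }
    inside { print > out }
    /-----END CERTIFICATE-----/ { inside = 0; close(out) }
    END { print n + 0 }' "$bundle"
}

# Example usage (assumed file names):
#   extract_pem_blocks /tmp/diegoCA.crt > /tmp/diegoCA.clean.crt
#   split_pem_bundle /tmp/instances.crt /tmp/instance-cert
#   for f in /tmp/instance-cert-*.pem; do
#     openssl x509 -in "$f" -noout -subject -issuer -enddate
#   done
```

The openssl loop prints the subject, issuer and expiry date of each split certificate, which should make it possible to tell the intermediate CA from the instance certificate and to spot an expired one without a web decoder.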
We can gather the Credhub VM logs and line up the timestamps from the application logs with those in the Credhub VM logs, specifically the credhub.log, credhub.stdout.log, and credhub.stderr.log log files:
bosh -d $(bosh ds --column=name | grep ^cf-) logs credhub
For example, if the application log entry below has the timestamp 2025-02-04T14:16:21.730-06:00, we can convert this timestamp to UTC, which gives 2025-02-04T20:16:21.730Z.
2025-02-04T14:16:21.730-06:00 [STG/0] [ERR] Unable to interpolate credhub refs: Unable to interpolate credhub references: Post "https://credhub.service.cf.internal:8844/api/v1/interpolate": remote error: tls: unknown certificate
We can search for the timestamp 2025-02-04T20:16:21.730Z in the credhub.log, credhub.stdout.log, and credhub.stderr.log files to see if we find any error messages that may reveal more information on this issue.
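The timestamp conversion described above can be scripted. This is a minimal sketch that assumes GNU date (the -d option and the %3N format are GNU extensions and will not work with BSD or macOS date). The function name to_utc is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an ISO 8601 timestamp that carries a UTC offset
# into UTC with millisecond precision. Requires GNU date.
to_utc() {
  date -u -d "$1" +"%Y-%m-%dT%H:%M:%S.%3NZ"
}

# Example usage, run from the directory holding the extracted Credhub logs:
#   ts=$(to_utc "2025-02-04T14:16:21.730-06:00")
#   grep -h "$(printf '%s' "$ts" | cut -c1-19)" credhub.log credhub.stdout.log credhub.stderr.log
```

The usage example searches on the timestamp truncated to the second, since the Credhub entries are unlikely to share the exact millisecond of the application log entry. The exact timestamp format inside the Credhub logs is an assumption here and may need adjusting.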
One cause of this error could be having more than one NTP server configured across a given TAS foundation. For instance, if the Credhub VMs and Diego Cell VMs use 2 different NTP servers and the time drift between them exceeds a few minutes, the Credhub VM can reject the application container instance certificate (which lives on the Diego Cell hosting the application container); for example, a newly issued certificate may appear not yet valid to a VM whose clock is running behind. This results in the Post "https://credhub.service.cf.internal:8844/api/v1/interpolate": remote error: tls: unknown certificate error.
To confirm this particular cause, we can start by printing the output of date on all VMs included in the TAS deployment:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh -c 'date' | grep -v 'Running' | grep -v 'Task' | grep 'stdout'
Ideally, the time drift between the VMs is minimal, at around 1-2 seconds.
However, if this issue is present, we may see a time drift of 2-3 minutes between some VMs, particularly between the Credhub and Diego Cell VMs. In the example captured for this article, the Diego Cell VMs reported a time of 07:59:32 while the Credhub VM reported 07:57:34, a difference of nearly 2 minutes.
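Comparing many date outputs by eye is error-prone, so the spread can also be computed. The following is a rough sketch, assuming the VMs are asked for epoch seconds (date +%s) instead of the human-readable date, and that each output line ends with that number. The function name clock_spread_seconds is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read lines whose last field is an epoch timestamp in
# seconds and print the difference between the largest and smallest value.
clock_spread_seconds() {
  awk '{ sub(/\r$/, "") }            # drop a trailing carriage return, if any
       $NF ~ /^[0-9]+$/ {
         t = $NF + 0
         if (!seen || t < min) min = t
         if (!seen || t > max) max = t
         seen = 1
       }
       END { if (seen) print max - min; else print "no timestamps found" }'
}

# Example usage:
#   bosh -d $(bosh ds --column=name | grep ^cf-) ssh -c 'date +%s' | grep 'stdout' | clock_spread_seconds
```

Because the command does not run on every VM at exactly the same instant, a spread of a second or two is expected even on a healthy foundation. A spread in the range of minutes points to the drift described above.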
We can also try to restart the chrony service to resync the NTP service amongst the TAS deployment VMs:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh -c 'sudo systemctl restart chrony.service'
Additionally, if we check the NTP servers, we may see that there are 2 different NTP servers applied. We can check this via the command below:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh -c 'sudo cat /var/vcap/bosh/etc/ntpserver'
In the example captured for this article, some of the VMs (particularly the Diego Cell VMs) had an NTP server IP of IP1.###.###.###, whereas other VMs were using NTP server IP IP2.###.###.###. This mixing of NTP servers should not occur: the NTP server configuration should be uniform across all VMs, meaning that in this example either all VMs would have IP1.###.###.### as their NTP server or all VMs would have IP2.###.###.###.
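To spot mixed NTP configurations without reading every line, the output can be grouped by value. This sketch assumes that bosh ssh prints each result as "<instance>: stdout | <output>", which matches the grep 'stdout' filter used earlier but is still an assumption about the output format. The function name count_ntp_configs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: from "<instance>: stdout | <value>" lines, count how
# many instances report each distinct value, most common first.
count_ntp_configs() {
  awk -F'stdout [|] ' '
    NF > 1 {
      sub(/[ \t\r]+$/, "", $2)     # trim trailing whitespace
      counts[$2]++
    }
    END { for (v in counts) print counts[v], v }' | sort -rn
}

# Example usage:
#   bosh -d $(bosh ds --column=name | grep ^cf-) ssh -c 'sudo cat /var/vcap/bosh/etc/ntpserver' | count_ntp_configs
```

A single output line means every VM reports the same NTP configuration. More than one line indicates the mixed configuration described above.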
If we found that the credhub.service.cf.internal certificate is expired at step 3 of the "Checking for Issues with the certificate for https://credhub.service.cf.internal" section, then we will need to replace it via a leaf certificate rotation per this documentation.
If we get any output other than "OK" at steps 13-14 of the "Check for Issues with Credhub accepting the application container certificate" section, it indicates that the Diego intermediate CAs may be expired. We can confirm by decoding the certs in diegoCA.crt, and if they are expired, a full root CA rotation would need to be done per this documentation.
If we notice that more than one NTP server is applied to the VMs in the TAS deployment, we will first want to go to the BOSH tile > Director Config, scroll down to the "Recreate VMs deployed by the BOSH Director" checkbox, and make sure it is checked.
Next, we may want to assign a separate NTP server for the Director config. In this example, we are using time.google.com; however, other public NTP servers such as pool.ntp.org or time.nist.gov can be used as well.
After selecting the "Recreate VMs deployed by the BOSH Director" checkbox and entering the new NTP server(s), we can click the "Save" button and trigger an Apply Changes on the BOSH tile to correct the NTP syncing issue.
To confirm that the issue has been resolved, we can check the current time on the TAS deployment VMs again:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh -c 'date' | grep -v 'Running' | grep -v 'Task' | grep 'stdout'