A "526 Invalid SSL Certificate" error is returned when accessing Apps Manager.
The Apps Manager logs also report that either an instance certificate or a CA certificate has expired.
Decoding the Diego Instance Identity Root CA certificate configured in /var/vcap/jobs/gorouter/config/gorouter.yml on the gorouter VM shows that the CA certificate has already expired. The expired CA certificate is referenced as ((diego-instance-identity-intermediate-ca-2-7.ca)) in the cf deployment manifest.
However, all certificates on the TAS foundation were rotated recently, and the Ops Manager web UI shows no warnings about expiring or expired certificates. The certificate that signs this cert, diego-instance-identity-root-ca-2-6, has a different expiration date than the one seen here. This is the source of the problem.
Reviewing the certificate diego-instance-identity-intermediate-ca-2-7 in CredHub (credhub get -n diego-instance-identity-intermediate-ca-2-7, then look at the "ca" section of the output) shows that its CA certificate is set to an expired version of /cf/diego-instance-identity-root-ca-2-6, which is incorrect.
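The expiry check described above can be reproduced with openssl. The following is a minimal sketch, assuming the PEM-encoded CA certificate has been copied out of gorouter.yml or out of the "ca" section of the credhub output; the helper names cert_dates and cert_is_unexpired are illustrative and not part of any Tanzu tooling.

```shell
#!/usr/bin/env bash
# Sketch: inspect a PEM certificate supplied on stdin.
# cert_dates and cert_is_unexpired are hypothetical helper names.

# Print the subject, issuer, and validity window (notBefore/notAfter).
cert_dates() {
  openssl x509 -noout -subject -issuer -dates
}

# Exit 0 if the certificate has not expired yet, non-zero otherwise.
# "-checkend 0" asks whether the certificate expires within 0 seconds.
cert_is_unexpired() {
  openssl x509 -noout -checkend 0 >/dev/null
}
```

For example, with the certificate saved to a file named ca.pem (a placeholder name), `cert_dates < ca.pem` prints the dates and `cert_is_unexpired < ca.pem || echo "expired"` flags an expired CA. Comparing the notAfter value from gorouter.yml with the one from the active root-ca-2-6 version makes the mismatch visible.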
This situation can occur in two separate scenarios.
Scenario 1
# Here is a snippet from the maestro_topology file included in the ops-manager support bundle.
# The intermediate-ca-2-7 is signed by a root-ca-2-6 version that is not currently engaged.
- name: "/cf/diego-instance-identity-root-ca-2-6"
  certificate_id: ce453c85-8533-4f38-9271-5a846968b35f
  signed_by: "/cf/diego-instance-identity-root-ca-2-6"
  versions:
  - version_id: f872c68c-491b-4d37-ac23-98b6e568b511
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: false  # <-- this would be "true" if signed_by_version below recorded that this version signed a cert below
    transitional: false  # <-- no transitional flag
    certificate_authority: true
    generated: true
  - version_id: 33a64630-1155-4841-98ad-c22531f1e218
    active: false  # <-- not the currently active CA
    signed_by_version: ''
    deployment_names:  # <-- not showing as in use by a deployment
    signing: true  # <-- indicative of signing a cert below
    transitional: false  # <-- no transitional flag
    certificate_authority: true
    generated: true
    valid_until: '2024-11-28T20:26:30Z'
  signs:
  - name: "/p-bosh/cf-000000000000000000/diego-instance-identity-intermediate-ca-2-7"
    certificate_id: 3c129541-cef1-4832-87fb-2f3ebe3aec6c
    signed_by: "/cf/diego-instance-identity-root-ca-2-6"
    versions:
    - version_id: 03f2bd11-f574-491c-842c-35ea6d78e965
      active: true
      signed_by_version: '33a64630-1155-4841-98ad-c22531f1e218'  # <-- this should be the newest version of the above root-ca-2-6, not the older non-transitional version
      deployment_names:
      - cf-000000000000000000
      signing: false
      transitional: false
      certificate_authority: true
In the output above, intermediate-ca-2-7 is signed by a root-ca-2-6 version that is not currently engaged in a three-round transitional rotation, as shown by the absence of a "transitional: true" flag. This can occur when "brute force" (single-round) rotations are used.
Scenario 2: signed_by_version is blank for diego-instance-identity-intermediate-ca-2-7
# Here is a snippet from the maestro_topology file included in the ops-manager support bundle.
# The intermediate-ca-2-7 is signed by a version of the root-ca-2-6 certificate that has been deleted.
# That is why the "signed_by_version" field is empty in this output.
- name: "/cf/diego-instance-identity-root-ca-2-6"
  certificate_id: ce453c85-8533-4f38-9271-5a846968b35f
  signed_by: "/cf/diego-instance-identity-root-ca-2-6"
  versions:
  - version_id: f872c68c-491b-4d37-ac23-98b6e568b511
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: false  # <-- this would be "true" if signed_by_version below recorded that this version signed a cert below
    transitional: false  # <-- no transitional flag
    certificate_authority: true
    generated: true
  signs:
  - name: "/p-bosh/cf-000000000000000000/diego-instance-identity-intermediate-ca-2-7"
    certificate_id: 3c129541-cef1-4832-87fb-2f3ebe3aec6c
    signed_by: "/cf/diego-instance-identity-root-ca-2-6"
    versions:
    - version_id: 03f2bd11-f574-491c-842c-35ea6d78e965
      active: true
      signed_by_version: ''  # <-- this should have featured a root-ca-2-6 guid; if it is blank, the version that signed it is gone
      deployment_names:
      - cf-000000000000000000
      signing: false
      transitional: false
      certificate_authority: true
In the output above, no version available in CredHub matches the signing metadata of this intermediate-ca-2-7 certificate. When this happens, the signing version cannot be located for reference in this output. This can occur when brute-force rotations and deletions are performed directly with credhub.
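The blank signed_by_version condition can also be checked mechanically. The following is a minimal sketch, assuming the topology is available as multi-line YAML with the field names seen in the snippets (name, signed_by, version_id, signed_by_version). The helper name find_unsigned_versions is hypothetical, and the heuristic (a blank signed_by_version on a certificate that is not signed by itself) is an assumption drawn from this example, not a maestro feature.

```shell
#!/usr/bin/env bash
# Sketch: list certificate versions whose signer version is missing.
# Reads maestro topology YAML on stdin; prints "<name> <version_id>".
# find_unsigned_versions is a hypothetical helper name.

find_unsigned_versions() {
  awk '
    # Strip one leading and one trailing quote character.
    # \047 is the octal escape for a single quote.
    function unquote(s) { gsub(/^["\047]|["\047]$/, "", s); return s }

    $1 == "-" && $2 == "name:"       { name = unquote($3); next }
    $1 == "signed_by:"               { signer = unquote($2); next }
    $1 == "-" && $2 == "version_id:" { version = unquote($3); next }

    # A self-signed root CA legitimately has a blank signed_by_version,
    # so only report certificates signed by a different certificate.
    $1 == "signed_by_version:" {
      if (unquote($2) == "" && name != signer) print name, version
    }
  '
}
```

A possible use is `maestro topology | find_unsigned_versions`, or feeding it the topology file from the support bundle. Any line it prints is a candidate for scenario 2 and should be confirmed by hand.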
Additional troubleshooting: If multiple versions are available and you want to see the rotation history, you can use:
- credhub curl -p "/api/v1/certificates/<certificate_id>/versions"
How to repair scenario 1:
In scenario 1 we still have access to the CA version that performed the signing. We can mark the newest CA version as transitional, run Apply Changes (AC) so that the new intermediate-ca-2-7 CA metadata can be used, and then resume the rotation of this CA using maestro. Here are the steps:
1. Using credhub directly, place the transitional flag on the newest CA version.
maestro topology --name /cf/diego-instance-identity-root-ca-2-6
# example output
- name: "/cf/diego-instance-identity-root-ca-2-6"
  certificate_id: ce453c85-8533-4f38-9271-5a846968b35f
  signed_by: "/cf/diego-instance-identity-root-ca-2-6"
  versions:
  - version_id: f872c68c-491b-4d37-ac23-98b6e568b511
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: false
    transitional: false  # <-- no transitional flag
    certificate_authority: true
    generated: true
  - version_id: 33a64630-1155-4841-98ad-c22531f1e218
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: true
    transitional: false  # <-- no transitional flag
    certificate_authority: true
    generated: true
    valid_until: '2024-11-28T20:26:30Z'

credhub curl -p /api/v1/certificates/ce453c85-8533-4f38-9271-5a846968b35f/update_transitional_version -d '{"version": "f872c68c-491b-4d37-ac23-98b6e568b511"}' -X PUT

# You can confirm that now the two CA certificate versions have swapped:
maestro topology --name /cf/diego-instance-identity-root-ca-2-6
# example output
- name: "/cf/diego-instance-identity-root-ca-2-6"
  certificate_id: ce453c85-8533-4f38-9271-5a846968b35f
  signed_by: "/cf/diego-instance-identity-root-ca-2-6"
  versions:
  - version_id: f872c68c-491b-4d37-ac23-98b6e568b511
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: false
    transitional: true  # <-- transitional flag
    certificate_authority: true
    generated: true
  - version_id: 33a64630-1155-4841-98ad-c22531f1e218
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: true
    transitional: false  # <-- no transitional flag
    certificate_authority: true
    generated: true
    valid_until: '2024-11-28T20:26:30Z'
2. Apply Changes on all Diego component tiles (TAS / TAS Iso / TAS Windows).
3. Resume the three-round "single_ca" rotation at step 3 of the documented procedure: https://docs.vmware.com/en/VMware-Tanzu-Operations-Manager/2.10/vmware-tanzu-ops-manager/security-pcf-infrastructure-advanced-certificate-rotation.html#rotate-a-single-ca-and-its-leaf-certificates-7
maestro update-transitional signing --name "/cf/diego-instance-identity-root-ca-2-6"
# example output
- name: "/cf/diego-instance-identity-root-ca-2-6"
  certificate_id: ce453c85-8533-4f38-9271-5a846968b35f
  signed_by: "/cf/diego-instance-identity-root-ca-2-6"
  versions:
  - version_id: f872c68c-491b-4d37-ac23-98b6e568b511
    active: true
    signed_by_version: ''
    deployment_names:
    signing: false
    transitional: false  # <-- swapped to false
    certificate_authority: true
    generated: true
  - version_id: 33a64630-1155-4841-98ad-c22531f1e218
    active: true
    signed_by_version: ''
    deployment_names:
    - cf-000000000000000000
    signing: true
    transitional: true  # <-- flag swapped to true
    certificate_authority: true
    generated: true
    valid_until: '2024-11-28T20:26:30Z'
4. Apply Changes on all Diego component tiles (TAS / TAS Iso / TAS Windows).
5. Complete the rest of the rotation using the linked maestro procedure.
Alternatively, you can use the solution for scenario 2 below.
How to repair scenario 2:
Unfortunately, in scenario 2 we no longer have access to the original "/cf/diego-instance-identity-root-ca-2-6" version that signed the "diego-instance-identity-intermediate-ca-2-7" in use, and it cannot be restored. What remains is a single "brute-force" style rotation. If you are already experiencing an issue with Ops Manager due to an expired cert, this is the fastest method of recovery:
1. Use credhub to perform a one-round rotation of intermediate-ca-2-7.
credhub regenerate -n "/p-bosh/cf-000000000000000000/diego-instance-identity-intermediate-ca-2-7"
# Make sure to repeat this for each TAS / TAS Iso / TAS Windows deployment
2. Apply Changes on all Diego component tiles (TAS / TAS Iso / TAS Windows).
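When several Diego deployments are involved, it may help to generate the regenerate commands up front and review them before running anything. The following is a minimal dry-run sketch: it only prints commands. It assumes each deployment stores the intermediate CA under the same /p-bosh/<deployment>/ path pattern seen in the example, which should be verified in CredHub first. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) one "credhub regenerate" command per deployment.
# Assumption: every deployment uses the same credential name under /p-bosh/.

# Build the CredHub path of the intermediate CA for one deployment name.
intermediate_ca_path() {
  printf '/p-bosh/%s/diego-instance-identity-intermediate-ca-2-7\n' "$1"
}

# Print the regenerate command for each deployment name given as an argument.
print_regenerate_commands() {
  local deployment
  for deployment in "$@"; do
    printf 'credhub regenerate -n "%s"\n' "$(intermediate_ca_path "$deployment")"
  done
}
```

For example, `print_regenerate_commands cf-000000000000000000` prints the command shown in step 1. Pass the deployment names of any TAS Iso or TAS Windows tiles as additional arguments, check each printed path against CredHub, and then run the commands by hand.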