How to rotate and renew expired Contour secrets (contourcert and envoycert) in NAPP 3.2.x


Article ID: 345866


Updated On:

Products

VMware NSX Networking

Issue/Introduction

This KB provides a detailed guide to rotating and renewing expired secrets (contourcert and envoycert) in the projectcontour namespace in NAPP 3.2.x. It includes a step-by-step workaround, prerequisites, and verification steps for managing NAPP environments.

Symptoms:
  • NAPP fails to load the Features and Upgrade section of the UI.
  • Envoy pods (pods with names starting with "projectcontour-envoy-") in the projectcontour namespace show restarts.

Example

# napp-k get pods -n projectcontour  
             
NAMESPACE        NAME                                     READY  STATUS    RESTARTS  AGE  
projectcontour   projectcontour-contour-c476b465b-b5ttr    1/1   Running    1        23h  
projectcontour   projectcontour-contour-c476b465b-qhwv9    1/1   Running    1        23h                    
projectcontour   projectcontour-envoy-bnp2c                1/2   Running    161      23h          
projectcontour   projectcontour-envoy-p5zcw                1/2   Running    161      23h      
projectcontour   projectcontour-envoy-t9ddp                1/2   Running    162      23h         
          
  • The secrets (contourcert and envoycert) in projectcontour namespace are expired.

    Example:
    # napp-k get secret -n projectcontour envoycert -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates
    notBefore=Jul 5 19:11:31 2022 GMT
    notAfter=Jul 6 19:11:31 2023 GMT
    
    # napp-k get secret -n projectcontour contourcert -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates
    notBefore=Jul 5 19:11:31 2022 GMT
    notAfter=Jul 6 19:11:31 2023 GMT

     

  • The log for envoy pods in the projectcontour namespace (pods with name starting with "projectcontour-envoy-") reports TLS errors, such as:
# napp-k logs -n projectcontour <envoy pod name>
[2023-09-11 16:32:01.575][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:101] StreamListeners gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
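
The two symptom checks above (certificate expiry and TLS errors in the envoy log) can be wrapped in small shell helpers. The following is a minimal sketch, not part of NAPP or NSX tooling: the function names cert_status and count_tls_verify_failures are hypothetical, and it assumes openssl and grep are available on the host where it runs.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; these names are illustrative and not part of NAPP/NSX.

# Read a PEM certificate from stdin and report whether it is still valid now.
# "openssl x509 -checkend 0" exits 0 only if the certificate has not expired.
cert_status() {
    if openssl x509 -noout -checkend 0 >/dev/null 2>&1; then
        echo "valid"
    else
        echo "expired or unreadable"
    fi
}

# Read log text from stdin and print how many lines report a failed
# certificate verification. "|| true" keeps the exit status 0 when the count is 0.
count_tls_verify_failures() {
    grep -c 'CERTIFICATE_VERIFY_FAILED' || true
}

# Usage from an interactive shell on the NSX Manager (where napp-k is defined):
#   napp-k get secret -n projectcontour envoycert -o jsonpath='{.data.ca\.crt}' | base64 -d | cert_status
#   napp-k logs -n projectcontour <envoy pod name> | count_tls_verify_failures
```

A non-zero count together with an "expired or unreadable" result for either secret is consistent with the symptoms described in this KB.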

Note: The examples in this KB use the 'napp-k' command instead of 'kubectl', because they are intended to be run from the NSX Manager. To run these commands outside the NSX Manager appliance, replace 'napp-k' with 'kubectl --kubeconfig=/config/vmware/napps/.kube/config'.
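
Outside the NSX Manager, the substitution described in the note can be wrapped in a shell function so that the remaining commands need only a one-word change. This is a sketch under stated assumptions: napp_k and NAPP_KUBECONFIG are hypothetical names, kubectl must already be installed, and the kubeconfig must be readable at the configured path.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; napp_k and NAPP_KUBECONFIG are illustrative names.
# Assumption: the kubeconfig from the NSX Manager is available at this path.
NAPP_KUBECONFIG="${NAPP_KUBECONFIG:-/config/vmware/napps/.kube/config}"

# Forward all arguments to kubectl with the NAPP kubeconfig applied.
napp_k() {
    kubectl --kubeconfig="$NAPP_KUBECONFIG" "$@"
}

# Usage:
#   napp_k get pods -n projectcontour
```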


Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x

Cause

The secrets (contourcert and envoycert) are used for internal gRPC communication between Contour and Envoy.
These certificates were generated with a validity of 365 days in the 3.2.x release due to a limitation in the included version of Contour.

Resolution

NAPP version 4.x.x uses a version of Contour that allows a longer default validity period of 10 years for the secrets/certificates. Upgrading to 4.x.x automatically renews these certificates and extends their validity to 10 years.

Workaround:

Rotate the Contour secrets (contourcert and envoycert) using the workaround below. The new contourcert and envoycert will be valid for 10 years.

The contour_certgen_job.yaml file is attached to this KB.
MD5 = 4f4b2748d2ac9cb19a112b743533a8c7
Before using the file, confirm that the MD5 checksum of the downloaded file matches the hash above. If the checksum differs, re-download the file.

On Windows, open a command prompt, navigate to the directory containing the file, and run the command --> CertUtil -hashfile contour_certgen_job.yaml MD5

Example from the Windows command prompt:

C:\Users\username>CertUtil -hashfile contour_certgen_job.yaml MD5
MD5 hash of contour_certgen_job.yaml:
4f4b2748d2ac9cb19a112b743533a8c7
CertUtil: -hashfile command completed successfully.


On the NSX Manager node, check the file integrity by running the command --> md5sum contour_certgen_job.yaml
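
On a Linux host such as the NSX Manager, the comparison against the published hash can be scripted rather than checked by eye. The following is a minimal sketch assuming GNU md5sum is available; verify_md5 is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper; verify_md5 is an illustrative name.
# Usage: verify_md5 <file> <expected md5 hash>
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<hash>  <file>"; keep only the hash.
    actual=$(md5sum -- "$file" 2>/dev/null | cut -d' ' -f1)
    if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
        echo "MD5 OK"
    else
        echo "MD5 MISMATCH: re-download the file" >&2
        return 1
    fi
}

# Example with the hash published in this KB:
#   verify_md5 contour_certgen_job.yaml 4f4b2748d2ac9cb19a112b743533a8c7
```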

In "contour_certgen_job.yaml", update the contour <version> and <image_version> in every resource that contains the following entries:

  • "helm.sh/chart: contour-<version>"
  • "image: projects.registry.vmware.com/nsx_application_platform/clustering/third-party/contour:<image_version>"

To find the values currently in use:

  • Run "kubectl describe" on the contour deployment or pods.
  • In the manager support bundle, "napp/projectcontour/describe_all_result" shows the contour version and image version.
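
If the attached file contains the literal placeholder strings <version> and <image_version> (an assumption; inspect the file first, because it may instead carry concrete version numbers that must be edited by hand), the substitution could be scripted with sed. The sketch below uses a hypothetical helper name, fill_contour_versions, and writes the result to standard output so the original file is left untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes the YAML holds the literal text "<version>"
# and "<image_version>" exactly as written in this KB.
# Usage: fill_contour_versions <file> <chart version> <image version>
fill_contour_versions() {
    local file="$1" version="$2" image_version="$3"
    sed -e "s|contour-<version>|contour-${version}|g" \
        -e "s|contour:<image_version>|contour:${image_version}|g" \
        "$file"
}

# One possible way to read the image currently in use (standard kubectl jsonpath):
#   napp-k get deployment projectcontour-contour -n projectcontour \
#       -o jsonpath='{.spec.template.spec.containers[*].image}'
# Example (the version values come from your own environment):
#   fill_contour_versions contour_certgen_job.yaml <version> <image_version> > contour_certgen_job.filled.yaml
```

Review the generated file before applying it, since a wrong image tag would cause the certgen job to fail to pull its image.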

Run the following commands from the NSX Manager node to rotate and renew the expired contourcert and envoycert secrets. The commands can also be run on the control plane node; in that case there is no need to pass the kubeconfig option.

Note: “napp-k” is an alias for “kubectl --kubeconfig=/config/vmware/napps/.kube/config”

1. Take a backup:
    napp-k get secret -n projectcontour contourcert -o yaml > contourcert-backup.yaml
    napp-k get secret -n projectcontour envoycert -o yaml > envoycert-backup.yaml


2. Delete the current contourcert and envoycert:
    napp-k delete secret -n projectcontour envoycert contourcert

3. Generate the new contourcert and envoycert:
    napp-k apply -f <path>/contour_certgen_job.yaml -n projectcontour

4. Verify that the new contourcert/envoycert was generated:
    napp-k get secret -n projectcontour

5. Verify that the validity period of the new secrets is 10 years:
    napp-k get secret -n projectcontour envoycert -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates
    napp-k get secret -n projectcontour contourcert -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates

6. Restart all contour pods:
    napp-k patch deployment projectcontour-contour -n projectcontour -p '{"spec": {"template": {"metadata": {"labels":{"test": "restart"} } } } }' --type=merge

7. Restart all envoy pods:
    napp-k patch daemonset projectcontour-envoy -n projectcontour -p '{"spec": {"template": {"metadata": {"labels":{"test": "restart"} } } } }' --type=merge

8. Delete the projectcontour-contour-certgen job if it is present:
    napp-k get job projectcontour-contour-certgen -n projectcontour

    If the output of the above command shows the projectcontour-contour-certgen job, run the following command to delete it:
    napp-k delete job projectcontour-contour-certgen -n projectcontour

9. Verify that all contour/envoy pods are in the Running state and are not restarting:
    napp-k get pods -n projectcontour
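
The verification points in the steps above (backup taken in step 1, validity period in step 5, pod readiness in step 9) can be checked with small helpers instead of by reading the output manually. The following is a sketch, not part of NAPP or NSX tooling: the function names are hypothetical, and it assumes bash, awk, and GNU date (for "date -d").

```shell
#!/usr/bin/env bash
# Hypothetical helpers for steps 1, 5 and 9; names are illustrative only.

# Step 1: succeed only if the backup file is non-empty and holds a Secret manifest.
backup_looks_valid() {
    [ -s "$1" ] && grep -q '^kind: Secret' "$1"
}

# Step 5: read "openssl x509 -noout -dates" output from stdin and print the
# validity period in whole days. Requires GNU date.
validity_days() {
    local not_before="" not_after="" line start end
    while IFS= read -r line; do
        case "$line" in
            *notBefore=*) not_before="${line#*notBefore=}" ;;
            *notAfter=*)  not_after="${line#*notAfter=}" ;;
        esac
    done
    [ -n "$not_before" ] && [ -n "$not_after" ] || return 1
    start=$(date -u -d "$not_before" +%s) || return 1
    end=$(date -u -d "$not_after" +%s) || return 1
    echo $(( (end - start) / 86400 ))
}

# Step 9: read "get pods" output (with header) from stdin and print the names
# of pods whose READY column is not fully ready, for example 1/2.
not_ready_pods() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "NAME")  name_col = i
                if ($i == "READY") ready_col = i
            }
            next
        }
        name_col && ready_col && NF >= ready_col {
            split($ready_col, r, "/")
            if (r[1] != r[2]) print $name_col
        }
    '
}

# Usage from an interactive shell on the NSX Manager:
#   backup_looks_valid contourcert-backup.yaml && backup_looks_valid envoycert-backup.yaml
#   napp-k get secret -n projectcontour envoycert -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates | validity_days
#   napp-k get pods -n projectcontour | not_ready_pods
```

Checking the backups before running step 2 is a precaution, since the delete cannot be undone without them. For a 10-year certificate, validity_days should print roughly 3650, and not_ready_pods should print nothing once the contour and envoy pods have restarted cleanly.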


Attachments

contour_certgen_job