The Envoy pods show a READY status of 1/2, indicating that one container is not ready. The Envoy container logs show gRPC-related errors:
kubectl logs -n tanzu-system-ingress envoy-xxxxx -c envoy
...
[./source/extensions/config_subscription/grpc/grpc_stream.h:193] StreamRuntime gRPC config stream to contour closed since 32s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
The Contour logs show the initial gRPC config fetch timing out:
kubectl logs -n tanzu-system-ingress contour-xxxxx
...
extensions/config_subscription/grpc/grpc_subscription_impl.cc:130] gRPC config: initial fetch timed out for type.googleapis.com/envoy.service.runtime.v3.Runtime
TKGm 2.5.x
This issue occurs when the certificates used between Envoy and Contour expire. Specifically, the problem lies in the contourcert or envoycert secret in the tanzu-system-ingress namespace.
Envoy and Contour use mTLS to establish the gRPC configuration stream. If the CA or TLS certificate is expired, Envoy cannot authenticate Contour’s xDS server, causing Envoy pods to remain in a 1/2 (not ready) state.
To verify that the certificates are expired:
#List secrets in the namespace
kubectl get secret -n tanzu-system-ingress
#Inspect each secret in YAML format (replace <secret-name> with contourcert or envoycert from the previous output)
kubectl get secret -oyaml -n tanzu-system-ingress <secret-name>
#Copy the ca.crt or tls.crt value from the previous output (it will be base64-encoded)
#Decode the base64 string and check the certificate's validity
echo "<base64 string gathered previously>" | base64 -d | openssl x509 -text -noout
#Review the Validity section of the output for both ca.crt and tls.crt, and compare the results from the contourcert and envoycert secrets
#You may see a discrepancy like the one below
contourcert - ca.crt (EXPIRED)
Validity
Not Before: Jan 01 10:00:00 2024 GMT
Not After : Jan 01 10:00:00 2025 GMT
envoycert - ca.crt (VALID)
Validity
Not Before: Jan 01 10:00:00 2025 GMT
Not After : Jan 01 10:00:00 2026 GMT
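The manual copy-and-decode steps above can also be collapsed into a short script. The following is a minimal sketch, assuming the secrets store PEM data under the keys ca.crt and tls.crt as described above. The helper names cert_status and check_secret_certs are hypothetical and are not part of kubectl or openssl.

```shell
#!/usr/bin/env bash

# Sketch: read one PEM certificate from stdin, print its validity dates,
# then print VALID or EXPIRED. cert_status is a hypothetical helper name.
cert_status() {
  local pem
  pem="$(cat)"
  # Show the notBefore/notAfter dates for reference.
  printf '%s\n' "$pem" | openssl x509 -noout -dates
  # "-checkend 0" exits 0 only if the certificate has not expired yet.
  if printf '%s\n' "$pem" | openssl x509 -noout -checkend 0 >/dev/null; then
    echo "VALID"
  else
    echo "EXPIRED"
  fi
}

# Sketch: check ca.crt and tls.crt in both secrets (requires cluster access).
# The dot in each key name must be escaped inside the jsonpath expression.
check_secret_certs() {
  local ns="tanzu-system-ingress" secret key
  for secret in contourcert envoycert; do
    for key in 'ca\.crt' 'tls\.crt'; do
      echo "== ${secret} / ${key}"
      kubectl get secret "$secret" -n "$ns" -o "jsonpath={.data.${key}}" \
        | base64 -d | cert_status
    done
  done
}
```

Running check_secret_certs prints the validity dates for all four certificates, followed by VALID or EXPIRED for each, which makes a mismatch between contourcert and envoycert easy to spot.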
Since these certificates are managed by cert-manager, the simplest fix is to delete the expired secrets. cert-manager will automatically regenerate fresh ones.
1. Back up the secrets
Always back up the existing secrets in case you need to review or restore them later.
kubectl get secret contourcert -n tanzu-system-ingress -o yaml > contourcert-backup.yaml
kubectl get secret envoycert -n tanzu-system-ingress -o yaml > envoycert-backup.yaml
2. Delete the expired secrets
kubectl delete secret contourcert -n tanzu-system-ingress
kubectl delete secret envoycert -n tanzu-system-ingress
3. Wait for new secrets to be created
cert-manager will automatically reconcile and issue fresh certificates. You can verify with:
kubectl get secrets -n tanzu-system-ingress
Once the new secrets are created, Envoy will pick them up and recover on its own.
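If you prefer not to re-run the command by hand, the wait in step 3 can be scripted. This is a minimal sketch, assuming only that the regenerated secrets keep the same names. The helper wait_for_secret is hypothetical, and the 5-second polling interval and 120-second default timeout are arbitrary choices.

```shell
#!/usr/bin/env bash

# Sketch: poll until the named secret exists again, or give up after a timeout.
# wait_for_secret is a hypothetical helper, not a kubectl subcommand.
# Usage: wait_for_secret <namespace> <secret-name> [timeout-seconds]
wait_for_secret() {
  local ns="$1" name="$2" timeout="${3:-120}" waited=0
  until kubectl get secret "$name" -n "$ns" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for secret ${name}" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "secret ${name} present"
}
```

For example, `wait_for_secret tanzu-system-ingress contourcert && wait_for_secret tanzu-system-ingress envoycert` returns once both secrets exist. Afterwards, `kubectl get pods -n tanzu-system-ingress` should show the Envoy pods returning to 2/2.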