TAP with cert-manager fills up etcd database

Article ID: 394694

Products

VMware Tanzu Platform - TAP

Issue/Introduction

The etcd database repeatedly exceeds its storage limit on a Kubernetes cluster running TAP.

kubectl commands may fail with the following error:

Error: etcdserver: mvcc: database space exceeded

After compacting and defragmenting the etcd database, it quickly becomes full again.

Counting the keys stored in etcd by resource type then shows a disproportionately large number of cert-manager.io keys.

Example:

# etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key --endpoints https://etcd-0:2379 get /registry --prefix --keys-only | grep -v ^$ | awk -F '/'  '{ h[$3]++ } END {for (k in h) print h[k], k}' | sort -nr

250110 cert-manager.io
5921 events
579 configmaps
419 secrets
365 clusterroles
351 pods
338 sql.tanzu.vmware.com
251 tekton.dev
245 serviceaccounts
213 apiextensions.k8s.io
212 services
197 clusterrolebindings
193 internal.packaging.carvel.dev
161 replicasets
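
To narrow down which cert-manager resources and namespaces account for the keys, the same counting approach can be grouped one level deeper. The following is a minimal sketch, not an official procedure: the function name count_cert_manager_keys is hypothetical, and it assumes the usual key layout /registry/<group>/<resource>/<namespace>/<name> for namespaced custom resources.

```shell
# Hypothetical helper: count cert-manager.io keys per resource type and namespace.
# Reads etcd key names (one per line) on stdin.
# Assumes keys look like /registry/cert-manager.io/<resource>/<namespace>/<name>.
# For cluster-scoped resources the object name appears in place of the namespace.
count_cert_manager_keys() {
  awk -F '/' '$3 == "cert-manager.io" { h[$4 "/" $5]++ }
              END { for (k in h) print h[k], k }' | sort -nr
}

# Possible invocation, reusing the etcdctl flags from the example above:
# etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt \
#   --cert /etc/kubernetes/pki/etcd/server.crt \
#   --key /etc/kubernetes/pki/etcd/server.key \
#   --endpoints https://etcd-0:2379 \
#   get /registry/cert-manager.io --prefix --keys-only | count_cert_manager_keys
```

A single resource type and namespace dominating the output points to where the failing requests are being created.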

Environment

Tanzu Application Platform

Cause

This can be caused by conflicting certificate annotations on an Ingress, such as enabling ACME with a TAP self-signed certificate issuer.

Example:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: tap-ingress-selfsigned
    kubernetes.io/tls-acme: "true"
  name: myIngress
  namespace: myNamespace
spec:
  . . .

Self-signed certificate issuers don't support the ACME protocol. As a result, cert-manager creates many certificate requests that fail repeatedly, and these objects fill up etcd.
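
One way to look for Ingresses that carry both annotations is to list them with kubectl and filter the result. This is a sketch under stated assumptions: the function name filter_acme_with_issuer is hypothetical, the input is expected as four whitespace-separated columns (namespace, name, cluster-issuer annotation, tls-acme annotation), and kubectl is assumed to print <none> for a missing annotation. The matches are candidates for review only, since whether an issuer supports ACME has to be checked separately.

```shell
# Hypothetical helper: print Ingresses that set kubernetes.io/tls-acme to "true"
# and also name a cluster issuer.
# Input columns: namespace, name, cluster-issuer annotation, tls-acme annotation.
filter_acme_with_issuer() {
  awk '$4 == "true" && $3 != "<none>" { print $1 "/" $2 " issuer=" $3 }'
}

# Possible invocation (assumes kubectl access to the cluster):
# kubectl get ingress --all-namespaces --no-headers \
#   -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,ISSUER:.metadata.annotations.cert-manager\.io/cluster-issuer,ACME:.metadata.annotations.kubernetes\.io/tls-acme' \
#   | filter_acme_with_issuer
```

For the example Ingress above, the filter would print myNamespace/myIngress issuer=tap-ingress-selfsigned.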

Resolution