Manually compact etcd keyspace history when auto compaction fails due to database space exhaustion

Article ID: 323060

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
  • The api-server logs contain "etcdserver: mvcc: database space exceeded", which indicates that etcd has exhausted its storage space.
  • Running "etcdctl endpoint status -w json" from the control plane VM reports "alarm:NOSPACE" in the output.
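
As a quick check for the two symptoms above, the following is a minimal sketch of a shell helper that scans text on standard input for either message. The function name has_space_exhaustion is hypothetical, and this article does not state where the api-server logs are stored on the control plane VM, so no log path is assumed.

```shell
# Hypothetical helper (not part of etcdctl): succeeds if the text on stdin
# contains either of the two symptom strings quoted in this article.
has_space_exhaustion() {
    grep -q -e 'mvcc: database space exceeded' -e 'alarm:NOSPACE'
}

# Example usage on the control plane VM (not executed by this snippet):
#   etcdctl endpoint status -w json | has_space_exhaustion && echo "etcd is out of space"
```

The helper only matches fixed strings, so it can be fed either the api-server log output or the etcdctl output.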


Environment

VMware vCenter Server 7.0.x

Cause

etcd keeps a history of its entire keyspace, and this history must be compacted periodically to avoid performance degradation and storage space exhaustion. Auto compaction is enabled by default in the api-server. However, if etcd runs out of space before the next compaction runs (for example, because of a burst of requests such as a large number of pod creations), api-server auto compaction never recovers.

Resolution

To resolve this issue, run the following etcdctl commands manually from the control plane VM:
  1. Compact etcd:
etcdctl compact revision_number

The above command discards all etcd event history prior to the given revision (revision_number).

Note: The current revision of the etcd server is shown in the "revision" key of the output of "etcdctl endpoint status -w json". Compute revision_number by subtracting a constant (such as 10000) from the current revision, so that the most recent history is retained. For example, if the current revision is 25000, compact to revision 15000.
  2. Disarm the alarm:
etcdctl alarm disarm
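
The revision arithmetic described in the note can be sketched as a small shell helper. This is an illustration under assumptions, not a supported tool: the function name compact_target_revision is hypothetical, the default margin of 10000 is the example constant from the note, and the helper assumes the JSON output contains a numeric "revision" key as described above.

```shell
# Hypothetical helper (not part of etcdctl): reads the output of
# "etcdctl endpoint status -w json" on stdin and prints the first
# "revision" value found minus a safety margin (default 10000).
compact_target_revision() {
    margin="${1:-10000}"
    # Take the first "revision":<digits> pair and keep only the digits.
    current=$(grep -o '"revision": *[0-9][0-9]*' | head -n 1 | tr -dc '0-9')
    if [ -z "$current" ]; then
        echo "could not find a revision in the input" >&2
        return 1
    fi
    # Refuse to produce a zero or negative revision.
    if [ "$current" -le "$margin" ]; then
        echo "current revision $current is not larger than margin $margin" >&2
        return 1
    fi
    echo $((current - margin))
}

# Example usage on the control plane VM (not executed by this snippet):
#   rev=$(etcdctl endpoint status -w json | compact_target_revision 10000) || exit 1
#   etcdctl compact "$rev"
#   etcdctl alarm disarm
```

Computing the revision separately from running the commands makes it possible to inspect the value before anything is discarded, because compaction cannot be undone.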