/config partition disk usage high and very high Alarm



Article ID: 330581


Updated On:

Products

VMware NSX

Issue/Introduction

Title: Alarm for /config partition disk usage high and very high
Event ID: manager_health.manager_config_disk_usage_high, manager_health.manager_config_disk_usage_very_high

Alarm Description

  • Purpose: Indicates that Corfu database disk usage on the /config partition has reached the high or very high threshold.
  • Impact: The alarm may be caused by a temporary heavy workload on the NSX Manager, in which case it clears automatically after a few hours. If high disk usage persists for a few days, the Corfu database might experience clustering instability and performance degradation. If disk usage keeps growing, the database switches to read-only mode and further configuration changes in NSX Manager fail.

Environment

VMware NSX-T Data Center

Resolution

Steps to Resolve

For 3.0.0 and higher

Resolution:

  • For 4.1.1 and lower, keep monitoring until disk usage drops on its own and the alarm clears within the next few hours.
  • For 4.1.2 and later, run the CorfuServer ODS Runbook from the CLI to get more insight into the health of the underlying infrastructure and the Corfu compactor. Note that this command might take up to 5 minutes to return, and it might fail if SHA is not in a healthy state.

    manager1> start invocation runbook CorfuServer

Note: The ODS Runbook above must be run as admin on one of the managers in the cluster.

Look for the result of step 4. The following example shows that the compactor is working correctly.

Step Number : 4
Step Action : Check trim token movement in the given time window (default is 24h)
Step Result : The result of the Corfu Trim Token Movement Check is {'result': <Result.SUCCESS: 'SUCCESS'>, 'message': 'Detected a successful log trim.', 'data': {'last_trim_date': '2024-04-18T19:02:40.256Z'}}
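If the runbook output is collected by a script, the Step Result line above can be checked programmatically. The following is a minimal Python sketch, not part of NSX: the function names `parse_trim_check` and `trim_is_recent` are hypothetical, and the code assumes that every Step Result line for this step follows the same shape as the example shown here (a Python-style dictionary with `result`, `message`, and an optional `data.last_trim_date`). The format of a failing result is not documented in this article, so treat that part as an assumption.

```python
import ast
import re
from datetime import datetime, timedelta, timezone


def parse_trim_check(step_result_line):
    """Extract (status, message, last_trim) from a 'Step Result' line.

    Assumes the payload looks like the example in this article.
    """
    # The payload starts at the first '{' and is a Python-style dict repr.
    payload = step_result_line[step_result_line.index("{"):]
    # An enum repr such as <Result.SUCCESS: 'SUCCESS'> is not a Python literal,
    # so reduce it to its quoted string value before parsing.
    payload = re.sub(r"<Result\.\w+: ('[^']*')>", r"\1", payload)
    parsed = ast.literal_eval(payload)

    last_trim = (parsed.get("data") or {}).get("last_trim_date")
    if last_trim is not None:
        # Timestamps in the example are UTC with a trailing 'Z'.
        last_trim = datetime.strptime(
            last_trim, "%Y-%m-%dT%H:%M:%S.%fZ"
        ).replace(tzinfo=timezone.utc)
    return parsed.get("result"), parsed.get("message"), last_trim


def trim_is_recent(last_trim, now, window=timedelta(hours=24)):
    """Return True if the last trim falls inside the window (default 24h)."""
    return last_trim is not None and now - last_trim <= window
```

A status other than `SUCCESS`, or a last trim date older than the check window, would be the cue to follow the escalation guidance in this article.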

If the alarm persists for a few days, or the ODS Runbook reports that the trim token is not moving, contact VMware Support.

Maintenance window required for remediation? Yes

  • For 4.2.1, check whether your NSX managers are affected by the known JDK memory issue, which can impair Corfu DB compaction. The relevant articles are KB 396719 and KB 390592.
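The version-specific guidance in this article can be summarized as a small lookup. The Python sketch below is illustrative only: the function names and message strings are hypothetical, and it assumes that only the first three components of the version number matter (so a build such as 4.2.1.x is treated as 4.2.1). Versions below 3.0.0 are outside the scope of this article, so the sketch returns no steps for them.

```python
# Guidance strings paraphrased from this article.
MONITOR = "Keep monitoring until disk usage drops and the alarm clears."
RUNBOOK = "Run 'start invocation runbook CorfuServer' as admin on one manager."
JDK_CHECK = "Check KB 396719 and KB 390592 for the known JDK memory issue."


def parse_version(version):
    """Turn '4.1.2' or '4.1.2.3' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")][:3]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def remediation_steps(version):
    """Return the steps this article suggests for a given NSX version."""
    v = parse_version(version)
    if v < (3, 0, 0):
        return []  # not covered by this article
    steps = [MONITOR] if v <= (4, 1, 1) else [RUNBOOK]
    if v == (4, 2, 1):
        steps.append(JDK_CHECK)
    return steps
```

For example, `remediation_steps("4.2.1")` returns both the runbook step and the JDK check, while `remediation_steps("3.2.0")` returns only the monitoring step.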