Error: "log4j:ERROR Failed to flush writer" after VCD service crash

Article ID: 399867

Products

  • VMware Cloud Director
  • VMware Telco Cloud Platform

Issue/Introduction

  • The VCD service may fail to start, or may crash shortly after initialization.

  • The cell.log file in /opt/vmware/vcloud-director/logs/ shows a java.lang.OutOfMemoryError, followed by a heap dump that fails because of insufficient disk space.

  • The cell-runtime.log also includes the error message:

    log4j:ERROR Failed to flush writer, java.io.IOException: No space left on device

    This issue results in the VCD service becoming non-operational and may impact tenant access or administrative functionality within Cloud Director.

  • In some cases the VCD services do not crash, but the partition is close to full and the database grows very quickly.
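
The symptoms above can be checked quickly from a shell on the cell. The following is a minimal sketch, not an official diagnostic: the helper names fs_usage_pct and count_space_errors are hypothetical, and only standard df, awk, and grep are used.

```shell
#!/bin/sh
# Hypothetical helpers for spotting the symptoms described in this article.

# Print the usage percentage (digits only) of the filesystem that holds a path.
fs_usage_pct() {
    # df -P prints one POSIX-format line per filesystem; column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Count the lines in a log file that match the failure signatures from this article.
count_space_errors() {
    grep -c -E 'No space left on device|java\.lang\.OutOfMemoryError' "$1"
}
```

For example, assuming the log directory named in this article:

    fs_usage_pct /opt/vmware/vcloud-director/logs
    count_space_errors /opt/vmware/vcloud-director/logs/cell.log

A usage value near 100 together with a non-zero match count is consistent with the issue described here.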

Environment

  • vCD: 10.3.3, 10.6.0.1
  • TCP: 4.0, 4.0.1, 5.0, 5.0.1

Cause

  • The root cause is storage exhaustion caused by uncontrolled growth of the audit_trail table in the Cloud Director database.
  • This table records all user and system activity and can grow significantly in environments with integrations such as Container Service Extension (CSE) or Aria Operations.
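
To confirm that audit_trail is what is consuming the space, its on-disk size can be queried. This is a sketch that assumes the database is named vcloud and is reachable as the postgres user, as in the psql step of this article. The function names are hypothetical. pg_total_relation_size and pg_size_pretty are standard PostgreSQL functions.

```shell
#!/bin/sh
# SQL reporting the total on-disk size of audit_trail (table, indexes, and TOAST data).
audit_trail_size_sql() {
    printf '%s\n' "SELECT pg_size_pretty(pg_total_relation_size('audit_trail'));"
}

# Run the query as the postgres user against the vcloud database (assumed names).
# -A -t: unaligned, tuples-only output, so only the size string is printed.
audit_trail_size() {
    audit_trail_size_sql | sudo -i -u postgres psql -A -t vcloud
}
```

Running audit_trail_size on the database host prints a single human-readable size. A value that accounts for most of the database volume points to the cause described here.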

Resolution

  • Note: You must take snapshots of all VCD cells and back up the database before proceeding.
  • Note: This procedure requires downtime on all Cloud Director cells.

Take a backup of the Cloud Director database:
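
The article does not list a backup command. One possible approach, shown only as a sketch, is a pg_dump in custom format. It assumes the database is named vcloud and is reachable as the postgres user, as in the psql step of this article. The function names and the file naming scheme are hypothetical. Follow your own backup procedure if it differs.

```shell
#!/bin/sh
# Build a timestamped dump file name under a target directory (hypothetical naming).
backup_path() {
    printf '%s/vcloud-%s.dump\n' "${1%/}" "$(date +%Y%m%d-%H%M%S)"
}

# Dump the vcloud database in pg_dump custom format (-Fc), which pg_restore can read.
# The redirect runs as the invoking user, so the target directory must be writable.
backup_vcloud_db() {
    out=$(backup_path "$1") || return 1
    sudo -i -u postgres pg_dump -Fc vcloud > "$out" && printf '%s\n' "$out"
}
```

For example, with a hypothetical target directory:

    backup_vcloud_db /mnt/backup

Write the dump to a filesystem with enough free space, not to the partition that is already full.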

Reduce the current size of the audit_trail table:

  1. Stop services on all cells:

    /opt/vmware/vcloud-director/bin/cell-management-tool cell -i $(service vmware-vcd pid cell) -s

  2. Connect to the database:

    sudo -i -u postgres psql vcloud

  3. Clear the audit_trail table (this permanently deletes all recorded audit events):

    truncate table audit_trail;

  4. Reclaim space and optimize the database:

    vacuum full;
    vacuum analyze;

  5. Start the VCD service on the first cell:

    systemctl start vmware-vcd

  6. Start VCD on remaining cells after the first cell is online.
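
The article does not say how to tell that the first cell is online. The sketch below polls a health check before the remaining cells are started. It assumes the cell answers "Service is up." on /api/server_status, so verify that endpoint in your environment. The function names and the hostname in the usage example are hypothetical.

```shell
#!/bin/sh
# Retry a check command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
    attempts=$1; delay=$2; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # Sleep between attempts, but not after the last one.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Assumed health check: the cell reports "Service is up." on /api/server_status.
# -k skips certificate verification, which suits self-signed certificates only.
cell_is_up() {
    curl -k -s --max-time 10 "https://$1/api/server_status" | grep -q 'Service is up'
}
```

For example, to wait up to ten minutes for a first cell with a hypothetical hostname:

    wait_until 60 10 cell_is_up vcd-cell-01.example.com && echo "first cell is online"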

Configure automatic audit event cleanup:

To avoid future uncontrolled growth, configure Cloud Director to retain audit logs for a limited time (e.g., 10 days):

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n com.vmware.vcloud.audittrail.history.days -v 10
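
If the retention value is set from a script, the input can be validated before the tool is called. This is a minimal sketch: the function names are hypothetical, and it uses only the manage-config options shown in the command above.

```shell
#!/bin/sh
CMT=/opt/vmware/vcloud-director/bin/cell-management-tool
PROP=com.vmware.vcloud.audittrail.history.days

# Accept only a positive whole number of days (no sign, no decimals, no leading zero).
valid_days() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

# Apply the retention period with the same command this article uses.
set_audit_retention() {
    valid_days "$1" || { echo "retention must be a positive integer: $1" >&2; return 2; }
    "$CMT" manage-config -n "$PROP" -v "$1"
}
```

For example, set_audit_retention 10 applies the 10-day retention used in this article, while set_audit_retention abc is rejected before the tool is called.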