Fixing Read-only Disk for DSM PostgreSQL

Article ID: 425065

Updated On:

Products

VMware Data Services Manager

Issue/Introduction

A PostgreSQL database’s underlying disk may be automatically reverted to read-only by the kernel if it detects a filesystem inconsistency. When this occurs, PostgreSQL cannot accept writes until the issue is resolved.

At a high level:

  • PostgreSQL database appears stuck in InProgress state.

  • PostgreSQL write operations fail.

On the affected workload cluster node:

  • Filesystem is remounted read-only: EXT4-fs error: Detected aborted journal

    sudo dmesg | grep -iE "EXT4-fs error|aborted journal|remounted read-only"

    • If the disk is healthy: This will return nothing (empty output).

    • If the disk is corrupted: This will print lines such as "EXT4-fs error" or "aborted journal".


  • Ext4 journal on the PostgreSQL data volume (/dev/sdc) is aborted or corrupted.

  • Write operations fail even when the filesystem shows as read-write (rw mount).

  • Kernel logs repeatedly report journal or I/O errors.
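
As noted in the list above, the rw flag on the mount is not a reliable indicator on its own. A minimal write test makes the real state obvious; the mount point below is a placeholder for the PostgreSQL data volume's mount path in your environment:

# Mount flags may still show "rw" even though the journal is aborted
mount | grep sdc
# A direct write attempt reveals the real state; expect "Read-only file system" or an I/O error if the journal is aborted
touch <PGDATA_MOUNT_POINT>/.rw-test && rm <PGDATA_MOUNT_POINT>/.rw-test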

Environment

VMware DSM (Data Services Manager) and PostgreSQL

Cause

This issue can be triggered by a variety of factors:

  1. Temporary I/O interruptions via vSphere / CSI driver.

  2. Unclean VM shutdown or reboot during active writes.

  3. Heavy write load or sudden PostgreSQL writes during transient storage errors.

  4. Rarely, underlying hardware or disk failure may contribute.

Resolution

Prerequisites 

- SSH access to the workload cluster's control plane nodes and the provider VM

Potential Data Loss Considerations

fsck can only recover what it can reconcile from the journal.

  • Aborted transactions not yet written to disk may be lost.
  • In worst-case scenarios, some recent unflushed writes could be lost if they hadn’t reached disk before the journal aborted.

Best Practices

1. Backups first

    * Always take a snapshot of the underlying volume (VMware snapshot or storage snapshot) before running fsck.

    * This lets you roll back if the repair causes unexpected corruption.

2. Use WAL + replication for PostgreSQL

    * Ensure all WAL segments are replicated to a standby before performing repairs.

    * This minimizes data loss if fsck discards uncommitted changes.

3. Perform during maintenance window

    * Draining the node and performing fsck should be done during planned maintenance to reduce production impact.

4. Post-repair validation

    * After fsck, check PostgreSQL logs for missing or corrupt files.

    * Run consistency checks (pg_checksums if enabled, or application-level sanity checks).
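
The commands below illustrate best practices 1, 2, and 4 above. They are a minimal sketch only: the govc connection details, VM name, psql connection options, and PostgreSQL data directory path are placeholders for your environment, and pg_checksums applies only if data checksums are enabled and must be run while the server is stopped.

# (1) Snapshot the node VM before repairing, e.g. with govc (assumes vCenter connectivity is already configured)
govc snapshot.create -vm <NODE_VM_NAME> pre-fsck-snapshot

# (2) Confirm the standby has replayed all WAL (run against the primary)
psql -U postgres -c "SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;"

# (4) After the repair, verify on-disk page checksums (only if checksums are enabled; server must be stopped)
pg_checksums --check --pgdata=<PGDATA_DIRECTORY>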

 

Recovery

1. SSH to a control plane node

Fetch the sshkey from the provider VM:

 
root@xxx [ /opt/vmware/tdm-provider/provisioner ]# ls
config.yaml  dsm-tsql-config-from-pg.yaml  registry-credentials-auth  sshkey  sshkey.pub

SSH to a control plane node:

 
# SSH to control plane node
ssh -i sshkey capv@<CONTROL_PLANE_IP>
# Switch to sudo
sudo su

2. Identify affected volumes

List mounted volumes to confirm the PostgreSQL data disk:

 
mount | grep sdc

Check kernel messages for journal errors:

 
dmesg | tail -n 50 | grep sdc
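
For reference, the entries typically look like the following (the exact text varies by kernel version):

EXT4-fs error (device sdc): ext4_journal_check_start: Detected aborted journal
EXT4-fs (sdc): Remounting filesystem read-only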

Confirm write failures:

 
touch /path/to/mount/testfile

Expected: fails if filesystem is read-only.
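
On a read-only mount the failure typically looks like this (using the placeholder path from the command above):

touch: cannot touch '/path/to/mount/testfile': Read-only file system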

3. Drain the node

 
kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data --kubeconfig=<KUBECONFIG>

Wait for all pods to be evicted; ignore DaemonSet-managed pods.
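
To confirm the drain has completed, a check along these lines can be used; only DaemonSet-managed pods should remain on the node:

kubectl get pods -A -o wide --field-selector spec.nodeName=<NODE_NAME> --kubeconfig=<KUBECONFIG>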

Note: If kubectl commands hang or fail, the node hosting the api-server and/or etcd has likely become read-only. This raises the risk to cluster state: if the corruption affects the etcd data directory, fsck might discard corrupted write-ahead logs (WAL). If this is a single-node control plane, or if quorum is lost, the cluster state may be unrecoverable without an etcd snapshot restore, and in-flight API requests may be lost. Proceed with the following steps (skipping the kubectl commands) only if this data loss is acceptable; otherwise contact DSM support for manual intervention.

4. Stop kubelet

 
sudo systemctl stop kubelet

Prevents automatic remounts by CSI or kubelet while repairing the disk.
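
Before touching the disk, confirm kubelet is actually stopped, for example:

systemctl is-active kubelet
# Expected output: inactive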

Note: If you could not perform Step 3 (Drain) because kubectl is not available, you must also stop the container runtime to release disk locks:

 
sudo systemctl stop containerd

5. Unmount affected volume

 
mount | grep sdc
# If mounted, unmount explicitly:
sudo umount /dev/sdc
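
If umount fails with "target is busy", you can list the processes still holding the mount (assuming fuser from psmisc is available on the node) and stop them before retrying:

sudo fuser -vm /dev/sdc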

6. Run filesystem check and repair

 
sudo fsck.ext4 -f -y /dev/sdc
  • -f forces check even if filesystem is clean.

  • -y auto-confirms fixes.

  • Typical output may include:

 
Free blocks count wrong (XXXXX, counted=YYYYY). Fix? yes
Free inodes count wrong (XXXXX, counted=YYYYY). Fix? yes

Ensure fsck completes successfully and reports:

 
FILE SYSTEM WAS MODIFIED
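
As a sanity check, a second pass should complete without further modifications; if fsck keeps finding new errors on every run, suspect an underlying storage or hardware problem rather than a one-off journal abort:

sudo fsck.ext4 -f /dev/sdc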

7. Restart the VM

A reboot is required to clear any stale processes still referencing the old filesystem state and to ensure the kernel re-reads the partition table cleanly.

 
sudo reboot
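
Once the node is back up, repeat the earlier checks to confirm the volume now mounts read-write and the kernel is no longer logging ext4 errors:

mount | grep sdc
sudo dmesg | grep -iE "EXT4-fs error|aborted journal|remounted read-only"
# A quick write test on the mount point (placeholder path) should now succeed
touch /path/to/mount/testfile && rm /path/to/mount/testfile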

8. Uncordon the node

 
kubectl uncordon <NODE_NAME> --kubeconfig=<KUBECONFIG>
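
Verify that the node reports Ready and is schedulable again, and that workload pods return to it:

kubectl get nodes --kubeconfig=<KUBECONFIG>
kubectl get pods -A -o wide --field-selector spec.nodeName=<NODE_NAME> --kubeconfig=<KUBECONFIG>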

After these steps, trigger a reconcile of the PostgreSQL database; it should return to the Ready state.