OpenShift Container using NFS datastore reports Warning: couldn't configure logger. error: [Errno 30] Read-only file system: '/var/opt/redislabs/log/rladmin.log'

Article ID: 392432

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Databases are inaccessible and out of sync because of I/O errors, and the partition is in read-only mode. "Read-only file system" errors are reported in the guest OS:
Warning: couldn't configure logger. error: [Errno 30] Read-only file system: '/var/opt/redislabs/log/rladmin.log'
  • OpenShift Container Platform clusters provisioned with NFS persistent storage backed by Nutanix storage are impacted.
  • Virtual machine configuration files residing on the NFS datastore may not report any issues and may still have read-write access.

  • The NFS datastore enters the All Paths Down (APD) state, which is logged in /var/run/log/vmkernel.log or /var/run/log/vobd.log:

YYYY-MM-DDTHH:MIN:SEC [APDCorrelator] 11740531562078us: [vob.storage.apd.start] Device or filesystem with identifier [naa.####################] has entered the All Paths Down state.
YYYY-MM-DDTHH:MIN:SEC [APDCorrelator] 11740553056140us: [esx.problem.storage.apd.start] Device or filesystem with identifier [naa.####################] has entered the All Paths Down state.
  • File system timeouts are observed in /var/run/log/vmkernel.log:
YYYY-MM-DDTHH:MIN:SEC cpu13:2097806)NFS: 6284: Status:File system timeout (Ok to retry). Retrying synchronous write I/O 13 of 25 times
 
  • RPC cancellations and I/O errors are observed in /var/run/log/vmkernel.log:
YYYY-MM-DDTHH:MIN:SEC cpu13:5303853)SunRPC: 3291: Synchronous RPC cancel for client 0x431217e02690 IP  ###.##.##.#. .8.1 proc 1 xid 0xa3c30b36 attempt 3 of 3
YYYY-MM-DDTHH:MIN:SEC cpu13:5303853)WARNING: NFS: 2581: Failed to get attributes (I/O error)
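
The log signatures listed above can be counted with a short script when reviewing copies of vmkernel.log or vobd.log. The following is a minimal sketch, not a supported tool: the regular expressions are derived only from the excerpts in this article, and the names SIGNATURES, classify_line, and summarize are hypothetical.

```python
import re
import sys
from collections import Counter

# Patterns derived from the log excerpts in this article (assumption:
# other ESXi builds may word these messages differently).
SIGNATURES = {
    "apd_start": re.compile(r"\[(?:vob|esx\.problem)\.storage\.apd\.start\]"),
    "nfs_timeout": re.compile(r"NFS: \d+: Status:File system timeout"),
    "rpc_cancel": re.compile(r"SunRPC: \d+: Synchronous RPC cancel"),
    "nfs_io_error": re.compile(r"NFS: \d+: Failed to get attributes \(I/O error\)"),
}


def classify_line(line):
    """Return the name of the first matching signature, or None."""
    for name, pattern in SIGNATURES.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines match each signature."""
    counts = Counter()
    for line in lines:
        name = classify_line(line)
        if name is not None:
            counts[name] += 1
    return dict(counts)


if __name__ == "__main__":
    # Usage: python scan_logs.py <copy of vmkernel.log> <copy of vobd.log>
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, summarize(handle))
```

A non-zero count for apd_start together with nfs_timeout or nfs_io_error is consistent with the symptoms described here, but it does not confirm the cause by itself.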
 

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

This issue is caused by disconnections of the NFS datastore that is mounted as a persistent volume (PV) in OpenShift Container Platform.
 
The uplink port (vmnic) used for NFS traffic may go down, as shown in the following log entries:
 
YYYY-MM-DDTHH:MIN:SEC: [netCorrelator] 12638715658481us: [vob.net.pg.uplink.transition.down] Uplink: vmnic2 is down. Affected portgroup: VM Network. 1 uplinks up. Failed criteria: 128
YYYY-MM-DDTHH:MIN:SEC: [netCorrelator] 12638715658486us: [vob.net.dvport.uplink.transition.down] Uplink: vmnic3 is down. Affected dvPort: 328/50 09 02 de fa 93 93 e3-d3 a1 a5 a3 2d 53 55 a2. 1 uplinks up. Failed criteria: 128
YYYY-MM-DDTHH:MIN:SEC: [netCorrelator] 12638715658537us: [vob.net.vmnic.linkstate.down] vmnic vmnic2 linkstate down.
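
To identify which uplinks reported a link-down event, the entries above can be parsed as follows. This is a minimal sketch that assumes the message format matches the excerpts in this article; the function name down_uplinks is hypothetical.

```python
import re

# Patterns derived from the netCorrelator excerpts in this article
# (assumption: the wording is the same on the ESXi build being checked).
UPLINK_DOWN = re.compile(
    r"\[vob\.net\.(?:pg|dvport)\.uplink\.transition\.down\] Uplink: (vmnic\d+) is down"
)
LINKSTATE_DOWN = re.compile(
    r"\[vob\.net\.vmnic\.linkstate\.down\] vmnic (vmnic\d+) linkstate down"
)


def down_uplinks(lines):
    """Return the sorted names of vmnics that logged a down event."""
    found = set()
    for line in lines:
        for pattern in (UPLINK_DOWN, LINKSTATE_DOWN):
            match = pattern.search(line)
            if match:
                found.add(match.group(1))
    return sorted(found)
```

Comparing the timestamps of these events with the APD start events helps the network team confirm whether the uplink flaps coincide with the storage disconnections.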
 
 
Although the NFS datastore recovers from the APD state and volumes are restored without disruption to other applications, this issue may still be observed on specific applications that are I/O sensitive.
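
On the guest side, a file system that has been remounted read-only can be confirmed from the mount table. The sketch below assumes a Linux guest that exposes /proc/mounts in the standard six-field format; the function name read_only_mounts is hypothetical.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


if __name__ == "__main__":
    # Assumption: run inside the affected Linux guest or container.
    with open("/proc/mounts") as handle:
        for mount_point in read_only_mounts(handle.read()):
            print("read-only:", mount_point)
```

Some mounts are read-only by design, so only an entry covering the affected path, such as /var/opt/redislabs, is relevant to this issue.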

Resolution

Engage the network team or vendor to further investigate the issue.

Additional Information