Ramdisk /var Partition Full and No vSAN Traces Available on ESXi Host

Article ID: 403921


Products

VMware vSAN

Issue/Introduction

Symptoms:

  •  No log files are generated in /var/log/vsantraces or /vsantraces.
  •  vdf -h reports 0% used for vsantraces, but the var ramdisk shows as full:

 
      Ramdisk         1k-blocks   Used     Available   Use%   Mounted on
      root            32768       29M      22800       8%     --
      etc             28672       232K     26236       0%     --
      opt             49152       0B       32768       0%     --
      var             3345        3345     0           100%   --
      tmp             263347      8972     263347      4%     --
      iofilters       1048576     0        1048576     0%     --
      shm             12727       0        12727       0%     --
      vsantraces      307200      0        307200      4%     --
      portlldpd-dmp   153600      0        153600      0%     --
      sph-initrd      163840      159736   4104        87%    --
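
To spot a full ramdisk without reading the table by eye, the check can be sketched as a small shell filter. This is only an illustration: the function name full_ramdisks is hypothetical, and it assumes the column layout shown in the excerpt above (Use% in the fifth column, under a header line that starts with "Ramdisk").

```shell
# full_ramdisks [LIMIT]: read vdf-style output on stdin and print the name and
# Use% of every ramdisk at or above LIMIT percent (default 90).
# Assumption: the ramdisk rows follow a header line whose first word is
# "Ramdisk", and Use% is the fifth whitespace-separated column.
full_ramdisks() {
  awk -v limit="${1:-90}" '
    $1 == "Ramdisk"                       { in_ramdisk = 1; next }
    $1 == "Tardisk" || $1 == "Filesystem" { in_ramdisk = 0; next }
    in_ramdisk && $5 ~ /^[0-9]+%$/ {
      pct = $5
      sub(/%/, "", pct)                 # strip the percent sign
      if (pct + 0 >= limit + 0) print $1, $5
    }'
}
```

On a host this would be used as: vdf -h | full_ramdisks 90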


   The vsantraced service is unable to start and fails with the error below:

    • [root@esxi:/vsantraces] /etc/init.d/vsantraced start
      Setting vsantraced CPU reservation to 0
      User selected volume /var/log/vsantraces/vsantraces does not exist
      Setting file size to 16 MB to fit on ramdisk
      Setting urgent file size to 8 MB to fit on ramdisk
      Setting lsom file size to 2 MB to fit on ramdisk
      Setting lsom verbose file size to  2 MB to fit on ramdisk
      Setting plog file size to  1 MB to fit on ramdisk
      Setting dom object file size to 1 MB to fit on ramdisk
      Setting object diagnositic IO file size to  1 MB  to fit on ramdisk
      Updated the config file to VSANTRACED_LAST_SELECTED_VOLUME=/vsantraces
      Storing traces to /vsantraces
      VSAN traces ramdisk is not empty, not restoring from persistent volume
      Possibly stale VSAN traces remain on VSAN traces persistent volume /scratch; all will be overwritten on next shutdown
      vsantraced: freeVolMB is AvailableM, traceMB is 4M
      Updated the config file to VSANOBSERVER_MAX_MB_SIZE=10
      vsanreaderd started
      vsantraced started
      vsantracedUrgen started
      vsantracedDiag started
      vsantracedLSOM started
      vsantracedLSOMV started
      vsantracedPLOG started
      Failed to start dom object vsantraced: 1
      vsantraced stopped
      vsantracedUrgen stopped
      vsantracedLSOM is not running
      vsantracedLSOMV stopped
      vsantracedPLOG stopped
      vsantracedDOMOb is not running
      vsantracedDiag is not running
      vsanreaderd stopped
      Persisting traces to /scratch/vsantraces
      Errors:
      Can not delete non-empty group: vsantraced
      Failed to clear vsantraced memory reservation
      [root@mophhqdesxi:/vsantraces]
  • Check vmkernel.log for signs that multiple processes are running for the same trace, for example vsantracedLSOMV.

    Log path: /var/run/log/vmkernel.log (view with less /var/run/log/vmkernel.log)

    2025-07-01T06:24:04.046Z In(182) vmkernel: cpu91:#####)FSS: 7391: Failed to open file 'vsanTracesLSOM'; Requested flags 0x1, world: ####[vsantracedLSOM], (Existing flags 0x1, world: 105618412 [vsantracedLSOM]): Busy
    'vsanTracesLSOMVerbose'; Requested flags 0x1, world: ####[vsantracedLSOMV], (Existing flags 0x1, world: 105621580 [vsantracedLSOMV]): Busy
    2025-07-01T06:24:04.149Z In(182) vmkernel: cpu95:105642926)FSS: 7391: Failed to open file 'vsanTracesObjectDiagIO'; Requested flags 0x1, world: #####[vsantracedDiag], (Existing flags 0x1, world: 105617813 [vsantracedDiag]): Busy

    2025-07-01T06:22:52.464Z Wa(180) vmkernel: cpu49:105642261)WARNING: Trace chardev already open!
    2025-07-01T06:22:52.468Z Wa(180) vmkernel: cpu2:105642263)WARNING: Trace chardev already open!
     
  • Check hostd.log for events reporting that the var ramdisk is full.

    Log path: /var/run/log/hostd.log (view with less /var/run/log/hostd.log)

    2025-07-01T05:30:04.232Z In(166) Hostd[10###3]: -->          value = "/var/run/vmware/watchdog/.vsantracedDiag.Vb1DBQ"
    2025-07-01T05:30:04.232Z In(166) Hostd[10##43]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 1015737 : The ramdisk 'var' is full.  As a result, the file /var/run/vmware/watchdog/.vsantracedDiag.Vb1DBQ could not be written.
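
The two log checks above can also be sketched as shell filters instead of paging through the files with less. The function names busy_trace_files and ramdisk_full_events are hypothetical, and the patterns are taken only from the log excerpts in this article, so other builds may word these messages differently.

```shell
# busy_trace_files: read vmkernel.log text on stdin and print, per trace file,
# how many "Failed to open file '...' ... Busy" messages were logged.
busy_trace_files() {
  sed -n "s/.*Failed to open file '\(vsanTraces[A-Za-z]*\)'.*Busy.*/\1/p" |
    sort | uniq -c | awk '{ print $2, $1 }'
}

# ramdisk_full_events [NAME]: read hostd.log text on stdin and print the number
# of "The ramdisk 'NAME' is full" events (NAME defaults to var).
ramdisk_full_events() {
  awk -v pat="The ramdisk '${1:-var}' is full" '
    index($0, pat) { n++ }
    END            { print n + 0 }'
}
```

On a host these would be used as: busy_trace_files < /var/run/log/vmkernel.log and ramdisk_full_events var < /var/run/log/hostd.log. A non-empty result from the first and a non-zero count from the second together match the symptoms described here.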

Environment

VMware vSAN 8.x 

VMware vSphere ESXi 8.x

Cause

The issue occurs when multiple vsantraced processes run at the same time for the same trace files, so the vsantraces folder is no longer updated. The underlying trigger is the var ramdisk filling up with continuous log entries from vsantraced. If no scratch location is configured for storing the vsantraces files, the trace data is written to the ramdisk instead. Once the var partition is full, the vsantraced service can no longer start properly.

In some cases, specific vSAN trace entries are logged so frequently that they exhaust the capacity of the volume holding the trace files. The trace files in the vsantraces location then stop being updated, and the vsantraced service fails to start.
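
To see which files are consuming a ramdisk such as var, the largest files under a directory can be listed. The sketch below is a generic illustration rather than a vSAN-specific tool: the function name largest_files is hypothetical, and it assumes find, wc, sort and head are available in the shell being used.

```shell
# largest_files DIR [COUNT]: print "size_in_bytes<TAB>path" for the COUNT
# largest regular files under DIR (default 10), largest first.
largest_files() {
  find "$1" -type f 2>/dev/null |
    while IFS= read -r f; do
      # $(( ... )) normalises any padding that wc adds around the number
      printf '%s\t%s\n' "$(( $(wc -c < "$f") ))" "$f"
    done |
    sort -rn | head -n "${2:-10}"
}
```

For example: largest_files /var 10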

Resolution

1.  Create a vsantraces folder on a local datastore (replace the path below with the path to your own local datastore):

    mkdir "/vmfs/volumes/Datastore/scratch/local datastore/vsantraces/"


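   Set the number of trace files (-f), the maximum trace file size in MB (-s) and the trace location (-p, the path of the folder created above) with the command below:

          esxcli vsan trace set -f 20 -s 50 -p "/vmfs/volumes/Datastore/local datastore/vsantraces"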

 If old trace files are present, remove them as follows:
 Navigate to the vsantraces folder by running cd /var/log/vsantraces
 List the contents of the folder by running ls -ahl
 
To delete older files, run the following (the 2023-05 date pattern is an example; adjust it to match the files to be removed):

    rm vsantraces*2023-05*.gz
    rm vsanObserver--2023-05*.gz
    rm vsantracesUrgent--2023-05*.gz
 
2. Validate the vsantraced service by restarting it with the command /etc/init.d/vsantraced restart
 
3. Restart the vsanmgmtd service with the command /etc/init.d/vsanmgmtd restart, which corrects the reported vsantraced service status.
 
4. Revert the vSAN trace values changed with esxcli vsan trace set in step 1, then validate the status of the vsantraced and vsanmgmtd services. Both services should be running successfully.
 
5. If the above steps do not start the vsantraced service, reboot the ESXi host, following the best practices in the document referenced below, so that the correct vsantraces location is applied. This is needed because multiple processes were running for the same trace files.
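
Before running the rm commands from step 1, it can help to preview exactly which files the patterns will match. The sketch below only lists files and deletes nothing. The function name list_old_traces is hypothetical, and the file-name patterns are the ones used in step 1.

```shell
# list_old_traces DIR YYYY-MM: print the compressed trace files in DIR whose
# names match the step 1 patterns for the given year-month. Nothing is deleted.
# The vsantraces* pattern also covers the vsantracesUrgent-- files.
list_old_traces() {
  dir="$1"
  month="$2"
  for f in "$dir"/vsantraces*"$month"*.gz "$dir"/vsanObserver--"$month"*.gz; do
    # An unmatched pattern stays literal, so keep only paths that exist
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
  return 0
}
```

For example: list_old_traces /var/log/vsantraces 2023-05. If the list looks right, the rm commands from step 1 can then be run with the same date pattern.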