Troubleshooting and addressing accumulation of tombstones in a Platform Services Controller(embedded or external)

Article ID: 318924

Updated On:

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
Multiple issues can occur if a Platform Services Controller has more than 100,000 tombstone entries. Below this threshold, the symptoms in this article are likely unrelated to tombstones.

To determine the number of tombstone entries on a Platform Services Controller Appliance (embedded or external), run this command:

/opt/likewise/bin/ldapsearch -H ldap://PSC_FQDN -x -D "cn=administrator,cn=users,dc=vsphere,dc=local" -w 'password' -b "cn=Deleted Objects,dc=vsphere,dc=local" -s sub "" -e 1.2.840.113556.1.4.417 dn | perl -p00e 's/\r?\n //g' | grep '^dn' | wc -l

Notes:
  • If the default SSO domain name has been changed, the dc value will need to be updated to match your environment.
  • If all nodes in the environment are Windows, a PSC appliance will need to be deployed to complete these steps. The appliance does not have to be connected to the current SSO domain to run the command above against a remote node.
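
The first note can be illustrated with a small helper. This is a minimal sketch, assuming the base DN is built by turning each dot-separated label of the SSO domain name into a dc= component, as the default vsphere.local and dc=vsphere,dc=local pairing suggests. The function name domain_to_base_dn is hypothetical and not part of any VMware tooling.

```shell
# Hypothetical helper: convert an SSO domain name into the dc=... form
# used in the -D and -b arguments of the ldapsearch command.
# Assumption: each dot-separated label maps to one dc= component.
domain_to_base_dn() {
    printf 'dc=%s\n' "$1" | sed 's/\./,dc=/g'
}

# Example: build the Deleted Objects search base for a custom domain.
# "sso.example.com" is a placeholder domain name.
# echo "cn=Deleted Objects,$(domain_to_base_dn sso.example.com)"
```

With the default domain, domain_to_base_dn vsphere.local prints dc=vsphere,dc=local, which matches the value used in the commands in this article.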
Tombstone removal is recommended if the ldapsearch count above returns more than 100,000 tombstone entries. Above this threshold, you may experience one or more of these symptoms:
  • A Platform Services Controller or vCenter Server deployment takes several hours to complete.
  • A Platform Services Controller deployment fails with the error:
Firstboot script execution error
Could not configure the Service Control Agent administration group
  • Repointing a vCenter Server to another Platform Services Controller with the cmsso-util script intermittently fails during move-services.
  • Intermittent CPU spikes (150-400%) for the vmdir process. 
    • The Appliance process name is vmdird.  
    • The Windows process name is VMWareDirectoryService.
  • Slow logins (~3-5 minutes) to the Web Client with a PSC or domain account.
  • vCenter Server Appliance upgrades or migrations fail due to space constraints. This issue is due to extra space consumed by the tombstones within vmdir.
    • A vCenter Server Appliance upgrade fails during Stage 2 export with the error:
"/storage/db/vmware-vmdir/data.mdb', '[Errno 28] No space left on device"
    • /var/log/firstboot/vmafd-firstboot.py_XXXX_stderr.log contains the following error message:
Error: [('/storage/db/cis-export-folder/vmafd/data/vmdir/data.mdb', '/storage/db/vmware-vmdir/data.mdb', '[Errno 28] No space left on device')]
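
The tombstone count and the 100,000 threshold described in this section can be combined into one check. This is a minimal sketch, not an official tool: count_tombstones and check_tombstone_threshold are hypothetical helper names, and the sketch assumes the LDIF output of the ldapsearch command is piped in on standard input.

```shell
# Threshold stated in this article.
TOMBSTONE_THRESHOLD=100000

# Hypothetical helper: count LDIF entries on stdin by counting lines that
# start with "dn". Folded LDIF continuation lines start with a space, so
# they never match '^dn'.
count_tombstones() {
    grep -c '^dn' || true
}

# Hypothetical helper: compare a count against the threshold.
check_tombstone_threshold() {
    count="$1"
    if [ "$count" -gt "$TOMBSTONE_THRESHOLD" ]; then
        echo "REMOVE: $count tombstones exceed $TOMBSTONE_THRESHOLD"
    else
        echo "OK: $count tombstones; symptoms are likely unrelated"
    fi
}

# Example use (PSC_FQDN, the password, and the dc values are placeholders):
# count=$(/opt/likewise/bin/ldapsearch -H ldap://PSC_FQDN -x \
#     -D "cn=administrator,cn=users,dc=vsphere,dc=local" -w 'password' \
#     -b "cn=Deleted Objects,dc=vsphere,dc=local" -s sub "" \
#     -e 1.2.840.113556.1.4.417 dn | count_tombstones)
# check_tombstone_threshold "$count"
```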


Environment

VMware vCenter Server 6.0.x
VMware vCenter Server 7.0.2
VMware vCenter Server 6.5.x
VMware vCenter Server 7.0.1
VMware vCenter Server 7.0.3
VMware vCenter Server 7.0.x

Resolution

This is a known issue affecting vCenter Server 6.x.

vCenter Server 6.0 Update 3 and 6.5 Patch 1 include new functionality to remove tombstone entries to resolve this issue. Tombstone entries cannot be removed prior to these releases. These patches include a code-level change that gives the root account permission to remove tombstones. These permissions cannot be added on prior releases without the code changes.

Note: Tombstone removal does not replicate between Platform Services Controllers in an SSO domain and must be performed on all Platform Services Controllers.

Caution: Depending on the number of tombstone entries, removal can be very time-consuming. In some cases, removal can take over 24 hours. The Platform Services Controller remains functional during tombstone removal.

To manually remove tombstones in vCenter Server 6.0 Update 3 and later

  1. Connect to the Platform Services Controller using SSH and root credentials.
  2. Run this command to enable access to the Bash shell:
shell.set --enabled true
  3. Type shell and press Enter.
  4. Run this command to create the all.txt file:
/opt/likewise/bin/ldapsearch -H ldap://PSC_FQDN -x -D "cn=administrator,cn=users,dc=vsphere,dc=local" -w 'password' -b "cn=Deleted Objects,dc=vsphere,dc=local" -s sub "" -e 1.2.840.113556.1.4.417 dn | perl -p00e 's/\r?\n //g' | grep '^dn' > all.txt
  5. Run this command to remove all tombstone entries:
RULER="========"; while read line;do echo "$RULER$line$RULER"; time -p echo -e "$line\nchangetype: delete\n" | /opt/likewise/bin/ldapmodify -c -H ldap://PSC_FQDN -x -D "cn=administrator,cn=users,dc=vsphere,dc=local" -w 'password' > /dev/null 2>/dev/null; done < all.txt | tee result.txt

Note:
  • If the default SSO domain name has been changed, then the dc value will need to be changed to match your environment.
  • If all nodes in the environment are Windows, a PSC appliance will need to be deployed to complete these steps. The appliance does not have to be connected to the current SSO domain to run the commands above against a remote node.
  6. To verify the tombstone entries have been removed, run this command:
/opt/likewise/bin/ldapsearch -H ldap://PSC_FQDN -x -D "cn=administrator,cn=users,dc=vsphere,dc=local" -w 'password' -b "cn=Deleted Objects,dc=vsphere,dc=local" -s sub "" -e 1.2.840.113556.1.4.417 dn | perl -p00e 's/\r?\n //g' | grep '^dn' | wc -l
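
Because removal can run for many hours, it may help to check progress from a second SSH session. The sketch below assumes that the removal loop writes one line beginning with the ======== ruler to result.txt for each entry it processes, and that all.txt holds one dn line per tombstone. The function name removal_progress is hypothetical.

```shell
# Hypothetical helper: report how many entries from all.txt the removal
# loop has processed so far, based on the ruler lines in result.txt.
# Usage: removal_progress all.txt result.txt
removal_progress() {
    all_file="$1"
    result_file="$2"
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    total=$(grep -c '^dn' "$all_file" || true)
    processed=$(grep -c '^========dn' "$result_file" || true)
    echo "$processed/$total"
}
```

The output is the number of processed entries over the total. A processed entry is not necessarily a deleted one, since the loop discards the ldapmodify output, so the verification command in the last step remains the authoritative check.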

Resolve vmdir white space issues during upgrade or migration by increasing the partition size

Removing tombstones with the steps above does not clean up white space in the vmdir database. This can cause vCenter Server Appliance upgrades and migrations to fail. There are two options to proceed with an upgrade if /storage/db runs out of space due to tombstones.

  1. Increase the partition space of /storage/db to 20 GB on the destination vCenter Server Appliance before stage 2 of the upgrade.  For more information, see Increasing the disk space for the vCenter Server Appliance in vSphere 6.5, 6.7, 7.0 and 8.0
  2. Use the mdb_copy command to remove the white space in vmdir:
The mdb_copy command only reduces white space after tombstone removal. If tombstones cannot be removed because the Platform Services Controller is not on vSphere 6.0 Update 3 or 6.5 Patch 1 or later, the only option is to increase the partition size of /storage/db as described in step 1.
 
Steps for Linux
  1. Download the attached 52387_Appliance_mdb_copy.gz.
Note: For 7.0, there is a new mdb_copy file.
  2. Stop the vmdird service with this command:
service-control --stop vmdird
  3. Take a backup of the original /storage/db/vmware-vmdir/data.mdb file.
  4. Compact the data.mdb database with this command, replacing <destination_directory> with the directory that will receive the compacted copy:
./mdb_copy -c /storage/db/vmware-vmdir/ <destination_directory>
  5. Replace the original data.mdb file with the compacted one.
  6. Start the vmdird service with this command:
service-control --start vmdird

Note: To stop and start services on the vCenter Server Appliance, see Stopping, starting, or restarting VMware vCenter Server Appliance 6.x services
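
Before replacing the original data.mdb in the Linux steps, a quick sanity check on the compacted copy can catch a failed or truncated compaction. This is a minimal sketch under stated assumptions: a compacted copy is expected to be non-empty and no larger than the original, and the helper names file_size_bytes and compaction_looks_sane are hypothetical.

```shell
# Hypothetical helper: print the size of a file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Hypothetical helper: succeed only if the compacted copy exists, is
# non-empty, and is no larger than the original database file.
# Usage: compaction_looks_sane <original data.mdb> <compacted data.mdb>
compaction_looks_sane() {
    original="$1"
    compacted="$2"
    if [ ! -s "$compacted" ]; then
        echo "compacted file is missing or empty"
        return 1
    fi
    orig_size=$(file_size_bytes "$original")
    new_size=$(file_size_bytes "$compacted")
    if [ "$new_size" -gt "$orig_size" ]; then
        echo "compacted file is larger than the original"
        return 1
    fi
    echo "reclaimed $(( orig_size - new_size )) bytes"
}
```

A check like this is only a guard against obvious mistakes. It does not validate the database contents, so the backup taken before compaction is still required.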
 
Steps for Windows
  1. Download the attached 52387_Windows_mdb_copy.exe.gz.
  2. Stop the VMWareDirectoryService service. For more information, see How to stop, start, or restart vCenter Server 6.x services
  3. Take a backup of the original C:\ProgramData\VMware\vCenterServer\data\vmdird\data.mdb file.
  4. Run the tool to compact the database, replacing <destination_directory> with the directory that will receive the compacted copy:
C:\>mdb_copy.exe -c C:\ProgramData\VMware\vCenterServer\data\vmdird <destination_directory>
  5. Replace the original data.mdb file with the compacted one.
  6. From the Windows UI, determine the size of the compacted data.mdb file, add 500 MB to that number, and record the result for later.
  7. Open the Run dialog and type regedit.
  8. Navigate to:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\VMWareDirectoryService\Parameters
  9. Right-click in an open space and select New DWORD (32-bit) Value.
  10. Name the DWORD MaximumDbSizeMb.
  11. Right-click on the DWORD and select Modify...
  12. In the Value data field, enter the number determined in step 6.
For example, if the compacted data.mdb size is 1500 MB, then enter 2000 for this step.
  13. Start the VMWareDirectoryService service.
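
The MaximumDbSizeMb value in the Windows steps is simple arithmetic: the compacted database size in MB plus 500. The sketch below computes it from a size in bytes and can be run in any POSIX shell. Rounding the size up to the next whole MB is an assumption of this sketch, not something the article specifies, and max_db_size_mb is a hypothetical name.

```shell
# Hypothetical helper: compute the MaximumDbSizeMb value from the size of
# the compacted data.mdb in bytes.
# Assumption: partial megabytes are rounded up before adding the 500 MB margin.
max_db_size_mb() {
    size_bytes="$1"
    size_mb=$(( (size_bytes + 1048575) / 1048576 ))
    echo $(( size_mb + 500 ))
}

# 1572864000 bytes is exactly 1500 MB, matching the example in the steps.
# max_db_size_mb 1572864000
```

For the 1500 MB example above, the function prints 2000.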



Additional Information

Tombstone FAQ

What are tombstones?
The vmdir process does not immediately delete entries that have been removed. Instead, they are marked for deletion. These entries marked for deletion are known as tombstones.

Where are tombstones located?
Tombstones are found in the vmdir database that is used by Single Sign-on.  Vmdir databases are found on Platform Services Controllers.  This includes external Platform Services Controllers and vCenter Server with an embedded Platform Services Controller.

What generates tombstones?
Tombstone entries in vmdir will normally be caused by repointing vCenter between PSC nodes or unregistering a vCenter or PSC with cmsso-util.  

What causes a large accumulation of tombstones?
The vSAN health service had a bug that caused frequent registration requests with Single Sign-On. Each registration request generates four tombstones in the vmdir database. This bug was introduced in vCenter Server 6.0 U2 and was resolved in vCenter Server 6.0 U3.

Why didn't the vmdir database size (/storage/db/vmware-vmdir/data.mdb) decrease after removing tombstones? 
White space is never removed from the vmdir database.  If shrinking the vmdir database size is required, refer to the steps in the resolution section.

Do tombstones automatically delete over time?
Yes.  Automatic tombstone cleanup was added in vCenter Server 6.0 P06 and vCenter Server 6.5 U1.  Tombstone cleanup happens once a day and will remove any tombstones older than 45 days.  This tombstone removal will not clean up white space in the vmdir database.    

Is it possible to remove tombstones before vSphere 6.0 Update 3 and 6.5 Patch 1? 
No. There is no workaround that allows tombstone removal. The only option for tombstone removal is to upgrade to vSphere 6.0 Update 3 or 6.5 Patch 1. A code fix in these releases alters root-level permissions, which allows tombstone removal. Performing the tombstone removal steps on any instance below these releases will result in no tombstones being removed.

Attachments

52387_Windows_mdb_copy.exe.gz
52387_Appliance_mdb_copy.gz