Startup Failure Due to .lk File Lock Conflicts When Using Both cache.xml and Cluster Configuration in VMware GemFire

Article ID: 405627

Products

VMware Tanzu Data Suite

Issue/Introduction

Member startup sometimes fails after maintenance jobs such as offline compaction. A typical error seen in the logs is:

 
Cache initialization for GemFireCache[...] failed because:
org.apache.geode.cache.DiskAccessException: org.apache.geode.cache.DiskAccessException: For DiskStore: 0_xxx_DISK_STORE: Could not lock xx/DRLK_IF0_xxx_DISK_STORE.lk. Other JVMs might have created diskstore with same name using the same directory., caused by java.io.IOException: The file /xx_DISK_STORE.lk is being used by another process., caused by org.apache.geode.cache.DiskAccessException: 
	.........
Caused by: org.apache.geode.cache.DiskAccessException: For DiskStore: 0_xxx_DISK_STORE: Could not lock /xxx_DISK_STORE.lk. Other JVMs might have created diskstore with same name using the same directory., caused by java.io.IOException: 
..........

Caused by: java.io.IOException: The file xxx_DISK_STORE.lk is being used by another process.
	at gemfire//org.apache.geode.internal.cache.DiskStoreImpl.createLockFile(DiskStoreImpl.java:1873)
	... 17 more

This is often observed in clusters configured with both cache.xml and the cluster configuration service. Members fail to start because of .lk file conflicts related to disk store definitions.

Cause

This issue is generally caused by:

  1. Disk Store Name Conflicts
    Multiple members using the same disk store name (e.g., xxx_DISK_STORE) in cache.xml or cluster configuration can cause lock file collisions during startup.

  2. Residual .lk Files
    Forceful shutdowns (e.g., kill -9) may leave behind undeleted .lk files, which GemFire treats as in use by another process.

  3. Overlapping Configurations
    Defining the same disk store or region in both cache.xml and cluster configuration can result in duplicate creation attempts and startup failures.
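
As a rough diagnostic for the first cause, the sketch below scans a directory tree for lock files and prints any disk store name that shows up in more than one directory. It assumes the lock file is named DRLK_IF<disk store name>.lk, as the log above suggests; the function names and the root path are hypothetical, and the scan only sees lock files that exist on disk when it runs.

```shell
#!/bin/sh
# Sketch only: assumes lock files are named DRLK_IF<disk store name>.lk,
# which is the pattern visible in the error log above.

# Print the disk store name encoded in a lock file path.
disk_store_name() {
    base=$(basename "$1" .lk)          # e.g. DRLK_IF0_xxx_DISK_STORE
    printf '%s\n' "${base#DRLK_IF}"    # e.g. 0_xxx_DISK_STORE
}

# Print every disk store name that has lock files in two or more
# directories under the given root (hypothetical helper).
duplicate_store_names() {
    find "$1" -type f -name 'DRLK_IF*.lk' 2>/dev/null |
    while IFS= read -r lock; do
        disk_store_name "$lock"
    done | sort | uniq -d
}

# Usage (path is a placeholder):
#   duplicate_store_names /path/to/diskstores
```

An empty result does not prove the names are unique, since members that are stopped cleanly leave no lock file behind; the cache.xml files and the cluster configuration remain the authoritative place to check.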

Resolution

To resolve and prevent DiskAccessException:

  1. Ensure each member defines a uniquely named disk store (e.g., member_xx_DISK_STORE), even if the directories differ.

  2. Use either cache.xml or cluster configuration to define disk stores and regions, not both. Do not define the same disk store or region in both places.

  3. After confirming that all members are fully shut down (no Java processes running), remove any leftover .lk files:

    find /path/to/diskstores -name "*.lk" -exec rm -f {} \;
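
The cleanup step above can be wrapped in a small guard so that the files are never removed while a Java process is still running. This is a minimal sketch, not an official GemFire tool: the function names are hypothetical, it assumes pgrep is installed, and it treats any process named exactly "java" as a reason to stop, which matches the "no Java processes running" condition but may be stricter than needed on hosts running unrelated Java applications.

```shell
#!/bin/sh
# Sketch only: remove leftover .lk files, but refuse while Java is running.

# Succeeds if a process named exactly "java" is running.
# If pgrep is unavailable the state cannot be verified, so report "running".
java_running() {
    command -v pgrep >/dev/null 2>&1 || return 0
    pgrep -x java >/dev/null 2>&1
}

# Print and remove every *.lk file under the given root (hypothetical helper).
remove_stale_locks() {
    root=$1
    if java_running; then
        echo "Java processes are still running; leaving $root untouched" >&2
        return 1
    fi
    find "$root" -type f -name '*.lk' -print -exec rm -f {} \;
}

# Usage (path is a placeholder):
#   remove_stale_locks /path/to/diskstores
```

Printing each path before removal leaves a record of what was deleted, which helps when reviewing the maintenance job afterwards.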

Additional Information