Restarting Cache Server Throws a ConcurrentModificationException at Remote Member

Article ID: 294217

Products

VMware Tanzu Gemfire

Issue/Introduction

Symptoms:

This article explains why a java.util.ConcurrentModificationException "at Remote Member" can be thrown when restarting a GemFire cache server, and which workarounds can be applied to resolve the issue.


When several cache servers are started at the same time, cache server A throws a ConcurrentModificationException while initializing a region, as shown in the log file excerpt below. Cache server A fails to start while the other cache servers start successfully. When cache server A is restarted, it starts successfully.

[info 2016/08/16 09:17:43.152 JST tid=0x1] Initializing region exampleRegionX
[info 2016/08/16 09:17:43.219 JST tid=0x1] Region exampleRegionX requesting initial image from 192.168.1.10(27972):10344
[info 2016/08/16 09:17:43.260 JST tid=0x1] exampleRegionX failed to get image from 192.168.1.10(27972):10344
[warning 2016/08/16 09:17:43.264 JST tid=0x1] Initialization failed for Region /exampleRegionX
com.gemstone.gemfire.ToDataException: toData failed on DataSerializer with id=0 for class class java.util.HashMap
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeUserObject(InternalDataSerializer.java:1482)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeWellKnownObject(InternalDataSerializer.java:1411)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.basicWriteObject(InternalDataSerializer.java:2203)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.DataSerializer.writeObject(DataSerializer.java:3179)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.util.BlobHelper.serializeTo(BlobHelper.java:65)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.AbstractRegionEntry.fillInValue(AbstractRegionEntry.java:342)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.chunkEntries(InitialImageOperation.java:1959)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.process(InitialImageOperation.java:1741)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:457)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:692)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager$5$1.run(DistributionManager.java:1000)
    at Remote Member '192.168.1.10(27972):10344' in java.lang.Thread.run(Thread.java:745)
    at com.gemstone.gemfire.distributed.internal.ReplyException.handleAsUnexpected(ReplyException.java:75)
    at com.gemstone.gemfire.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:525)
    at com.gemstone.gemfire.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1421)
    at com.gemstone.gemfire.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1209)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:2983)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2880)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2869)
    at com.gemstone.gemfire.cache.RegionFactory.create(RegionFactory.java:841)
    at com.customer.framework.cache.impl.gemfire.CacheServerCacheManager.afterConnect(CacheServerCacheManager.java:147)
    at com.customer.framework.cache.impl.gemfire.GemFireCacheManager.<init>(GemFireCacheManager.java:118)
    at com.customer.framework.cache.impl.gemfire.CacheServerCacheManager.<init>(CacheServerCacheManager.java:95)
    at com.customer.framework.cache.impl.gemfire.GemFireCacheManager.doInit(GemFireCacheManager.java:73)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.customer.framework.cache.CacheManager.init(CacheManager.java:83)
    at com.customer.framework.process.Server.start(Server.java:139)
    at com.customer.framework.process.Server.execute(Server.java:104)
    at com.customer.framework.process.CacheServer.main(CacheServer.java:31)
Caused by: java.util.ConcurrentModificationException
    at Remote Member '192.168.1.10(27972):10344' in java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
    at Remote Member '192.168.1.10(27972):10344' in java.util.HashMap$EntryIterator.next(HashMap.java:1463)
    at Remote Member '192.168.1.10(27972):10344' in java.util.HashMap$EntryIterator.next(HashMap.java:1461)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.DataSerializer.writeHashMap(DataSerializer.java:2603)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer$32.toData(InternalDataSerializer.java:508)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeUserObject(InternalDataSerializer.java:1451)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeWellKnownObject(InternalDataSerializer.java:1411)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.basicWriteObject(InternalDataSerializer.java:2203)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.DataSerializer.writeObject(DataSerializer.java:3179)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.util.BlobHelper.serializeTo(BlobHelper.java:65)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.AbstractRegionEntry.fillInValue(AbstractRegionEntry.java:342)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.chunkEntries(InitialImageOperation.java:1959)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.process(InitialImageOperation.java:1741)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:457)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:692)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager$5$1.run(DistributionManager.java:1000)
    at Remote Member '192.168.1.10(27972):10344' in java.lang.Thread.run(Thread.java:745)
[info 2016/08/16 09:17:43.358 JST tid=0xe] VM is exiting - shutting down distributed system

Environment


Cause

"java.util.ConcurrentModificationException" is a common exception when working with Java collection classes such as HashMap. It can be thrown in multi-threaded as well as single-threaded code, for example when one thread structurally modifies a collection while another thread is traversing it with an iterator and calls iterator.next().
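
The single-threaded case can be reproduced with plain Java, without GemFire. The following is a minimal sketch; the class and method names (CmeDemo and its methods) are made up for illustration and do not appear in the product. It shows a HashMap failing when it is structurally modified during iteration, and the same loop succeeding when it iterates over a snapshot instead.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: CmeDemo is a hypothetical class, not part of GemFire.
public class CmeDemo {

    private static Map<String, Integer> sampleMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Returns true if adding entries while iterating the same map
    // raises a ConcurrentModificationException.
    static boolean failsWhenModifiedDuringIteration() {
        Map<String, Integer> map = sampleMap();
        try {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                // Structural change behind the iterator's back.
                map.put("new-" + e.getKey(), 0);
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    // Iterates over a copy of the map, so the original can be modified
    // safely; returns the final size of the original map.
    static int sizeAfterIteratingOverSnapshot() {
        Map<String, Integer> map = sampleMap();
        for (Map.Entry<String, Integer> e : new HashMap<>(map).entrySet()) {
            map.put("new-" + e.getKey(), 0);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + failsWhenModifiedDuringIteration());
        System.out.println("Size via snapshot: " + sizeAfterIteratingOverSnapshot());
    }
}
```

With two threads, the failure is timing-dependent rather than guaranteed, which is consistent with the server starting successfully on a second attempt.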


In the stack trace above, the failed node was requesting an initial image (GII) from remote member 192.168.1.10(27972):10344. While serving that request, the remote member threw java.util.ConcurrentModificationException from java.util.HashMap$HashIterator.nextNode: it was iterating over the HashMap held in a region entry in order to serialize it, while another thread was changing the same HashMap as part of an add/put/destroy/invalidate operation.

Resolution

The ConcurrentModificationException is expected in the situation described above. To avoid this exception and the related issues when starting cache servers, apply one of the following solutions:


Solution 1

Enable the copy-on-read parameter: 

Using cache.xml:

<cache copy-on-read="true">

Using the GemFire Java API:

Cache c = CacheFactory.getInstance(system);
c.setCopyOnRead(true);

More details are available in the GemFire User's Guide.
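
As a rough illustration of what copy-on-read changes, the sketch below uses a hypothetical in-memory stand-in for a region (TinyRegion is not a GemFire class, and the key and value names are invented). With copy-on-read disabled, get returns a reference to the stored HashMap, so a caller that modifies it is modifying the very object a concurrent serializer may be iterating. With copy-on-read enabled, the caller receives a copy and the stored object is left untouched. The sketch makes a shallow copy for brevity; that is an assumption of this example, not a description of how GemFire copies values.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain-Java simulation, not GemFire code.
public class CopyOnReadSketch {

    // Hypothetical stand-in for a region whose values are HashMaps.
    static final class TinyRegion {
        private final Map<String, HashMap<String, String>> entries = new HashMap<>();
        private final boolean copyOnRead;

        TinyRegion(boolean copyOnRead) {
            this.copyOnRead = copyOnRead;
        }

        void put(String key, HashMap<String, String> value) {
            entries.put(key, value);
        }

        HashMap<String, String> get(String key) {
            HashMap<String, String> stored = entries.get(key);
            if (stored == null || !copyOnRead) {
                return stored; // direct reference to the stored value
            }
            return new HashMap<>(stored); // caller gets its own copy
        }

        int storedSize(String key) {
            return entries.get(key).size();
        }
    }

    // Mutates the value returned by get() and reports how many entries
    // the value stored in the region holds afterwards.
    static int storedSizeAfterCallerMutation(boolean copyOnRead) {
        TinyRegion region = new TinyRegion(copyOnRead);
        HashMap<String, String> value = new HashMap<>();
        value.put("k1", "v1");
        region.put("order-1", value);

        region.get("order-1").put("k2", "v2"); // caller-side modification
        return region.storedSize("order-1");
    }

    public static void main(String[] args) {
        System.out.println("copy-on-read=false: " + storedSizeAfterCallerMutation(false));
        System.out.println("copy-on-read=true:  " + storedSizeAfterCallerMutation(true));
    }
}
```

In this sketch, a change made to a copy only reaches the region when the caller puts the modified value back explicitly.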


Solution 2

Changing the order in which the cache servers are started can also resolve this issue.