This issue applies to all supported versions of both VMware Tanzu GemFire and VMware GemFire.
remove/removeAll fails to delete some entries from a region that uses a custom partition resolver.
There is no key class. The key is generated from a combination of the entry fields. For example, each entry has fields such as “foo”, “bar”, “baz”, and “id”, and the key is a composite of two of those fields. The partition resolver code extracts “foo” from the key and returns it from the getRoutingObject method.
Error messages like the following, which indicate that there may be an issue with the bucket number generation, are found in the logs:
ERROR! The sequence number “x,xxx,xxx,xxx”, generated for the bucket “yy” is incorrect.
Removing an entry that matches the relevant condition fails intermittently.
The problem is caused by a custom PartitionResolver that generates routing objects from instance fields.
The best fix is to remove the fields “id” and “foo” from the custom partition resolver code. When creating the routing object, derive it only from the key.
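A minimal sketch of a resolver with no instance state is shown here. It is illustrative only: KeyedOperation and RoutingResolver are hypothetical stand-ins for GemFire's EntryOperation and PartitionResolver so that the example compiles without GemFire on the classpath, and the key layout "<foo>|<id>" is an assumption, not something taken from the affected application.

```java
import java.util.Objects;

// Hypothetical stand-in for the part of EntryOperation used by a resolver.
interface KeyedOperation<K> {
    K getKey();
}

// Hypothetical stand-in for PartitionResolver. A real resolver would
// implement org.apache.geode.cache.PartitionResolver<K, V> instead.
interface RoutingResolver<K> {
    Object getRoutingObject(KeyedOperation<K> op);
    String getName();
}

// Resolver with no mutable state: the routing object is computed from the
// key on every call and held only in local variables.
final class FooRoutingResolver implements RoutingResolver<String> {
    // Assumed key layout: "<foo>|<id>". This constant is immutable.
    private static final char SEPARATOR = '|';

    @Override
    public Object getRoutingObject(KeyedOperation<String> op) {
        String key = Objects.requireNonNull(op.getKey(), "key");
        int idx = key.indexOf(SEPARATOR);
        // If the separator is missing, fall back to routing on the whole key.
        return idx < 0 ? key : key.substring(0, idx);
    }

    @Override
    public String getName() {
        return "FooRoutingResolver";
    }
}
```

Because nothing is written to the resolver instance, concurrent calls cannot observe each other's data, and the same key always yields the same routing object for put, get, and remove.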
Do not use the value or additional metadata in the PartitionResolver code. Do not set instance fields and then read those fields back to return the routing object, because the set and get operations are not thread-safe. For example, thread-1 may set “id” and “foo”, thread-2 then sets “id” and “foo”, and thread-1 returns the “foo” that was set by thread-2. When this thread collision occurs, the region entry being populated ends up with a null value because it was routed to the wrong bucket.
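The collision described above can be replayed deterministically. The sketch below is not GemFire code: StatefulResolver is a hypothetical class that mirrors the anti-pattern, the key layout "<foo>|<id>" is assumed, and the two threads are simulated by issuing their calls sequentially in the problematic order rather than by starting real threads.

```java
final class StatefulResolverDemo {

    // Anti-pattern: parsed key parts are stored in instance fields, which
    // every caller of the shared resolver instance can overwrite.
    static final class StatefulResolver {
        private String foo;
        private String id;

        // Step 1: set the fields from the key (assumed layout "<foo>|<id>").
        void parse(String key) {
            int idx = key.indexOf('|');
            foo = key.substring(0, idx);
            id = key.substring(idx + 1);
        }

        // Step 2: read a field back to produce the routing object.
        String routingObject() {
            return foo;
        }
    }

    // Replays the interleaving: thread-1 sets, thread-2 sets, thread-1 reads.
    static String replayCollision() {
        StatefulResolver shared = new StatefulResolver();
        shared.parse("alpha|1");       // thread-1 sets id and foo
        shared.parse("beta|2");        // thread-2 overwrites id and foo
        return shared.routingObject(); // thread-1 reads foo written by thread-2
    }
}
```

In this replay, the operation for key "alpha|1" receives the routing object "beta", so it is directed to the bucket that belongs to a different key.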
References: