VMware GemFire data colocation issue due to using unsupported region sequence in a transaction

Article ID: 294334

Updated On:

Products

VMware Tanzu GemFire

Issue/Introduction

A VMware GemFire transaction fails with either a TransactionDataRebalancedException or a TransactionDataNotColocatedException, depending on the VMware GemFire version.


This applies to all versions of VMware GemFire. In upcoming versions of VMware GemFire, TransactionDataRebalancedException will be replaced by TransactionDataNotColocatedException. Here is an example stack trace for TransactionDataRebalancedException:
org.apache.geode.cache.TransactionDataRebalancedException: Transactional data moved, due to rebalancing.
        at org.apache.geode.internal.cache.TXStateProxyImpl.getTransactionException(TXStateProxyImpl.java:254) ~[geode-core-9.5.2.jar:?]
        at org.apache.geode.internal.cache.TXStateProxyImpl.findObject(TXStateProxyImpl.java:543) ~[geode-core-9.5.2.jar:?]
        at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1380) ~[geode-core-9.5.2.jar:?]
        at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1314) ~[geode-core-9.5.2.jar:?]
        at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1299) ~[geode-core-9.5.2.jar:?]
        at org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:408) ~[geode-core-9.5.2.jar:?]


Environment

Product Version: 9.5

Resolution

The data colocation issue can occur when a transaction first operates on a replicated region and then operates on a partitioned region.

Below are some scenarios:

1. If all the regions in a transaction are replicated regions, there is no problem.

2. If a replicated region is followed by a partitioned region in a transaction, the transaction may work sometimes and fail other times, depending on which data node the first operation lands on, on the subsequent operations, and on where the primary bucket is located.

3. If a partitioned region is followed by replicated regions and colocated partitioned regions, the transaction works.

4. If a partitioned region is followed by a non-colocated partitioned region, the transaction can fail. Technically, it behaves the same as case 2: sometimes it works and sometimes it fails, depending on the primary bucket location. This scenario is not supported and should be avoided.
 

In VMware GemFire, only the first and third cases are supported. A transaction that will touch a partitioned region should not operate on a replicated region first.
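
The four scenarios above can be expressed as a small ordering check. The following is a minimal sketch in plain Java, not part of the GemFire API: the class TxRegionOrderCheck, its Op and Verdict types, and the colocation-group label are all hypothetical names introduced for illustration. It classifies cases 2 and 4 as unsupported even though they can succeed at runtime, because their outcome depends on primary bucket placement.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical helper: classifies the region order of one transaction. */
public class TxRegionOrderCheck {
    public enum Kind { REPLICATED, PARTITIONED }
    public enum Verdict { SUPPORTED, UNSUPPORTED }

    /** One transactional operation: the kind of region it touches. */
    public static final class Op {
        final Kind kind;
        final String group; // illustrative colocation-group label; null for replicated regions

        private Op(Kind kind, String group) {
            this.kind = kind;
            this.group = group;
        }

        public static Op replicated() {
            return new Op(Kind.REPLICATED, null);
        }

        public static Op partitioned(String group) {
            return new Op(Kind.PARTITIONED, Objects.requireNonNull(group));
        }
    }

    /** Applies the four scenarios to the operations in the order they are issued. */
    public static Verdict classify(List<Op> ops) {
        boolean replicatedBeforePartitioned = false;
        String firstGroup = null; // group of the first partitioned region touched
        for (Op op : ops) {
            if (op.kind == Kind.REPLICATED) {
                if (firstGroup == null) {
                    replicatedBeforePartitioned = true;
                }
            } else if (replicatedBeforePartitioned) {
                return Verdict.UNSUPPORTED; // case 2: replicated, then partitioned
            } else if (firstGroup == null) {
                firstGroup = op.group;
            } else if (!firstGroup.equals(op.group)) {
                return Verdict.UNSUPPORTED; // case 4: non-colocated partitioned regions
            }
        }
        return Verdict.SUPPORTED; // case 1 (all replicated) or case 3
    }
}
```

For example, a transaction that touches a partitioned region, then a replicated region, then a second partitioned region in the same colocation group is classified as SUPPORTED (case 3). Moving the replicated region to the front of the same sequence yields UNSUPPORTED (case 2).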

Below is a workaround to make case 2 work:

Make sure that the transaction starts on the correct node.
Do not use a query on the partitioned region to start the transaction. Queries do not run under the transactional context in VMware GemFire, so the transaction may not start on the correct node. Instead, use "region.get" on the partitioned region to start the transaction.

Note: Using "region.get" guarantees that the transaction starts on the correct node, provided that the transaction afterwards touches only colocated partitioned regions and replicated regions.
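
The workaround can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation. So that it compiles without the GemFire jar, the Region and TxManager interfaces below are minimal stand-ins that declare only the methods used here; in a real application they would be org.apache.geode.cache.Region and org.apache.geode.cache.CacheTransactionManager, which provide methods with these names. The region roles ("orders" as partitioned, "statusCodes" as replicated), the keys, and the method updateOrder are hypothetical.

```java
/** Sketch: start the transaction with a get on the partitioned region. */
public class PartitionedFirstTx {
    /** Stand-in for the subset of org.apache.geode.cache.Region used here. */
    public interface Region<K, V> {
        V get(K key);
        V put(K key, V value);
    }

    /** Stand-in for the subset of org.apache.geode.cache.CacheTransactionManager used here. */
    public interface TxManager {
        void begin();
        void commit();
        void rollback();
        boolean exists();
    }

    /**
     * Updates one order inside a transaction.
     * Assumption: orders is a partitioned region and statusCodes is a replicated region.
     */
    public static void updateOrder(TxManager tx,
                                   Region<String, String> orders,
                                   Region<String, String> statusCodes,
                                   String orderId) {
        tx.begin();
        try {
            // First operation: region.get on the partitioned region (not a query),
            // so the transaction starts on the node that hosts this key.
            String order = orders.get(orderId);

            // Only afterwards touch the replicated region.
            String status = statusCodes.get("SHIPPED");

            orders.put(orderId, order + "|" + status);
            tx.commit();
        } finally {
            // If commit was not reached or failed, the transaction is still active.
            if (tx.exists()) {
                tx.rollback();
            }
        }
    }
}
```

The design point is the order of operations: the first call after begin() is a get on the partitioned region, and the replicated region is read only after that.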