1. Automate GemFire Rebalance After Restart (High Impact, Low Effort)
- Implementation:
- In gfsh, run rebalance --simulate=true to check whether regions are out of balance. A non-zero value for "Total number of buckets moved during this rebalance" in the output indicates that a rebalance is needed.
- If required, execute the actual rebalance using a command like rebalance --include-region=profilecore.
- After the rebalance, a simulation should show that the total bytes in buckets moved is zero.
- Warning: Do not execute a rebalance unless a majority of the cluster's members are running, as this can cause out-of-memory errors on the servers that are up.
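The simulate, rebalance, verify sequence above could be scripted roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes gfsh is on the PATH, that the simulation summary prints the "Total number of buckets moved during this rebalance" label followed by a number on the same line, and that the locator address is supplied by the caller. The function names (buckets_moved, run_gfsh, rebalance_if_needed) are hypothetical, and the verification step checks the bucket count rather than the bytes moved.

```python
import re
import subprocess

# Label taken from the text above; the exact layout of the gfsh summary is an assumption.
MOVED_LABEL = "Total number of buckets moved during this rebalance"


def buckets_moved(gfsh_output):
    """Sum the 'buckets moved' counts over every summary row in the output."""
    total = 0
    for line in gfsh_output.splitlines():
        if MOVED_LABEL in line:
            # Take the first integer that follows the label on the same line.
            match = re.search(r"\d+", line.split(MOVED_LABEL, 1)[1])
            if match:
                total += int(match.group())
    return total


def run_gfsh(locator, command):
    """Connect to the locator, run one gfsh command, and return its stdout."""
    result = subprocess.run(
        ["gfsh", "-e", f"connect --locator={locator}", "-e", command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def rebalance_if_needed(locator, region, run=run_gfsh):
    """Simulate first, rebalance only if buckets would move, then re-simulate.

    Returns True if a rebalance was executed, False if already balanced.
    """
    simulate = f"rebalance --simulate=true --include-region={region}"
    if buckets_moved(run(locator, simulate)) == 0:
        return False
    run(locator, f"rebalance --include-region={region}")
    if buckets_moved(run(locator, simulate)) != 0:
        raise RuntimeError("simulation still reports bucket moves after rebalance")
    return True
```

A restart hook might call rebalance_if_needed("locator-host[10334]", "profilecore"), where the locator address is a placeholder. The sketch does not check cluster membership, so the caller would still need to confirm that a majority of members are up before invoking it.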
2. Decrease the Connection Pool load-conditioning-interval (High Impact, Low Effort)
The load-conditioning-interval determines how often the client checks with the locator to ensure it is using the least loaded server. The recommendation is to decrease this interval from the default of 5 minutes.
- Implementation: Set the load-conditioning-interval to a range between 30 seconds and 3 minutes.
- Side Effect: Decreasing this value increases the number of requests to the GemFire locators and can sometimes cause locator CPU spikes.
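As an illustration of the trade-off, the sketch below converts a chosen interval into the millisecond value used by the pool setting, rejects values outside the recommended 30 second to 3 minute range, and estimates how much more often clients would consult the locators compared with the 5 minute default. The helper names are hypothetical, the pool name and locator host in the usage note are placeholders, and the load estimate assumes locator requests scale inversely with the interval, which is only a rough approximation.

```python
from xml.sax.saxutils import quoteattr

DEFAULT_MS = 5 * 60 * 1000  # default of 5 minutes, as stated above
MIN_MS = 30 * 1000          # lower bound of the recommended range
MAX_MS = 3 * 60 * 1000      # upper bound of the recommended range


def interval_ms(seconds):
    """Convert seconds to milliseconds, enforcing the recommended range."""
    ms = int(round(seconds * 1000))
    if not MIN_MS <= ms <= MAX_MS:
        raise ValueError(f"{ms} ms is outside the recommended {MIN_MS}-{MAX_MS} ms range")
    return ms


def locator_load_factor(ms):
    """Rough multiplier for locator requests relative to the default interval."""
    return DEFAULT_MS / ms


def pool_xml(name, host, port, seconds):
    """Render a client cache.xml <pool> element with the chosen interval."""
    ms = interval_ms(seconds)
    return (
        f"<pool name={quoteattr(name)} load-conditioning-interval=\"{ms}\">\n"
        f"  <locator host={quoteattr(host)} port=\"{int(port)}\"/>\n"
        "</pool>"
    )
```

For example, pool_xml("clientPool", "locator-host", 10334, 60) sets the interval to 60000 ms, and locator_load_factor(60000) suggests roughly five times the default rate of load-conditioning checks, which is worth weighing against locator CPU headroom.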
3. Upgrade GemFire Client Library (High Impact, Medium Effort)
Upgrade the client libraries to the same version as the server so that clients can use new features and optimizations in GemFire, for example max-connections-per-server and min-connections-per-server.