GemFire WAN GatewaySender Fails to Connect with GatewayConfigurationException: Remote WAN site’s distributed system id matches this site’s distributed system id
Article ID: 408160
Products
VMware Tanzu Greenplum / Gemfire
Issue/Introduction
In a multi-site GemFire cluster, some cache servers may fail to start or WAN GatewaySenders may fail to connect. The following error is observed in the logs:
org.apache.geode.cache.GatewayConfigurationException:
Remote WAN site's distributed system id <id> matches this site's distributed system id <id>
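To find out which members are affected, it can help to scan each member's log for this message. The following is a minimal sketch, not an official tool: the class name DsidConflictScanner and the sample log lines are made up for illustration, and it assumes the message appears on a single log line with numeric ids in place of <id>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the conflicting DSID from log lines.
class DsidConflictScanner {
    // Message text as quoted in this article; accepts a straight or curly apostrophe.
    private static final Pattern CONFLICT = Pattern.compile(
        "Remote WAN site['\u2019]s distributed system id (\\d+) "
        + "matches this site['\u2019]s distributed system id (\\d+)");

    /** Returns the remote DSID reported by each matching log line, in order. */
    static List<Integer> conflictingIds(List<String> logLines) {
        List<Integer> ids = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = CONFLICT.matcher(line);
            if (m.find()) {
                ids.add(Integer.parseInt(m.group(1)));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Made-up sample lines; real log layout may differ.
        List<String> sample = List.of(
            "[info] Starting gateway sender",
            "org.apache.geode.cache.GatewayConfigurationException: "
                + "Remote WAN site's distributed system id 2 "
                + "matches this site's distributed system id 2");
        System.out.println(conflictingIds(sample)); // prints [2]
    }
}
```

Running this against the logs of every server shows which members hit the conflict, which in turn hints at which locators they used for discovery.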
Cause
This error occurs when a site attempts to connect to a remote WAN site that reports the same distributed-system-id (DSID) as the local site. WAN replication requires each site to have a unique DSID.
The most common cause is stale or incorrect DSID metadata in locators, which can persist when:
A cluster was started with the wrong distributed-system-id and later corrected.
Locators cached and propagated the incorrect mapping.
Some servers contact locators that still hold the stale DSID mapping, while others connect through clean locators.
As a result, only part of the cluster may fail to join, depending on which locator is used for discovery.
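Before restarting anything, it is worth confirming that the configured DSIDs really are unique, since a misconfigured distributed-system-id is what seeds the stale mapping in the first place. The sketch below is illustrative only: the class name DsidUniquenessCheck, the site names, and the inline property text are assumptions. It reads the distributed-system-id property from each site's properties and reports any id claimed by more than one site. It checks configuration only and cannot see metadata cached inside a running locator.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: flags sites that share a distributed-system-id.
class DsidUniquenessCheck {
    /** Reads distributed-system-id from properties text; -1 (the default) if unset. */
    static int readDsid(String propertiesText) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Integer.parseInt(p.getProperty("distributed-system-id", "-1").trim());
    }

    /** Groups site names by DSID and keeps only ids used by two or more sites. */
    static Map<Integer, List<String>> duplicates(Map<String, Integer> dsidBySite) {
        Map<Integer, List<String>> byId = new TreeMap<>();
        dsidBySite.forEach((site, id) ->
            byId.computeIfAbsent(id, k -> new ArrayList<>()).add(site));
        byId.values().removeIf(sites -> sites.size() < 2);
        return byId;
    }

    public static void main(String[] args) {
        // Made-up property text standing in for each site's gemfire.properties.
        Map<String, Integer> sites = new LinkedHashMap<>();
        sites.put("siteA", readDsid("distributed-system-id=1\nremote-locators=hostB[10334]\n"));
        sites.put("siteB", readDsid("distributed-system-id=1\nremote-locators=hostA[10334]\n"));
        System.out.println(duplicates(sites)); // prints {1=[siteA, siteB]}
    }
}
```

An empty result means the configuration is consistent, in which case the conflict most likely comes from stale locator metadata as described above.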
Resolution
To resolve the issue, clear the stale DSID mapping from the cluster locators. The following steps may be required, in this order:
1. Stop WAN gateway senders/receivers on both sites.
2. Restart all locators in both the primary and secondary sites to flush incorrect DSID mappings.
3. Restart affected cache servers in the secondary site.
4. Once all members are running with the correct distributed-system-id, restart WAN gateway senders/receivers.