Although the gemfire-locator job on the locator VMs will appear to be running consistently and the gemfire-server jobs will appear to be switching between starting, running, and failing, you don't need to touch the GemFire servers.
To get around this problem, we need to monit stop all the gemfire-locator jobs on the locator VMs and then start them one by one.
Note: This does not mean you can use monit restart. At some point, all of the gemfire-locator jobs must be stopped at the same time.
Follow the instructions below:
1. Stop the gemfire-locator job on each locator VM. You can reference each locator by its zero-based index, for example locator/0, locator/1, and so on.
bosh -e ENV -d DEPLOYMENT ssh locator/0
sudo su
monit stop gemfire-locator
2. Verify that the gemfire-locator jobs on all locator VMs are actually stopped by running:
bosh -e ENV -d DEPLOYMENT instances --ps
All the gemfire-locator jobs should be in the stopped state.
3. Now start the gemfire-locator job on each locator VM, one at a time.
bosh -e ENV -d DEPLOYMENT ssh locator/0
sudo su
monit start gemfire-locator
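The stop and start steps above can be sketched as a pair of shell functions, which may help when there are several locator VMs. This is only a sketch under assumptions: the variable names BOSH_ENV, BOSH_DEPLOYMENT, and LOCATOR_COUNT and the function names are hypothetical, the monit binary is assumed to be at /var/vcap/bosh/bin/monit on the VM, and the bosh ssh -c option is used in place of an interactive session.

```shell
# Sketch only: BOSH_ENV, BOSH_DEPLOYMENT and LOCATOR_COUNT are hypothetical
# variables that the caller must set before using these functions.
# Assumption: monit is installed at this path on the locator VMs.
MONIT=/var/vcap/bosh/bin/monit

# Run one monit action (stop or start) for gemfire-locator on locator/<index>.
locator_monit() {
  bosh -e "$BOSH_ENV" -d "$BOSH_DEPLOYMENT" ssh "locator/$2" \
    -c "sudo $MONIT $1 gemfire-locator"
}

# Stop every locator first, so that at some point none of them is running.
stop_all_locators() {
  i=0
  while [ "$i" -lt "$LOCATOR_COUNT" ]; do
    locator_monit stop "$i" || return 1
    i=$((i + 1))
  done
}

# Start the locators one by one, in index order.
start_all_locators() {
  i=0
  while [ "$i" -lt "$LOCATOR_COUNT" ]; do
    locator_monit start "$i" || return 1
    i=$((i + 1))
  done
}
```

A possible use is to run stop_all_locators, confirm with bosh -e ENV -d DEPLOYMENT instances --ps that every gemfire-locator job is stopped, and only then run start_all_locators. The verification is deliberately left as a manual step between the two calls.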
After the above commands are executed, the gemfire-server jobs should stop flapping and the servers should be in the running state. Run the following command for a minute to make sure the gemfire-server jobs are not switching states:
watch bosh -e ENV -d DEPLOYMENT instances --ps
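As an alternative to watching the output by eye, the check can be roughly approximated by sampling the instances --ps output a few times and comparing the gemfire-server lines. The sketch below makes assumptions: BOSH_ENV and BOSH_DEPLOYMENT are hypothetical variables set by the caller, the function names are illustrative, and the process lines are assumed to contain the text gemfire-server. It only detects a change between samples, so a state change that happens and reverts between two samples would be missed.

```shell
# Sketch only: BOSH_ENV and BOSH_DEPLOYMENT are hypothetical variables
# that the caller must set before using these functions.

# Print the gemfire-server lines from one snapshot of `instances --ps`.
server_snapshot() {
  bosh -e "$BOSH_ENV" -d "$BOSH_DEPLOYMENT" instances --ps | grep gemfire-server
}

# Take <samples> snapshots, <interval> seconds apart (defaults: 6 and 10).
# Return 1 as soon as a snapshot differs from the first one.
servers_stable() {
  samples="${1:-6}"
  interval="${2:-10}"
  first="$(server_snapshot)"
  n=1
  while [ "$n" -lt "$samples" ]; do
    sleep "$interval"
    if [ "$(server_snapshot)" != "$first" ]; then
      echo "gemfire-server state changed between samples" >&2
      return 1
    fi
    n=$((n + 1))
  done
  echo "gemfire-server lines unchanged across $samples samples"
}
```

With the defaults, servers_stable samples for roughly a minute. An unchanged result only shows that the states are not switching; it is still worth confirming in the output that each gemfire-server job is actually in the running state.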