This issue may be related to corrupted Cassandra commit logs. Cassandra fails to restart successfully and the watchdog service on the ISVM VMs after failing to restart Cassandra for 16 times, reboots the ISVM VMs.
To work around this issue:
- Stop the watchdog service on the ISVM VM by running this command:
service ism-watchdog stop
- Inspect the Cassandra logs /opt/vmware/cassandra/apache-cassandra-2.2.4/logs/debug.log and identify the offending commit log file name which is corrupted.
- If any corrupted commit logs are noted in Step 2, remove them from the /opt/vmware/ism/logs folder using the rm command.
- Restart Cassandra by running this command:
service cassandraserver start
- Repeat the process until all corrupted commit logs are deleted. There is no automation of remediating a commit log corruption failure.
Note: If the service does not start successfully, look for more commit log failures and remove the offending commit logs.