Several Micro Services do not get to healthy state

Article ID: 75928

Updated On:

Products

Mainframe Operational Intelligence

Issue/Introduction

Several Micro Services do not get to healthy state

When trying to reconnect the Message Server on the MSC demo z/OS system MVSXXXX, the receiving MOI system USERID-MOI was not accessible, and vSphere displayed a message reporting a full disk volume. The colleagues who run this VMware system fixed the disk issue, but many microservices now remain in unhealthy status.

Environment

Mainframe Operational Intelligence 2.0

Resolution

Cassandra commit log corruption is the cause of the unhealthy state of the USERID-MOI (###.###.##.##) appliance. Cassandra cannot start, so every microservice that depends on Cassandra shows a status of unhealthy until Cassandra is up and healthy.

Using the Cassandra node logs, we identified the corrupt commit log files. These files need to be deleted, and then Cassandra needs to be brought up. Startup may fail again because of further commit log corruption. In that case, remove the additional corrupt commit log files and retry bringing Cassandra up. We do not have a root cause for the commit log corruption.

Please note that none of the Cassandra nodetool commands can be used until Cassandra is up and healthy.

cd
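As a rough illustration of the log review step, the sketch below scans Cassandra log text and lists the commit log file names that appear on error-looking lines. It is not the procedure used on the appliance. The helper name `find_suspect_commit_logs`, the keyword list, and the sample log lines are all assumptions made for this example, and the exact error wording in your logs may differ. The script only reports names; confirm each file in the log output before deleting anything.

```python
import re

# Commit log segments are conventionally named CommitLog-<version>-<id>.log.
COMMIT_LOG_NAME = re.compile(r"CommitLog-\d+-\d+\.log")

# Keywords treated as signs of a replay failure.
# This list is an assumption; adjust it to the wording in your own logs.
ERROR_HINTS = ("CommitLogReplayException", "checksum", "corrupt")


def find_suspect_commit_logs(log_text):
    """Return commit log names found on error-looking lines, in first-seen order."""
    suspects = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if not any(hint.lower() in lowered for hint in ERROR_HINTS):
            continue  # Skip lines that do not look like replay errors.
        for name in COMMIT_LOG_NAME.findall(line):
            if name not in suspects:
                suspects.append(name)
    return suspects


# Hypothetical sample lines, not taken from the affected system.
SAMPLE_LOG = """\
INFO  [main] CommitLog.java - Replaying /data/commitlog/CommitLog-6-1001.log
ERROR [main] CommitLogReplayException: Could not read commit log descriptor in file /data/commitlog/CommitLog-6-1002.log
ERROR [main] Mutation checksum failure in CommitLog-6-1003.log
ERROR [main] CommitLogReplayException: Could not read commit log descriptor in file /data/commitlog/CommitLog-6-1002.log
"""

if __name__ == "__main__":
    for name in find_suspect_commit_logs(SAMPLE_LOG):
        print(name)
```

Because Cassandra may stop at the first corrupt segment it meets, a scan like this may need to be repeated after each failed start, which matches the remove-and-retry cycle described above.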

Additional Information

There were four corrupted Cassandra commit logs. After I removed all four files, all containers show as healthy, and Cassandra has been healthy for the last 30+ minutes.