Cyclic "Unable To Contact" messages on CAPC for the Data Aggregator

Article ID: 103520

Updated On:

Products

CA Infrastructure Management, CA Performance Management - Usage and Administration, CA Performance Management - Data Polling

Issue/Introduction

We patched the RHEL OS on our CAPM stack today and found that the stack remained in "System Health::failed" for an extended period. During this time, the DA showed as "Unable to Connect" and the Collectors link was missing from the Administration menu. When the DA did show as available, the collectors were red and multiple syncs failed. After restarting the whole stack, we decided to bring each component up individually, but the stack still did not stabilize in a timely manner. We are creating this ticket to have the CAPM stack restart process investigated, to determine whether this behavior was out of the norm, and to find out whether the restart can be shortened to minimize downtime.

Cause

One node left the cluster during startup, which slowed the startup process considerably.

Environment

CAPM 3.5

Resolution

Restarting the services fixed the problem, although allowing more time for the startup to finish would likely have resolved it as well.
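
Because allowing more time for startup may be enough, one option is to poll a component until it responds before deciding to restart anything. The following is a minimal Python sketch of that idea, not a CA-provided tool: the function names `port_is_open` and `wait_until_ready` are hypothetical, and it assumes that a successful TCP connection to the component's listening port is an acceptable sign that the service has come up.

```python
import socket
import time
from typing import Callable


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def wait_until_ready(check: Callable[[], bool],
                     max_wait: float,
                     interval: float = 10.0) -> bool:
    """Call check() repeatedly until it returns True or max_wait seconds pass.

    Returns True as soon as check() succeeds, False if the deadline is reached.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

For example, `wait_until_ready(lambda: port_is_open(da_host, da_port), max_wait=1800)` would wait up to 30 minutes, where `da_host` and `da_port` are placeholders for the Data Aggregator host and port in your environment. An open port only shows that the process is listening; it does not confirm that synchronization with the collectors has completed.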