DX AIOps - Unable to log in to OI console, all OpenShift nodes appear in NotReady status


Article ID: 206000


Products

DX Operational Intelligence
DX Application Performance Management
CA App Experience Analytics
CA Application Performance Management (APM / Wily / Introscope)

Issue/Introduction

Symptoms

1) All OpenShift nodes appear in NotReady status:

oc get nodes

2) Memory and disk usage on the nodes is normal, as verified with:

df -h
free -g

3) Restarting the OpenShift node service returns a timeout error:

systemctl restart origin-node.service

Result:  Job for origin-node.service failed because a timeout was exceeded

4) Restarting the OpenShift master API and controllers returns an error:

master-restart api
master-restart controllers

Result: Job for origin-master-api.service failed because a configured resource limit was exceeded.
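
On a larger cluster it can help to list only the affected nodes. The following is a minimal sketch, assuming a POSIX shell with awk and the usual tabular output of `oc get nodes` (header row first, node name in column 1, STATUS in column 2). The function name `not_ready_nodes` is illustrative and is not part of the `oc` CLI.

```shell
# Print the names of nodes whose STATUS column does not start with "Ready".
# Reads the tabular output of `oc get nodes` on stdin; the first line is
# assumed to be the header row and is skipped.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Usage (requires a logged-in oc client):
#   oc get nodes | not_ready_nodes
```

An empty result means every node reports a status beginning with Ready, including nodes shown as Ready,SchedulingDisabled.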

 

Cause

Some certificates are no longer valid, causing TLS connectivity failures.

Troubleshooting details:

1) Some pods in the OpenShift projects are in CrashLoopBackOff status, in this case "master-etcd":

oc project kube-system
oc get pods

oc logs <master-etcd-pod>

result => Error from server: Get https://<masternode>:10250/containerLogs/kube-system/master-etcd-<masternode>/etcd: remote error: tls: internal error

oc get csr

result => some certificate signing requests (CSRs) are in Pending status
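
The pending requests can be picked out of a long `oc get csr` listing with a small filter. This is a sketch only, assuming a POSIX shell with awk and that the CONDITION value is the last column of the table, as in the default output. The function name `pending_csrs` is illustrative.

```shell
# Print the names of CSRs whose last column (CONDITION) is exactly "Pending".
# Reads the tabular output of `oc get csr` on stdin; the header row is skipped.
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage (requires a logged-in oc client):
#   oc get csr | pending_csrs
```

Requests that were already handled show a condition such as Approved,Issued and are not printed.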

 

Environment

DX OPERATIONAL INTELLIGENCE - 1.3.2, 20.x
DX APPLICATION PERFORMANCE MANAGEMENT 20.x

OpenShift 3.x Platform

Resolution

1. Log in as the system administrator:

oc login -u system:admin

2. Manually approve all pending certificate signing requests:

oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve

3. Verify that nothing is pending

oc get csr

4. Restart the etcd service:

master-restart etcd

5. Verify OpenShift cluster connectivity. All nodes should now be in Ready status:

oc get nodes
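
The approval in step 2 can also be written as a loop, which may be easier to rerun safely. The following is a minimal sketch, assuming a POSIX shell and a logged-in `oc` client. It reuses the go-template from step 2; the function name `approve_pending_csrs` is illustrative and is not part of the `oc` CLI. Unlike a plain xargs pipeline, which in some xargs implementations still invokes the command once on empty input, the loop runs nothing when no request is pending.

```shell
# Approve every CSR that has no status yet (that is, still pending).
# Uses the same go-template as step 2 to list the names, one per line.
approve_pending_csrs() {
  oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' |
    while read -r name; do
      if [ -n "$name" ]; then
        # Redirect stdin so the approve call cannot consume the name list.
        oc adm certificate approve "$name" < /dev/null
      fi
    done
}

# Usage:
#   approve_pending_csrs
#   oc get csr
```

Running `oc get csr` afterwards, as in step 3, confirms that nothing is left pending.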

 

Additional Information

DX OI - Troubleshooting, Common Issues and Best Practices
https://knowledge.broadcom.com/external/article/190815/dx-oi-troubleshooting-common-issues-and.html
