AIOps - Unable to log in, "502 Bad Gateway" error after adding a second NIC

Article ID: 219304

Products

DX Operational Intelligence
DX Application Performance Management
CA App Experience Analytics

Issue/Introduction

We added a second NIC to the k8s master server:

- one NIC for k8s communication
- one NIC for management

After adding the second NIC, we can no longer open the DX Admin login page; we get a "502 Bad Gateway" error message.

We tried restarting DXI using <dx-platform>/tools/dx-admin.sh stop/start, but it did not help.

Many axa-services pods are unable to start and remain in "Init" status.

Cause

This is not a Broadcom issue; it is an Nginx, Calico, or flannel configuration issue.

The Calico CNI node pods are using the new interface eth1 instead of eth0 because of Calico's IP autodetection method, which, unless configured otherwise, selects the first valid interface it finds.

The following two lines tell Calico to use a specific interface:

- name: IP_AUTODETECTION_METHOD
  value: "interface=eth0"

For more information refer to https://docs.projectcalico.org/reference/node/configuration

Environment

DX Operational Intelligence 20.x
DX Application Performance Management 20.x
DX AXA 20.x

Resolution

1. Log in to the k8s master.

2. Edit the calico-node DaemonSet:

kubectl -n kube-system edit ds calico-node


3. Add the following property to the env list of the calico-node container:


- name: IP_AUTODETECTION_METHOD
  value: "interface=eth0"
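
As an alternative to the interactive edit, the same variable can probably be set in one step with kubectl set env. The following is a minimal sketch, not part of the product tooling: the function name set_calico_interface and the KUBECTL override variable are illustrative assumptions, and eth0 should be replaced with whichever interface carries k8s traffic in your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (the name is an assumption, not part of DX tooling).
# KUBECTL can be overridden, for example KUBECTL=echo, to preview the
# command without touching the cluster.
KUBECTL="${KUBECTL:-kubectl}"

set_calico_interface() {
  local iface="${1:-eth0}"   # interface Calico should use for autodetection
  # Sets IP_AUTODETECTION_METHOD on the calico-node DaemonSet in kube-system.
  "$KUBECTL" set env daemonset/calico-node -n kube-system \
    "IP_AUTODETECTION_METHOD=interface=${iface}"
}

# Usage (run on the k8s master):
#   set_calico_interface eth0
```

Either way, changing the DaemonSet's pod template should cause the calico-node pods to be recreated with the new setting, so allow a few minutes before running the verification steps.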



4. Verification: 


a) Check that all the pods in kube-system are running:


kubectl get pods -n kube-system

NAME                                                   READY   STATUS    RESTARTS   AGE
calico-kube-controllers-7f4f5bf95d-hh4bn               1/1     Running   0          102m
calico-node-5xzw7                                      1/1     Running   0          95m
calico-node-nrnff                                      1/1     Running   0          96m
calico-node-qv9st                                      1/1     Running   0          96m
calico-node-wj4ch                                      1/1     Running   0          96m
calico-node-wm4bb                                      1/1     Running   0          96m
coredns-f9fd979d6-p92cw                                1/1     Running   0          133m
coredns-f9fd979d6-vs66l                                1/1     Running   0          133m
etcd-munqa001499.bpc.broadcom.net                      1/1     Running   0          133m
kube-apiserver-munqa001499.bpc.broadcom.net            1/1     Running   0          84m
kube-controller-manager-munqa001499.bpc.broadcom.net   1/1     Running   1          133m
kube-proxy-5t2qg                                       1/1     Running   0          101m
kube-proxy-7g762                                       1/1     Running   0          101m
kube-proxy-97qx4                                       1/1     Running   0          101m
kube-proxy-p8kdb                                       1/1     Running   0          101m
kube-proxy-r7d5k                                       1/1     Running   0          133m
kube-scheduler-munqa001499.bpc.broadcom.net            1/1     Running   1          133m
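
On a cluster with many pods, it can be easier to list only the pods that are not yet running. The sketch below assumes the default kubectl get pods column layout shown above (STATUS in the third column); the function name not_running is a hypothetical helper, not an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods` output on stdin, skips the
# header row, and prints only the rows whose STATUS column is not "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running"'
}

# Usage:
#   kubectl get pods -n kube-system | not_running
```

Empty output means every listed pod reports Running. Pods in other states such as Init:0/3 or Completed are printed as-is.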


b) Check that each calico-node-xxxxx pod reports the message below:

kubectl logs calico-node-wm4bb -n kube-system | grep "Using autodet"

2021-07-07 20:09:05.084 [INFO][9] startup/startup.go 788: Using autodetected IPv4 address 10.109.32.164/21 on matching interface eth0
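
To repeat this check across every calico-node pod, the log line can be reduced to just the interface name. The sketch below is an illustration under assumptions: autodetected_interface is a hypothetical helper, it only matches the "on matching interface" form of the message shown above, and the k8s-app=calico-node label used in the usage comment is the one found in the standard Calico manifests and may differ in your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads calico-node log lines on stdin and prints the
# interface name from lines of the form
#   "... Using autodetected IPv4 address <cidr> on matching interface <name>"
autodetected_interface() {
  awk '/Using autodetected IPv4 address .* on matching interface/ { print $NF }'
}

# Usage (the label selector is an assumption; adjust it if your pods differ):
#   for pod in $(kubectl -n kube-system get pods -l k8s-app=calico-node -o name); do
#     printf '%s: ' "$pod"
#     kubectl -n kube-system logs "$pod" | autodetected_interface
#   done
```

Each pod should print eth0, or whichever interface was set in step 3.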

 

c) Check that the ingress endpoint is working:


In this example:

curl 10.109.32.88.nip.io

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.19.1</center>
</body>
</html>


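NOTE: When you use curl on port 80 (i.e., HTTP), it usually returns 404 Not Found, which means your ingress is working fine. Likewise, a plain HTTP request to port 443 returns 400 Bad Request, as shown below, because nginx expects HTTPS on that port.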


curl 10.109.32.88.nip.io:443

<html>
<head><title>400 The plain HTTP request was sent to HTTPS port</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<center>The plain HTTP request was sent to HTTPS port</center>
<hr><center>nginx/1.19.1</center>
</body>
</html>
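
For scripted checks, the HTTP status code is usually enough on its own. The sketch below maps a status code to a short verdict based on the responses described in this article; ingress_verdict is a hypothetical helper, and the mapping is an assumption about how to read those codes rather than an official health check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an HTTP status code returned by the ingress
# endpoint to a short verdict, following the responses shown in this article.
ingress_verdict() {
  case "$1" in
    000)     echo "no response" ;;         # curl reports 000 when it cannot connect
    502)     echo "bad gateway" ;;         # the symptom described in this article
    400|404) echo "ingress responding" ;;  # the nginx answers shown above
    *)       echo "status $1" ;;           # anything else: report the raw code
  esac
}

# Usage:
#   ingress_verdict "$(curl -s -o /dev/null -w '%{http_code}' http://10.109.32.88.nip.io)"
```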

Additional Information

DX AIOPs Troubleshooting, Common Issues and Best Practices
https://knowledge.broadcom.com/external/article/190815
