$ /usr/local/bin/kubectl get all,replicationcontrollers,events --show-kind --all-namespaces --output wide
stdout:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/command-executor-##### 1/1 Running 27 195d
kube-system pod/coredns-##### 1/1 Running 27 195d
kube-system pod/health-reporting-app-##### 1/1 Running 27
kube-system pod/kube-flannel-ds-##### 1/1 Running 27 195d
kube-system pod/kube-node-monitor-##### 1/1 Running 27 195d
kube-system pod/kubelet-rubber-stamp-##### 1/1 Running 27 195d
kube-system pod/metrics-server-##### 1/1 Running 27 195d
kube-system pod/network-health-monitor-##### 1/1 Running 27 195d
kube-system pod/predictable-pod-scheduler-##### 1/1 Running 27 195d
kube-system pod/prelude-network-monitor-cron-########-##### 0/1 Completed 0 4m44s
kube-system pod/prelude-network-monitor-cron-########-##### 0/1 Completed 0 104s
kube-system pod/state-enforcement-cron-########-##### 0/1 Completed 0 5m44s
kube-system pod/state-enforcement-cron-########-##### 0/1 Completed 0 3m44s
kube-system pod/state-enforcement-cron-########-##### 0/1 Completed 0 104s
kube-system pod/update-etc-hosts-##### 1/1 Running 27 195d
prelude pod/vco-app-##########-##### 2/2 Running 1 (5d2h ago) 5d2h
RestExceptions" which are linked to the "SystemDefaultDnsResolver" from the logs in "/services-logs/prelude/vco-app/file-logs/vco-server-app.log":
vco [host='vco-app-##########-#####' thread='WorkflowExecutorPool-Thread-#####' user='[email protected]' org='-' trace='-'] {|__SYSTEM|[email protected]:[PYTHON] # - ##### - ##### top-level:########-###-####-####-###########:token=########-####-####-####-############:anctoken=########-####-####-####-############} ch.dunes.vso.sdk.WrappedJavaMethod - Invocation exception during 'public com.vmware.o11n.plugin.rest.Response com.vmware.o11n.plugin.rest.Request.execute() throws com.vmware.o11n.plugin.rest.RESTException' call on object 'com.vmware.o11n.plugin.rest.Request@#######'
java.lang.reflect.InvocationTargetException: null
....
Caused by: com.vmware.o11n.plugin.rest.RESTException: Cannot execute the request: ; server.example.com
    at com.vmware.o11n.plugin.rest.Request.handleException(Request.java:###) ~[o11nplugin-rest-model-#.#.#.jar:?]
    at com.vmware.o11n.plugin.rest.Request.execute(Request.java:###) ~[o11nplugin-rest-model-#.#.#.jar:?]
    ... ## more
Caused by: java.net.UnknownHostException: server.example.com
    at java.net.InetAddress$CachedAddresses.get(InetAddress.java:###) ~[?:?]
    at java.net.InetAddress.getAllByName#(InetAddress.java:####) ~[?:?]
    at java.net.InetAddress.getAllByName(InetAddress.java:####) ~[?:?]
    at java.net.InetAddress.getAllByName(InetAddress.java:####) ~[?:?]
    at org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:##) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:##) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:###) ~[httpclient-#.#.##.jar:#.#.##]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:##) ~[httpclient-#.#.##.jar:#.#.##]
    at com.vmware.o11n.plugin.rest.RequestExecutor.execute(RequestExecutor.java:###) ~[o11nplugin-rest-model-#.#.#.jar:?]
    at com.vmware.o11n.plugin.rest.CustomContextRequestExecutor.execute(CustomContextRequestExecutor.java:##) ~[o11nplugin-rest-model-#.#.#.jar:?]
    at com.vmware.o11n.plugin.rest.Request.doExecute(Request.java:###) ~[o11nplugin-rest-model-#.#.#.jar:?]
    at com.vmware.o11n.plugin.rest.Request.execute(Request.java:###) ~[o11nplugin-rest-model-#.#.#.jar:?]
    ... ## more
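When scanning a large vco-server-app.log for this pattern, it can help to list which host names failed to resolve and how often. The following is a minimal Python sketch, not a product tool: the function name unresolved_hosts and the abbreviated sample text are assumptions made for illustration, and it relies only on the "java.net.UnknownHostException: <host>" text visible in the trace above.

```python
import re
from collections import Counter

# Host name that follows "java.net.UnknownHostException: " in a Java stack trace.
# The character class stops at whitespace, ':' or ';' so trailing text is excluded.
UNKNOWN_HOST_RE = re.compile(r"java\.net\.UnknownHostException: ([^\s:;]+)")


def unresolved_hosts(log_text):
    """Count how often each host name appears in an UnknownHostException line."""
    return Counter(UNKNOWN_HOST_RE.findall(log_text))


# Abbreviated sample built from the excerpt above (hypothetical test input).
sample = (
    "Caused by: com.vmware.o11n.plugin.rest.RESTException: "
    "Cannot execute the request: ; server.example.com\n"
    "Caused by: java.net.UnknownHostException: server.example.com\n"
)

for host, hits in unresolved_hosts(sample).items():
    print(host, hits)
```

In practice the text would be read from the log file instead of the inline sample; a name that appears repeatedly points at a resolution problem rather than a one-off typo in a workflow.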
/etc/resolv.conf" using the CoreDNS Kubernetes Service (as part of the "kube-system" namespace):
/services-logs/kube-system/kube-dns/console-logs/coredns.log"
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # server.example.com. A: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # server.example.com. A: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # server.example.com. A: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # server.example.com. A: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout
Further analysis showed issues within CoreDNS, which resolves DNS names for both internal and external addresses on behalf of the internal Kubernetes pod network.
On many occasions within the specified timeframes, CoreDNS logged "i/o timeout" errors when communicating with the DNS server (as found in "/etc/resolv.conf"). The affected lookups covered various pods and services, including Contour (the Kubernetes internal proxy) and the REST-enabled server that the user is trying to contact.
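To put numbers on "many occasions", the timeout lines in coredns.log can be tallied per query name, record type, and upstream DNS server. The sketch below is a rough illustration in Python: the regular expression is inferred only from the line shape shown in the excerpt above, the name count_timeouts is hypothetical, and the sample lines are copied from the masked excerpt.

```python
import re
from collections import Counter

# Line shape inferred from the excerpt:
# [ERROR] plugin/errors: <code> <name> <type>: read udp <source>-><upstream>: i/o timeout
TIMEOUT_RE = re.compile(
    r"\[ERROR\] plugin/errors: \S+ (?P<name>\S+) (?P<qtype>[A-Z]+): "
    r"read udp \S+->(?P<upstream>\S+): i/o timeout"
)


def count_timeouts(lines):
    """Tally i/o timeouts per (query name, record type, upstream server)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("name", "qtype", "upstream")] += 1
    return counts


# Masked lines taken from the coredns.log excerpt (addresses remain redacted).
sample = [
    "[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout",
    "[ERROR] plugin/errors: # server.example.com. A: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout",
    "[ERROR] plugin/errors: # contour. AAAA: read udp ##.###.#.###:#####->##.#.#.#:##: i/o timeout",
]

for (name, qtype, upstream), hits in count_timeouts(sample).most_common():
    print(f"{hits:4d}  {name} {qtype} via {upstream}")
```

If every timeout shares the same upstream address, that suggests the problem lies on the path to that DNS server rather than with any single queried name.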
nslookup" against a known address within the same network.
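A comparable name-resolution check can be scripted for repeated runs. The Python sketch below is an approximation under stated assumptions: it uses the standard library call socket.getaddrinfo, which goes through the system resolver configuration rather than querying one specific DNS server the way nslookup can, and the function name check_resolution and its resolver parameter are hypothetical conveniences introduced here.

```python
import socket


def check_resolution(hostnames, resolver=socket.getaddrinfo):
    """Return {hostname: sorted list of addresses, or an error string}.

    The resolver parameter exists so the lookup can be replaced in tests;
    by default the system resolver is used.
    """
    results = {}
    for host in hostnames:
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] holds the resolved IP address.
            results[host] = sorted({info[4][0] for info in infos})
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = f"resolution failed: {exc}"
    return results
```

Calling check_resolution(["server.example.com"]) from a host on the same network, using the placeholder name from the log excerpt, returns either the resolved addresses or the resolver's error text, which can then be compared against the behaviour seen from inside the pod network.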