Even after waiting 5-10 minutes (depending on the size of the workload cluster), the following errors are reported continuously when trying to fetch metrics for pods or nodes:
"error: metrics not available yet"
"error: Metrics not available for pod"
This issue may present the following symptoms:
"unable to fetch node metrics for node <node-name>: no metrics known for node"
"unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary", "dial tcp: lookup <node-name> on <node-IP>:53: no such host"
"x509: cannot validate certificate for <IP> because it doesn't contain any IP SANs"
Both the DNS resolution issue (the "no such host" error) and the certificate validation issue (the "x509" error) can be resolved by editing the metrics-server deployment and adding two flags to the "args" property:
Using kubectl, open the metrics-server deployment for editing with the command below:
kubectl -n kube-system edit deployment metrics-server
Add the following two arguments and save the changes:
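The two arguments themselves are not reproduced in this text. As a hedged sketch only, the fragment below shows the two metrics-server flags commonly used for these symptoms: --kubelet-preferred-address-types=InternalIP makes metrics-server contact kubelets by IP address rather than by hostname (avoiding the DNS lookup), and --kubelet-insecure-tls skips verification of the kubelet serving certificate (avoiding the IP SAN error). The choice of these flags and the container name are assumptions; confirm them against your metrics-server version before applying.

```yaml
# Excerpt of the metrics-server Deployment as shown by `kubectl edit`.
# ASSUMPTION: the two flags below are inferred from the symptoms described
# above; they are not taken from the original text.
spec:
  template:
    spec:
      containers:
      - name: metrics-server   # assumed container name
        args:
        # ...keep any arguments that are already present...
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
```

Because --kubelet-insecure-tls disables certificate verification between metrics-server and the kubelets, it is better suited to test or lab clusters than to production.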
After the changes are applied to the cluster, wait 2-3 minutes for the metrics to be fetched. You can check whether metrics-server is working by retrieving the metrics for nodes or pods with the kubectl top node or kubectl top pods command. On success, the command prints the current CPU and memory usage of each node or pod instead of one of the errors above.