After waiting 5-10 minutes (depending on the size of the workload cluster), the following errors are continuously reported when trying to fetch metrics for pods or nodes:
"error: metrics not available yet"
"error: Metrics not available for pod"
This issue may present the following symptoms:
"unable to fetch node metrics for node <node-name>: no metrics known for node"
"unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary", "dial tcp: lookup <node-name> on <node-IP>:53: no such host"
"x509: cannot validate certificate for <IP> because it doesn't contain any IP SANs"
Both the DNS resolution and the certificate validation issues (Symptoms #1 and #2) can be resolved by editing the metrics-server deployment and adding two flags to the "args" property of its container.
Using kubectl, open the deployment for editing with the command below:
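Before editing the deployment, it can help to confirm which of the symptoms is actually present. The following is a minimal sketch, not part of the original procedure: it assumes the messages listed above appear in the metrics-server logs, and the helper names classify_symptom and summarize_symptoms are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: label one log line by the symptom text it contains.
classify_symptom() {
  case "$1" in
    *"no such host"*)                echo "dns-resolution" ;;
    *"doesn't contain any IP SANs"*) echo "certificate-validation" ;;
    *"no metrics known for node"*)   echo "no-node-metrics" ;;
    *)                               echo "other" ;;
  esac
}

# Hypothetical helper: read log lines from stdin and print a count per symptom.
summarize_symptoms() {
  while IFS= read -r line; do
    classify_symptom "$line"
  done | sort | uniq -c
}
```

With these functions loaded in the current shell, the logs can be piped in with: kubectl -n kube-system logs deployment/metrics-server | summarize_symptoms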
kubectl -n kube-system edit deployment metrics-server
Add the following two arguments and save the changes:
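The two arguments are not reproduced in this text. As an assumption, the metrics-server flags typically used for these two symptoms are --kubelet-preferred-address-types=InternalIP (for the DNS lookup failure) and --kubelet-insecure-tls (for the missing IP SANs). The sketch below adds them non-interactively with a JSON patch instead of the editor. It further assumes that metrics-server is the first container in the pod template and that it already has an "args" list. The function name patch_metrics_server is hypothetical.

```shell
#!/bin/sh
# Assumed flags (not listed in the text above); both are metrics-server options:
#   --kubelet-insecure-tls
#       skip verification of the kubelet serving certificate
#   --kubelet-preferred-address-types=InternalIP
#       connect to kubelets by node internal IP instead of hostname
# JSON Patch that appends both flags to the args list of container 0.
PATCH='[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-preferred-address-types=InternalIP"}
]'

# Hypothetical wrapper: apply the patch to the deployment edited above.
patch_metrics_server() {
  kubectl -n kube-system patch deployment metrics-server --type=json -p "$PATCH"
}
```

Calling patch_metrics_server has the same effect as adding the two lines by hand under "args" and saving. Note that --kubelet-insecure-tls turns off certificate verification between metrics-server and the kubelets, so it trades security for convenience.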
After the changes are applied to the cluster, wait 2-3 minutes for the metrics to be fetched. You can check whether metrics-server is working by fetching the metrics for nodes or pods with the kubectl top node or kubectl top pods command. A successful output will be something similar to this: