kubernetes 1.18.x metrics-server 采集不到数据
Kubernetes 1.18.x版本部署metrics-server组件时,采集不到数据
表现为:
kubectl top nodes
,kubectl top nodes
报错信息如下Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
查看kube-apiserver的日志信息kubectl -n kube-system logs -f kube-apiserver-master-1 --tail 10
E0826 04:25:10.111976 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0826 04:25:15.112635 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0826 04:25:37.762783 1 handler_proxy.go:102] no RequestInfo found in the context
E0826 04:25:37.762890 1 controller.go:114] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0826 04:25:37.762915 1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0826 04:25:41.763211 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0826 04:25:46.764318 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0826 04:26:11.763745 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0826 04:26:16.764339 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0826 04:26:41.763920 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0826 04:26:46.764574 1 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: Get https://10.108.243.54:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
这段日志看不出什么问题。
最终解决:kube-apiserver添加--enable-aggregator-routing=true
启动参数,原因不明,因为在我另一个k8s 1.16.x集群中,并没有添加这项参数,工作正常。有知道原因的伙伴请不吝告知,在此谢过。
PS:网上有文章说如果master节点没有运行kube-proxy进程才需要加上这个启动参数,而我的集群中master节点是有运行kube-proxy的。
参考链接1:https://github.com/kubernetes-sigs/metrics-server/issues/448
参考链接2:https://blog.z0ukun.com/?p=1462