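Project scenario:
Prometheus reports monitoring errors for kube-scheduler and kube-controller-manager
Problem description:
After deploying kube-prometheus on a cluster built with kubeadm, Prometheus reports errors for these two components.
Root cause analysis: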
root@master2:~# kubectl describe servicemonitor -n kube-system kube-controller-manager
From the output of the command above, we can see that the ServiceMonitor looks in the kube-system namespace for a Service carrying the app.kubernetes.io/name label.
root@master2:~# kubectl get svc -n kube-system -l app.kubernetes.io/name=kube-controller-manager
No resources found in kube-system namespace.
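No Service in kube-system carries this label.
Solution:
1) Change the listening address to 0.0.0.0
This cluster was installed with kubeadm, so kube-controller-manager runs as a static Pod: saving the manifest file is enough for the change to take effect. Apply it on every master node.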
Change - --bind-address=127.0.0.1 to - --bind-address=0.0.0.0
root@master2:~# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=0.0.0.0
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
...
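The manual edit above can also be scripted, which helps when several master nodes need the same change. The following is a minimal sketch, assuming GNU sed and the default kubeadm manifest paths; the helper name set_bind_address is hypothetical and not part of any tool. It also covers kube-scheduler, on the assumption that its manifest lives at /etc/kubernetes/manifests/kube-scheduler.yaml.

```shell
# Hypothetical helper: switch --bind-address from 127.0.0.1 to 0.0.0.0
# in a static Pod manifest. Assumes GNU sed (in-place editing with -i).
set_bind_address() {
  manifest="$1"
  sed -i 's/--bind-address=127\.0\.0\.1/--bind-address=0.0.0.0/' "$manifest"
}

# Default kubeadm manifest locations (an assumption; adjust if yours differ).
# Manifests that do not exist on this node are skipped.
for component in kube-controller-manager kube-scheduler; do
  manifest="/etc/kubernetes/manifests/${component}.yaml"
  if [ -f "$manifest" ]; then
    set_bind_address "$manifest"
  fi
done
```

Run it as root on each master node. The kubelet watches the manifests directory and recreates the static Pod after the file changes.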
2) Create a Service that carries the expected label
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    app.kubernetes.io/name: kube-controller-manager
  name: kube-controller-manager-monitoring
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.1.27
  - ip: 192.168.1.28
  - ip: 192.168.1.29
  ports:
  - name: https-metrics
    port: 10257
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: kube-controller-manager
  name: kube-controller-manager-monitoring
  namespace: kube-system
spec:
  ports:
  - name: https-metrics
    port: 10257
    protocol: TCP
    targetPort: 10257
  sessionAffinity: None
  type: ClusterIP
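The label and the port must both match what the ServiceMonitor expects.
The fix for kube-scheduler is the same as for kube-controller-manager.
Checking again, the errors are gone.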
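Since the scheduler needs the same pair of objects, the manifest can be generated from a small template rather than copied by hand. This is a sketch under stated assumptions: the function name render_monitoring_manifest is hypothetical, 10259 is assumed to be the kube-scheduler secure port, and the label value kube-scheduler and port name https-metrics are assumed to match the scheduler's ServiceMonitor, so check them against your own ServiceMonitor first.

```shell
# Hypothetical helper: print an Endpoints + Service pair for a control-plane
# component. Usage: render_monitoring_manifest NAME PORT IP [IP...]
render_monitoring_manifest() {
  name="$1"; port="$2"; shift 2
  cat <<EOF
apiVersion: v1
kind: Endpoints
metadata:
  name: ${name}-monitoring
  namespace: kube-system
subsets:
- addresses:
EOF
  # One address entry per master node IP.
  for ip in "$@"; do
    printf '  - ip: %s\n' "$ip"
  done
  cat <<EOF
  ports:
  - name: https-metrics
    port: ${port}
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: ${name}
  name: ${name}-monitoring
  namespace: kube-system
spec:
  ports:
  - name: https-metrics
    port: ${port}
    protocol: TCP
    targetPort: ${port}
  sessionAffinity: None
  type: ClusterIP
EOF
}

# Example: scheduler manifest for the three master IPs used in this post.
render_monitoring_manifest kube-scheduler 10259 192.168.1.27 192.168.1.28 192.168.1.29
```

The output can be reviewed and then piped to kubectl apply -f - on a node with cluster access. The Endpoints and the Service share the same name, which is what links them when the Service has no selector.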