Overview
Deploy Prometheus + Grafana by following the official documentation
GitHub - prometheus-operator/kube-prometheus at release-0.10
Environment
Steps
Download the official GitHub repository
git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus
git checkout release-0.10
Enter the working directory
cd manifests
mkdir -p adapter alertmanager blackbox grafana kube-state-metrics node-exporter operator prometheus
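Change the image addresses
The default images can be hard to pull, so finding a working mirror takes some trial and error.
prometheusAdapter-deployment.yaml
Change the image to thinkingdata/prometheus-adapter:v0.10.0 (this image pulls successfully)
kubeStateMetrics-deployment.yaml
Change the image to bitnami/kube-state-metrics:2.7.0 (this image pulls successfully)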
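The two image changes above can also be scripted. The following is a minimal sketch, not part of the official instructions: replace_image is a hypothetical helper that rewrites only those image: lines whose value contains a given name fragment, so other containers in the same Deployment are left alone. It assumes each image sits on its own "image:" line, as in the generated manifests.

```shell
# Hypothetical helper: rewrite every "image:" line in a manifest whose value
# contains the given name fragment, leaving other containers untouched.
# Usage: replace_image <file> <name-fragment> <new-image>
replace_image() {
  local file="$1" fragment="$2" new_image="$3"
  # -i.bak keeps a backup copy and is accepted by both GNU and BSD sed.
  sed -E -i.bak \
    "s|^([[:space:]]*(- )?image:[[:space:]]*).*${fragment}.*\$|\1${new_image}|" \
    "$file"
}

# Example (run from kube-prometheus/manifests):
# replace_image prometheusAdapter-deployment.yaml prometheus-adapter thinkingdata/prometheus-adapter:v0.10.0
# replace_image kubeStateMetrics-deployment.yaml kube-state-metrics bitnami/kube-state-metrics:2.7.0
```

Checking the result with grep image: on each file before deploying is a cheap way to confirm that only the intended line changed.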
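Deploy the services
Deploy following the official documentation. The paths below are relative to the repository root (kube-prometheus), so leave the manifests directory first.
cd ..
kubectl apply --server-side -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl apply -f manifests/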
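Before moving on to verification it helps to wait until the pods are actually running. The sketch below wraps kubectl wait in a hypothetical helper, wait_for_monitoring; the monitoring namespace is the one kube-prometheus deploys into, and the 300s timeout is an arbitrary assumption.

```shell
# Hypothetical helper: block until every pod in the namespace reports Ready,
# then list the pods together with the nodes they were scheduled on.
# Usage: wait_for_monitoring [namespace] [timeout]
wait_for_monitoring() {
  local ns="${1:-monitoring}" timeout="${2:-300s}"
  kubectl -n "$ns" wait --for=condition=Ready pod --all --timeout="$timeout" \
    && kubectl -n "$ns" get pods -o wide
}

# Example:
# wait_for_monitoring
```

If the wait times out, kubectl -n monitoring describe pod on the stuck pod usually shows whether an image pull is the cause.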
Verification
Exposing the services via NodePort
Change the Service type from ClusterIP to NodePort
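One way to make this change without editing the manifests is kubectl patch. The following is a sketch under assumptions: expose_nodeport is a hypothetical helper, and grafana and prometheus-k8s are the Service names kube-prometheus creates in the monitoring namespace.

```shell
# Hypothetical helper: switch the named Services in the monitoring namespace
# to type NodePort, then print them so the allocated node ports are visible.
# Usage: expose_nodeport <service>...
expose_nodeport() {
  local ns="monitoring" svc
  for svc in "$@"; do
    kubectl -n "$ns" patch svc "$svc" -p '{"spec":{"type":"NodePort"}}'
  done
  kubectl -n "$ns" get svc "$@"
}

# Example:
# expose_nodeport grafana prometheus-k8s
```

A patch made this way is overwritten the next time the original manifests are applied, so for a lasting change the Service YAML files should be edited as well.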
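Verify access from a browser:
The services were reachable only through specific nodes.
As shown in the screenshots:
- Grafana and Prometheus accessed through node04.
- Prometheus accessed through master03.
The default login username and password are both admin.
On first login Grafana asks you to reset the password; it can be reset to admin again.
Exposing the services via Ingress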
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prom-ingress
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: alert.k8s.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
  - host: grafana.k8s.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000
  - host: prom.k8s.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
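Applying the manifest and confirming that the Ingress object exists could look like the sketch below. apply_prom_ingress is a hypothetical helper, and the default file name ingress.yml is an assumption taken from the error message quoted later in this post.

```shell
# Hypothetical helper: apply the Ingress manifest and show the resulting
# object, including its hosts and the address assigned by the controller.
# Usage: apply_prom_ingress [manifest-file]
apply_prom_ingress() {
  local manifest="${1:-ingress.yml}"
  kubectl apply -f "$manifest" \
    && kubectl -n monitoring get ingress prom-ingress
}

# Example:
# apply_prom_ingress ingress.yml
```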
Configure DNS resolution for the three hostnames
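Without a DNS server, a hosts-file entry on the client machine is usually enough. The sketch below only prints the lines; hosts_entries is a hypothetical helper, and the address passed to it must be one where the ingress-nginx controller is reachable in your environment (192.0.2.10 in the example is a documentation placeholder, not a real value).

```shell
# Hypothetical helper: print hosts-file lines that map the three Ingress
# hostnames to a single address.
# Usage: hosts_entries <ingress-address>
hosts_entries() {
  local ip="$1" host
  for host in alert.k8s.com grafana.k8s.com prom.k8s.com; do
    printf '%s %s\n' "$ip" "$host"
  done
}

# Example (appends to the local hosts file):
# hosts_entries 192.0.2.10 | sudo tee -a /etc/hosts
```

Printing instead of writing directly keeps the helper harmless and lets you review the lines before touching /etc/hosts.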
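Configuring Grafana
For the first login, use admin as both the username and the password. For a detailed tutorial, see: Grafana fundamentals | Grafana Labs
Test:
Importing a basic dashboard
Chinese-language dashboard
K8S Dashboard CN 20211010 StarsL.cn | Grafana Labs
Import the dashboard
Import the JSON file you just downloaded.
Finding other dashboards
Dashboards | Grafana Labs
Result:
Configuring email alerts
Find where the grafana.ini configuration is stored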
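One way to look at the configuration currently in the cluster is sketched below. It assumes that this release stores grafana.ini in a Secret named grafana-config in the monitoring namespace (the object defined by grafana-config.yaml); if your version uses a ConfigMap instead, the command needs adjusting. show_grafana_ini is a hypothetical helper name.

```shell
# Hypothetical helper: print the grafana.ini stored in the cluster.
# Assumption: it lives in the Secret "grafana-config" under the key
# "grafana.ini", so the value has to be base64-decoded.
show_grafana_ini() {
  kubectl -n monitoring get secret grafana-config \
    -o jsonpath='{.data.grafana\.ini}' | base64 --decode
}

# Example:
# show_grafana_ini
```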
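Edit the configuration file and redeploy it: kubectl apply -f grafana-config.yaml
Configuring SMTP email in Grafana - 乱七八糟博客备份 - cnblogs.com
Grafana configuration file explained - woaibaobei - cnblogs.com
Configuring Grafana email alerts - lee_yanyi's blog, CSDN
Delete the existing Grafana pod. This step is required: the new configuration is only loaded when the pod is recreated.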
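Deleting the pod by label avoids looking up its generated name. This is a sketch under assumptions: reload_grafana is a hypothetical helper, the label app.kubernetes.io/name=grafana and the Deployment name grafana are what kube-prometheus uses as far as I can tell, and the 120s timeout is arbitrary.

```shell
# Hypothetical helper: delete the running Grafana pod(s) so the Deployment
# recreates them with the updated configuration, then wait for the rollout.
reload_grafana() {
  kubectl -n monitoring delete pod -l app.kubernetes.io/name=grafana \
    && kubectl -n monitoring rollout status deployment/grafana --timeout=120s
}

# Example:
# reload_grafana
```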
Configure SMTP for sending mail (other mail providers are configured the same way)
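For orientation, the [smtp] section of grafana.ini generally looks like the output of the sketch below. smtp_section is a hypothetical helper that only prints text; the host, user and sender values in the example are placeholders, and CHANGE_ME stands for the password or authorization code issued by your mail provider.

```shell
# Hypothetical helper: print an [smtp] section for grafana.ini.
# Usage: smtp_section <host:port> <user> <from-address>
smtp_section() {
  local host="$1" user="$2" from="$3"
  cat <<EOF
[smtp]
enabled = true
host = ${host}
user = ${user}
password = CHANGE_ME
from_address = ${from}
from_name = Grafana
EOF
}

# Example:
# smtp_section smtp.example.com:465 alerts@example.com alerts@example.com
```

The printed section would be merged into the grafana.ini content of grafana-config.yaml before redeploying it.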
Test connectivity
Installing plugins
Errors
Ingress deployment fails
Error from server (InternalError): error when creating "ingress.yml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Posetworking/v1/ingresses?timeout=10s": dial tcp 10.96.222.96:443: connect: connection refused
Delete the failed (Evicted) pods
kubectl get pod -n ingress-nginx | grep Evicted | awk '{print $1}' | xargs kubectl delete pod -n ingress-nginx
Recreate the controller pod by deleting it
kubectl delete pod/ingress-nginx-controller-6ff65d977f-q2kw9 -n ingress-nginx
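Before re-applying the Ingress, it is worth confirming that the admission webhook has a backend again. A minimal sketch, assuming a standard ingress-nginx installation (controller pods labelled app.kubernetes.io/component=controller and a Service named ingress-nginx-controller-admission); check_ingress_admission is a hypothetical helper and the timeout is arbitrary.

```shell
# Hypothetical helper: wait for the ingress-nginx controller pod to become
# Ready, then show the endpoints behind the admission webhook Service.
check_ingress_admission() {
  kubectl -n ingress-nginx wait --for=condition=Ready pod \
    -l app.kubernetes.io/component=controller --timeout=120s \
    && kubectl -n ingress-nginx get endpoints ingress-nginx-controller-admission
}

# Example:
# check_ingress_admission
```

An empty endpoints list means the webhook call will still be refused, which matches the "connection refused" error above.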
Notes
Deploying Kubernetes monitoring with kube-prometheus (latest version) - 净夜凡尘's blog, CSDN
Deploy Grafana on Kubernetes | Grafana documentation
Deploying Prometheus to monitor a Kubernetes cluster, the traditional approach - ghostwritten's blog, CSDN
GitHub - starsliao/Prometheus: Grafana Dashboards for Prometheus Exporter