Deploying vmalert and PrometheusAlert on a Kubernetes cluster for DingTalk alerting

Prerequisites

Install the following packages: git, kubectl, helm, helm-docs. See this tutorial.

1. Install helm

wget https://xxx-xx.oss-cn-xxx.aliyuncs.com/helm-v3.8.1-linux-amd64.tar.gz
tar xvzf helm-v3.8.1-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin
rm -rf linux-amd64

2. Install victoria-metrics-alert

(1) Add the helm chart repository with the following commands

helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update

(2) List the versions of vm/victoria-metrics-alert that are available to install

helm search repo vm/victoria-metrics-alert -l

(3) Export the default values of the victoria-metrics-alert chart to the file values.yaml

helm show values vm/victoria-metrics-alert > values.yaml

(4) Change the values in values.yaml to suit your environment; a complete reference configuration follows

# Default values for victoria-metrics-alert.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name:
  # mount API token to pod directly
  automountToken: true

imagePullSecrets: []

rbac:
  create: true
  pspEnabled: true
  namespaced: false
  extraLabels: {}
  annotations: {}

server:
  name: server
  enabled: true
  image:
    repository: victoriametrics/vmalert
    tag: "" # rewrites Chart.AppVersion
    pullPolicy: IfNotPresent
  nameOverride: ""
  fullnameOverride: ""

  ## See `kubectl explain poddisruptionbudget.spec` for more
  ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
  podDisruptionBudget:
    enabled: false
    # minAvailable: 1
    # maxUnavailable: 1
    labels: {}

  # -- Additional environment variables (ex.: secret tokens, flags) https://github.com/VictoriaMetrics/VictoriaMetrics#environment-variables
  env:
    []
    # - name: VM_remoteWrite_basicAuth_password
    #   valueFrom:
    #     secretKeyRef:
    #       name: auth_secret
    #       key: password

  replicaCount: 1

  # deployment strategy, set to standard k8s default
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

  # specifies the minimum number of seconds for which a newly created Pod should be ready without any of its containers crashing/terminating
  # 0 is the standard k8s default
  minReadySeconds: 0

  # vmalert reads metrics from source, next section represents its configuration. It can be any service which supports
  # MetricsQL or PromQL.
  datasource:
    url: "http://192.168.47.9:8481/select/0/prometheus/"
    basicAuth:
      username: ""
      password: ""

  remote:
    write:
      url: ""
    read:
      url: ""

  notifier:
    alertmanager:
      url: "http://192.168.112.68:9093"

  extraArgs:
    envflag.enable: "true"
    envflag.prefix: VM_
    loggerFormat: json

  # Additional hostPath mounts
  extraHostPathMounts:
    []
    # - name: certs-dir
    #   mountPath: /etc/kubernetes/certs
    #   subPath: ""
    #   hostPath: /etc/kubernetes/certs
    #   readOnly: true

  # Extra Volumes for the pod
  extraVolumes:
    []
    #- name: example
    #  configMap:
    #    name: example

  # Extra Volume Mounts for the container
  extraVolumeMounts:
    []
    # - name: example
    #   mountPath: /example

  extraContainers:
    []
    #- name: config-reloader
    #  image: reloader-image

  service:
    annotations: {}
    labels: {}
    clusterIP: ""
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    externalIPs: []
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    servicePort: 8880
    type: ClusterIP
    # Ref: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
    # externalTrafficPolicy: "local"
    # healthCheckNodePort: 0

  ingress:
    enabled: false
    annotations: {}
    #   kubernetes.io/ingress.class: nginx
    #   kubernetes.io/tls-acme: 'true'

    extraLabels: {}
    hosts: []
    #   - name: vmselect.local
    #     path: /select
    #     port: http
    tls: []
    #   - secretName: vmselect-ingress-tls
    #     hosts:
    #       - vmselect.local

    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # -- pathType is only for k8s >= 1.1=
    pathType: Prefix

  podSecurityContext: {}
  # fsGroup: 2000

  securityContext:
    {}
    # capabilities:
    #   drop:
    #   - ALL
    # readOnlyRootFilesystem: true
    # runAsNonRoot: true
    # runAsUser: 1000

  resources:
    {}
    # We usually recommend not to specify default resources and to leave this as a conscious
    # choice for the user. This also increases chances charts run on environments with little
    # resources, such as Minikube. If you do want to specify resources, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi

  # Annotations to be added to the deployment
  annotations: {}
  # labels to be added to the deployment
  labels: {}

  # Annotations to be added to pod
  podAnnotations: {}
  podLabels: {}

  nodeSelector: {}
  priorityClassName: ""
  tolerations: []

  affinity: {}

  # vmalert alert rules configuration:
  # use existing configmap if specified
  # otherwise .config values will be used
  configMap: ""
  config:
    alerts:
      groups:
        - name: Disk mount error
          rules:
            - alert: Disk mount error
              expr: mount_error == 1
              for: 1m
              labels:
                level: 1
                severity: warning
              annotations:
                description: "Disk mount error on node {{$labels.instance}} of chain {{$labels.job}}!"

serviceMonitor:
  enabled: false
  extraLabels: {}
  annotations: {}
  #    interval: 15s
  #    scrapeTimeout: 5s
  # -- Commented. HTTP scheme to use for scraping.
  #    scheme: https
  # -- Commented. TLS configuration to use when scraping the endpoint
  #    tlsConfig:
  #      insecureSkipVerify: true

alertmanager:
  enabled: true
  replicaCount: 1
  podMetadata:
    labels: {}
    annotations: {}
  image: prom/alertmanager
  tag: v0.20.0
  retention: 120h
  nodeSelector: {}
  priorityClassName: ""
  resources: {}
  tolerations: []
  imagePullSecrets: []
  podSecurityContext: {}
  extraArgs: {}
  # key: value

  # external URL, that alertmanager will expose to receivers
  baseURL: ""
  # use existing configmap if specified
  # otherwise .config values will be used
  configMap: ""
  config:
    global:
      resolve_timeout: 5m
    route:
      # default receiver
      receiver: ops_notify
      # tag to group by
      group_by: [alertname]
      # How long to initially wait to send a notification for a group of alerts
      group_wait: 30s
      # How long to wait before sending a notification about new alerts that are added to a group
      group_interval: 60s
      # How long to wait before sending a notification again if it has already been sent successfully for an alert
      repeat_interval: 1h
    receivers:
      - name: ops_notify
        webhook_configs:
          - url: http://192.168.157.59:8080/prometheusalert?type=dd&tpl=prometheus-dd&split=false
            send_resolved: true
    inhibit_rules:
      - source_match:
          severity: 'warning'
        target_match:
          severity: 'warning'
        equal: ['alertname', 'job']

  templates: {}
  #  alertmanager.tmpl: |-

  service:
    annotations: {}
    type: ClusterIP
    port: 9093
    # if you want to force a specific nodePort. Must be used with service.type=NodePort
    # nodePort:

  ingress:
    enabled: false
    annotations: {}
    #   kubernetes.io/ingress.class: nginx
    #   kubernetes.io/tls-acme: 'true'
    extraLabels: {}
    hosts: []
    #   - name: alertmanager.local
    #     path: /
    #     port: web
    tls: []
    #   - secretName: alertmanager-ingress-tls
    #     hosts:
    #       - alertmanager.local

    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # -- pathType is only for k8s >= 1.1=
    pathType: Prefix

  persistentVolume:
    # -- Create/use Persistent Volume Claim for alertmanager component. Empty dir if false
    enabled: false
    # -- Array of access modes. Must match those of existing PV or dynamic provisioner. Ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
    accessModes:
      - ReadWriteOnce
    # -- Persistent volume annotations
    annotations: {}
    # -- StorageClass to use for persistent volume. Requires alertmanager.persistentVolume.enabled: true. If defined, PVC created automatically
    storageClass: ""
    # -- Existing Claim name. If defined, PVC must be created manually before volume will be bound
    existingClaim: ""
    # -- Mount path. Alertmanager data Persistent Volume mount root path.
    mountPath: /data
    # -- Mount subpath
    subPath: ""
    # -- Size of the volume. Better to set the same as resource limit memory property.
    size: 50Mi
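Helm merges a file passed with -f over the chart's own defaults, so a much smaller file that holds only the changed keys is usually enough. The following is a minimal sketch, not the author's file: it covers only the vmalert server settings (datasource, notifier, one rule) from the listing above and leaves the alertmanager section out. The file name values-override.yaml is a hypothetical choice.

```shell
#!/bin/sh
# Write only the server keys that differ from the chart defaults.
# The file name is an assumption; any name works with `helm -f`.
OVERRIDE_FILE="${OVERRIDE_FILE:-values-override.yaml}"

# Quoted heredoc delimiter: the shell must not expand $labels in the rule text.
cat > "$OVERRIDE_FILE" <<'EOF'
server:
  datasource:
    url: "http://192.168.47.9:8481/select/0/prometheus/"
  notifier:
    alertmanager:
      url: "http://192.168.112.68:9093"
  config:
    alerts:
      groups:
        - name: Disk mount error
          rules:
            - alert: Disk mount error
              expr: mount_error == 1
              for: 1m
              labels:
                level: 1
                severity: warning
              annotations:
                description: "Disk mount error on node {{$labels.instance}} of chain {{$labels.job}}!"
EOF

echo "wrote $OVERRIDE_FILE; it can be passed to helm like this:"
echo "helm install vmalert vm/victoria-metrics-alert -f $OVERRIDE_FILE -n victoria-metrics"
```

Keeping the override file small makes later chart upgrades easier to review, because new upstream defaults are not shadowed by a stale full copy.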

(5) Test the installation with a dry run:

helm install vmalert vm/victoria-metrics-alert -f values.yaml -n victoria-metrics --debug --dry-run

(6) Install with the following command

helm install vmalert vm/victoria-metrics-alert -f values.yaml -n victoria-metrics

(7) Get the list of pods by running the following command

kubectl get pods -A | grep 'alert'

(8) List the installed release by running the following command

helm list -f vmalert -n victoria-metrics

(9) View the revision history of the vmalert release

helm history vmalert -n victoria-metrics

(10) Update the configuration

cd /root/vmalert
# Edit the values.yaml file
helm upgrade vmalert vm/victoria-metrics-alert -f values.yaml -n victoria-metrics

(11) View the services

kubectl get svc -n victoria-metrics

3. Install PrometheusAlert

(1) Deploy with helm

git clone https://github.com/feiyu563/PrometheusAlert.git
cd PrometheusAlert/example/helm/prometheusalert
# To change the configuration, update app.conf in the config directory
helm install -n victoria-metrics prometheus-alert .

(2) Reference values.yaml configuration

cat /root/PrometheusAlert/example/helm/prometheusalert/values.yaml

# Default values for prometheusalert.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

global:
  imagePullSecrets: []
  # - name: "registry-secret"

replicaCount: 1

image:
  # Custom templates require rebuilding the image, or use the image I built: lusson/prometheusalert:v1.0
  repository: feiyu563/prometheus-alert:v4.8
  pullPolicy: IfNotPresent

nameOverride: ""
fullnameOverride: ""

service:
  type: ClusterIP
  port: 8080

ingress:
  enabled: true
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: prometheusalert.xxxxx.com
      paths: ["/"]

  tls: []

resources:
  limits:
    cpu: 1000m
    memory: 1024Mi
  requests:
    cpu: 100m
    memory: 128Mi

nodeSelector: {}

tolerations: []

affinity: {}

(3) Reference app.conf configuration

cat /root/PrometheusAlert/example/helm/prometheusalert/config/app.conf

(4) Reference ingress.yaml configuration

cat /root/PrometheusAlert/example/helm/prometheusalert/templates/ingress.yaml

{{- if .Values.ingress.enabled -}}
{{- $fullName := include "prometheusalert.fullname" . -}}
{{- $svcPort := .Values.service.port -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ $fullName }}
  labels:
{{ include "prometheusalert.labels" . | indent 4 }}
  {{- with .Values.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
{{- if .Values.ingress.tls }}
  tls:
  {{- range .Values.ingress.tls }}
    - hosts:
      {{- range .hosts }}
        - {{ . | quote }}
      {{- end }}
      secretName: {{ .secretName }}
  {{- end }}
{{- end }}
  rules:
  {{- range .Values.ingress.hosts }}
    - host: {{ .host | quote }}
      http:
        paths:
        {{- range .paths }}
          - path: {{ . }}
            pathType: Prefix
            backend:
              service:
                name: {{ $fullName }}
                port:
                  number: {{ $svcPort }}
        {{- end }}
  {{- end }}
{{- end }}

(5) Update the configuration

cd /root/PrometheusAlert/example/helm/prometheusalert
helm upgrade -n victoria-metrics prometheus-alert .

(6) Restart the pod

# Delete the release, which removes its pod
helm delete prometheus-alert -n victoria-metrics
# View pods and services
kubectl get pods -n victoria-metrics
kubectl get svc -n victoria-metrics
# Reinstall
helm install -n victoria-metrics prometheus-alert .
# View pods and services
kubectl get pods -n victoria-metrics
kubectl get svc -n victoria-metrics

(1) Alert test

Template content:

{{ $var := .externalURL}}{{ range $k,$v:=.alerts }}
{{if eq $v.status "resolved"}}
## [Inspection recovery]({{$v.generatorURL}})
#### [{{$v.labels.alertname}}]({{$var}})
###### Alert level: {{$v.labels.level}}
###### Start time: {{$v.startsAt}}
###### Affected host: {{$v.labels.instance}}
##### {{$v.annotations.description}}
{{else}}
## [Inspection alert]({{$v.generatorURL}})
#### [{{$v.labels.alertname}}]({{$var}})
###### Alert level: {{$v.labels.level}}
###### Start time: {{$v.startsAt}}
###### Affected host: {{$v.labels.instance}}
##### {{$v.annotations.description}}
{{end}}
{{ end }}
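To exercise the template without waiting for a real alert, one option is to hand PrometheusAlert a payload shaped like an Alertmanager webhook notification. The sketch below rests on several assumptions: every label and annotation value is invented, the file name test-alert.json is arbitrary, and the request is sent only when PROMETHEUSALERT_URL is set, for instance to the webhook URL configured for the ops_notify receiver.

```shell
#!/bin/sh
# Build a sample payload in the Alertmanager webhook format.
# All values are made up; only the field names matter to the template.
PAYLOAD_FILE="${PAYLOAD_FILE:-test-alert.json}"

cat > "$PAYLOAD_FILE" <<'EOF'
{
  "version": "4",
  "status": "firing",
  "receiver": "ops_notify",
  "externalURL": "http://192.168.112.68:9093",
  "groupLabels": {"alertname": "Disk mount error"},
  "commonLabels": {},
  "commonAnnotations": {},
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "Disk mount error",
        "level": "1",
        "severity": "warning",
        "instance": "10.0.0.1:9100",
        "job": "demo"
      },
      "annotations": {"description": "Test alert sent by hand"},
      "startsAt": "2023-08-14T08:00:00Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://vmalert.example.invalid/"
    }
  ]
}
EOF

# Send only when a target is given, for example:
# PROMETHEUSALERT_URL='http://192.168.157.59:8080/prometheusalert?type=dd&tpl=prometheus-dd&split=false'
if [ -n "${PROMETHEUSALERT_URL:-}" ]; then
  curl -sS -X POST -H 'Content-Type: application/json' \
    --data @"$PAYLOAD_FILE" "$PROMETHEUSALERT_URL"
else
  echo "payload written to $PAYLOAD_FILE; set PROMETHEUSALERT_URL to send it"
fi
```

Changing the inner "status" value to "resolved" should exercise the recovery branch of the template instead of the alert branch.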

(2) View the logs
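If no message arrives in DingTalk, the pod logs are a reasonable first place to look. The following sketch tails every pod in the namespace whose name contains "alert", the same filter used when listing pods earlier. The function name tail_alert_logs and the 50-line limit are arbitrary choices, not part of either chart.

```shell
#!/bin/sh
NAMESPACE="${NAMESPACE:-victoria-metrics}"

# Print the last lines of every pod whose name contains "alert".
tail_alert_logs() {
  kubectl get pods -n "$NAMESPACE" -o name | grep alert | while read -r pod; do
    echo "==> $pod"
    kubectl logs -n "$NAMESPACE" "$pod" --all-containers --tail=50
  done
}

# Run only where kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  tail_alert_logs
fi
```

Matching on the pod name avoids hard-coding deployment names, which depend on the release name chosen at install time.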

(3) View the DingTalk alert

Reference: https://github.com/VictoriaMetrics/helm-charts/tree/master/charts/victoria-metrics-alert

Reference: https://github.com/feiyu563/PrometheusAlert/tree/master/example/helm/prometheusalert
