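I. Introduction to Prometheus
Prometheus is an open-source monitoring system that collects, stores, and queries time-series data so that systems can be monitored and analyzed. Its architecture has four main components:
1. Prometheus Server: the core component. It scrapes metrics from each target, then stores, aggregates, and serves queries over that data.
2. Client Libraries: Prometheus provides client libraries in several languages for embedding metric collection in an application.
3. Exporters: components that expose monitoring data from third-party systems in the Prometheus format. Many exporters are available, for example Node Exporter, MySQL Exporter, and HAProxy Exporter.
4. Alertmanager: the alerting component. It receives the alerts that Prometheus Server fires from user-defined rules, then groups, routes, and sends them as notifications.

Features of Prometheus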
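1. Flexible data model: every time series is identified by a metric name plus a set of key-value labels, so one metric can be described along several dimensions.
2. Efficient storage and querying: Prometheus ships its own time-series database, which stores and queries large volumes of metric data efficiently.
3. Visualization and alerting: Prometheus provides a web UI and an HTTP API for displaying and querying monitoring data.
4. Extensibility: the architecture is modular, so you can pick and configure only the components you need.
5. CNCF project: Prometheus is a CNCF project. It has broad attention and support, and contributors from around the world take an active part in its development.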
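To make the data model concrete, the sketch below renders one sample in the Prometheus text exposition format: a metric name, a label set in braces, and a value. The function names and the metric `http_requests_total` are illustrative assumptions, not something defined in this document, and the escaping covers only backslash, double quote, and newline in label values.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name: str, labels: dict, value) -> str:
    """Render one sample line: name{label="value",...} value."""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(f'{k}="{escape_label_value(str(v))}"' for k, v in labels.items())
    return f"{name}{{{pairs}}} {value}"


# Hypothetical metric with two labels.
line = format_sample("http_requests_total", {"method": "GET", "code": "200"}, 1027)
print(line)
```

Each distinct combination of label values is a separate time series, which is why labels with unbounded values (user IDs, request IDs) are usually avoided.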
II. Deploying Prometheus
1. node_exporter deployment

1) Download
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz

2) Extract and install
tar -xf node_exporter-1.8.2.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/node_exporter-1.8.2.linux-amd64 /usr/local/node_exporter

3) Create the start script. Because --web.config.file is a relative path here, run the script from the directory that holds config.yml.
vim start_noder.sh
#!/bin/bash
/usr/local/node_exporter/node_exporter \
--collector.textfile.directory=/usr/local/node_exporter/tmp/ \
--web.config.file=config.yml \
--web.listen-address=0.0.0.0:19100

4) Appendix: config.yml (the account is admin with password 123456; every component in this document uses the same credentials)
cat config.yml
basic_auth_users:
  admin: $2y$12$Y9/tZwO8FJC2I.IPt47ufOwFZRNrjSOPk0rUtOhB97cXNdvCikFDW
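With basic authentication enabled, every client that reads /metrics has to send an Authorization header. The following is a minimal sketch of how such a request could be built with the Python standard library, assuming the exporter runs at 192.168.10.139:19100 (the address used for this host later in prometheus.yml). The function names are illustrative, and the actual network call is left commented out.

```python
import base64
import urllib.request


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_metrics_request(host: str, port: int, user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the exporter's /metrics page."""
    req = urllib.request.Request(f"http://{host}:{port}/metrics")
    req.add_header("Authorization", basic_auth_header(user, password))
    return req


req = build_metrics_request("192.168.10.139", 19100, "admin", "123456")
print(req.full_url)

# To fetch the metrics on a network where the exporter is reachable:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.read().decode("utf-8")[:200])
```

A request without the header, or with a wrong password, should be rejected with HTTP 401, which is a quick way to confirm that config.yml was picked up.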
2. proc_exporter deployment

1) Python 3 is required. Installing it through Anaconda3 is recommended; the steps are omitted here. Download page:
https://www.anaconda.com/download

2) Install prometheus_client
python3 -m pip install client_python-0.13.1.tar.gz

3) Create the systemd unit so that the service starts at boot
vim /usr/lib/systemd/system/proc_exporter.service
[Unit]
Description=proc_exporter
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /usr/local/proc_exporter/proc_exporter.py -c /usr/local/proc_exporter/proc_exporter.ini
Restart=on-failure

[Install]
WantedBy=multi-user.target

4) Edit the configuration file; add or remove business modules using the format below
vim proc_exporter.ini
## Process settings. Changes take effect without a restart.
[node_exporter]
## Process name: a keyword that uniquely identifies the process, e.g. node_exporter
name = node_exporter
## Process module: the subsystem or module the process belongs to, e.g. prometheus
moudle = prometheus
## Process owner: the developer who should step in when the process misbehaves
manager =
## Core file directory as an absolute path; leave empty to skip core file detection
directory =
## Core file name prefix
prefix =

5) Start
systemctl daemon-reload
systemctl enable proc_exporter.service
systemctl restart proc_exporter.service
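The source of proc_exporter.py is not shown in this document, so the following is only a sketch of how a script of this kind might read the INI file: one section per monitored process, with the keys listed above. The function name `load_processes` and the inline sample text are assumptions made for illustration.

```python
import configparser

# Sample text in the same layout as proc_exporter.ini (hypothetical content).
SAMPLE_INI = """\
## Process settings
[node_exporter]
name = node_exporter
moudle = prometheus
manager =
directory =
prefix =
"""


def load_processes(text: str) -> dict:
    """Return {section: {key: value}} for every process section in the INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


processes = load_processes(SAMPLE_INI)
for section, settings in processes.items():
    # An empty 'directory' means core file detection is skipped for this process.
    check_core = bool(settings.get("directory"))
    print(section, settings["name"], settings["moudle"], check_core)
```

Lines that start with `#` are treated as comments by `configparser`, so the `##` annotations in the file do not need to be stripped first.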
3. Alertmanager deployment

1) Download
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz

2) Extract and install
tar -xf alertmanager-0.27.0.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/alertmanager-0.27.0.linux-amd64 /usr/local/alertmanager

3) Create the systemd unit
vim /usr/lib/systemd/system/alertmanager.service
[Unit]
Description=alertmanager server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
Type=simple
User=root
Group=root
Restart=on-abnormal
ExecStart=/usr/local/alertmanager/alertmanager \
  --config.file=/usr/local/alertmanager/alertmanager.yml \
  --web.listen-address=0.0.0.0:19093 \
  --web.config.file=/usr/local/alertmanager/config.yml

[Install]
WantedBy=multi-user.target

4) Adjust the configuration file
vim alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.mail.139.com:25'  # SMTP server
  smtp_from: 'hly12599-alarm@139.com'  # sender address
  smtp_auth_username: 'hly12599-alarm@139.com'  # mailbox account
  smtp_auth_password: '23bb4dee88805e0fb400'  # mailbox password
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 5m
  repeat_interval: 3m
  receiver: 'alert-receiver'
  routes:
  - receiver: 'data'
    continue: true
templates:
- './templates/*.tmpl'
receivers:
- name: 'data'
  webhook_configs:
  - url: 'http://192.168.10.139:5000/alertinfo'
- name: 'alert-receiver'
  email_configs:
  - to: 15901283579@139.com
    send_resolved: true
inhibit_rules:
- source_match:
    severity: 'warning'
  target_match:
    severity: 'warning'
  equal: ['job', 'instance', 'severity']

Check the configuration:
./amtool check-config alertmanager.yml

5) Start
systemctl daemon-reload
systemctl enable alertmanager.service
systemctl restart alertmanager.service
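The `data` receiver posts every alert group as JSON to the webhook URL. The service behind /alertinfo is not described in this document, so the sketch below only illustrates how such a service might pull the useful fields out of the payload. The function name `summarize_alerts` and the sample alert `HostDown` are hypothetical; the field names (`alerts`, `status`, `labels`, `annotations`) follow the Alertmanager webhook payload.

```python
import json


def summarize_alerts(payload: dict) -> list:
    """Return one 'status alertname instance' line per alert in the payload."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(
            f"{alert.get('status', 'unknown')} "
            f"{labels.get('alertname', '?')} "
            f"{labels.get('instance', '?')}"
        )
    return lines


# Hypothetical request body, trimmed to the fields used above.
SAMPLE_BODY = json.dumps({
    "version": "4",
    "status": "firing",
    "receiver": "data",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "HostDown",
                "instance": "192.168.10.139:19100",
                "severity": "warning",
            },
            "annotations": {"summary": "hypothetical sample alert"},
        }
    ],
})

print(summarize_alerts(json.loads(SAMPLE_BODY)))
```

Because the route sets `continue: true`, an alert that matches the `data` route is still evaluated against any routes that follow it instead of stopping at the first match.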
4. pushgateway deployment

1) Download
wget https://github.com/prometheus/pushgateway/releases/download/v1.9.0/pushgateway-1.9.0.linux-amd64.tar.gz

2) Extract and install
tar -xf pushgateway-1.9.0.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/pushgateway-1.9.0.linux-amd64 /usr/local/pushgateway

3) Create the systemd unit
vim /usr/lib/systemd/system/pushgateway.service
[Unit]
Description=pushgateway
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=root
Group=root
Restart=always
ExecStart=/usr/local/pushgateway/pushgateway \
  --web.listen-address=0.0.0.0:19091 \
  --web.config.file=/usr/local/pushgateway/config.yml

[Install]
WantedBy=multi-user.target

4) Start
systemctl daemon-reload
systemctl enable pushgateway.service
systemctl restart pushgateway.service
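Once the Pushgateway is running, a batch job can push a metric to it with a plain HTTP PUT to /metrics/job/<job>. The following is a minimal sketch using only the standard library. The job name `backup_job` and the metric `backup_last_success_timestamp` are hypothetical, the job name is assumed to contain only URL-safe characters, and the request is built but not sent.

```python
import base64
import urllib.request


def build_push_request(base_url, job, metric_name, value, user, password):
    """Prepare a PUT request that replaces all metrics of one job group."""
    # The body uses the text exposition format and must end with a newline.
    body = f"# TYPE {metric_name} gauge\n{metric_name} {value}\n".encode("utf-8")
    req = urllib.request.Request(f"{base_url}/metrics/job/{job}", data=body, method="PUT")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "text/plain; version=0.0.4")
    return req


push = build_push_request(
    "http://192.168.10.139:19091",
    "backup_job",
    "backup_last_success_timestamp",
    1700000000,
    "admin",
    "123456",
)
print(push.get_method(), push.full_url)

# To send it on a network where the Pushgateway is reachable:
# urllib.request.urlopen(push, timeout=5)
```

PUT replaces every metric in the group, while POST replaces only metrics with the same name, so PUT is the simpler choice when a job always pushes its complete set of metrics.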
5. Prometheus deployment

1) Download
wget https://github.com/prometheus/prometheus/releases/download/v2.53.2/prometheus-2.53.2.linux-amd64.tar.gz

2) Extract and install
tar -xf prometheus-2.53.2.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/prometheus-2.53.2.linux-amd64 /usr/local/prometheus

3) Create the systemd unit
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
Type=simple
User=root
Group=root
Restart=on-abnormal
ExecStart=/usr/local/prometheus/prometheus \
  --config.file=/usr/local/prometheus/prometheus.yml \
  --web.listen-address=0.0.0.0:19090 \
  --web.config.file=/usr/local/prometheus/config.yml \
  --storage.tsdb.path=/usr/local/prometheus/data \
  --storage.tsdb.retention.time=180d \
  --web.console.templates=/usr/local/prometheus/consoles \
  --web.console.libraries=/usr/local/prometheus/console_libraries \
  --web.max-connections=512 \
  --web.enable-lifecycle

[Install]
WantedBy=multi-user.target

4) Start
systemctl daemon-reload
systemctl enable prometheus.service
systemctl restart prometheus.service

5) Edit the configuration file to add host monitoring and process monitoring
vim prometheus.yml
global:
  scrape_interval: 60s  # Scrape targets every 60 seconds. The default is every 1 minute.
  evaluation_interval: 30s  # Evaluate rules every 30 seconds. The default is every 1 minute.
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
- "./rules/*.yml"
scrape_configs:
- job_name: "node_host"
  basic_auth:
    username: admin
    password: "123456"
  scrape_interval: 1m
  static_configs:
  - targets: ["192.168.10.139:19100"]
- job_name: "proc_host"
  scrape_interval: 1m
  scrape_timeout: 1m
  metrics_path: /metrics
  static_configs:
  - targets: ["192.168.10.140:19001"]
- job_name: "alertmanager"
  basic_auth:
    username: admin
    password: "123456"
  static_configs:
  - targets: ["192.168.10.139:19093"]
- job_name: "pushgateway_server"
  basic_auth:
    username: admin
    password: "123456"
  honor_labels: true
  scrape_interval: 1m
  scrape_timeout: 1m
  static_configs:
  - targets: ["192.168.10.139:19091"]

6) Reload to apply the changes. The port must match --web.listen-address (19090), and the endpoint is available because --web.enable-lifecycle is set.
curl -X POST -u admin:123456 http://192.168.10.139:19090/-/reload
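Prometheus refuses to load a configuration in which a job's scrape_timeout is longer than its scrape_interval, so it can help to check the two values before reloading. The sketch below is a simplified check under stated assumptions: it accepts only single-unit durations such as `30s` or `1m`, not compound ones such as `1h30m`, and the function names are illustrative.

```python
import re

# Seconds per unit for the duration suffixes handled by this sketch.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_duration(text: str) -> float:
    """Convert a single-unit duration such as '60s' or '1m' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d|w|y)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def timeout_fits_interval(scrape_interval: str, scrape_timeout: str) -> bool:
    """True when the timeout is not longer than the interval."""
    return parse_duration(scrape_timeout) <= parse_duration(scrape_interval)


# Values taken from the proc_host job above: interval 1m, timeout 1m.
print(timeout_fits_interval("1m", "1m"))
```

For a full validation of the file, including rule files, `promtool check config prometheus.yml` from the Prometheus tarball is the more thorough option.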
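6. Grafana deployment

1) Download
wget https://dl.grafana.com/oss/release/grafana-10.3.7-1.x86_64.rpm

2) Install
rpm -Uvh grafana-10.3.7-1.x86_64.rpm

3) Change the port in the configuration file, then start the service. http_port is only read from the [server] section, so edit the existing entry there rather than appending a line to the end of the file.
sed -i -E 's/^;?http_port *=.*/http_port = 13000/' /etc/grafana/grafana.ini
systemctl daemon-reload
systemctl enable grafana-server.service
systemctl restart grafana-server.service

4) Open the Grafana web UI in a browser on port 13000.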