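Prometheus Deployment
- Server installation and configuration
- Deploying Node Exporter to monitor system metrics
- Monitoring MySQL
- Monitoring nginx
- Installing Grafana
Server installation and configuration
1. Upload the installation package and extract it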
cd /opt/
tar xf prometheus-2.30.3.linux-amd64.tar.gz
mv prometheus-2.30.3.linux-amd64 /usr/local/prometheus
cat /usr/local/prometheus/prometheus.yml
# my global config
global:                       # global Prometheus settings, such as scrape interval and scrape timeout
  scrape_interval: 15s        # how often to scrape metrics from targets; default is 1m
  evaluation_interval: 15s    # how often rules are evaluated to generate alerts; default is 1m
  # scrape_timeout is set to the global default (10s).
  scrape_timeout: 10s         # timeout for a single scrape; default is 10s

# Alertmanager configuration
alerting:                     # Alertmanager instances; supports static configuration and dynamic service discovery
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:                   # paths of alerting rule files to load; file name globs are supported
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:               # sources of time series data to scrape
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"    # each group of monitored instances is named by job_name; supports static configuration (static_configs) and dynamic service discovery (*_sd_configs)

    # metrics_path defaults to '/metrics'
    metrics_path: '/metrics'  # path the metrics are scraped from; default is /metrics
    # scheme defaults to 'http'.

    static_configs:           # static targets: always pull data from the listed targets
      - targets: ["localhost:9090"]
2. Create the systemd unit file
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/prometheus/prometheus \
--config.file=/usr/local/prometheus/prometheus.yml \
--storage.tsdb.path=/usr/local/prometheus/data/ \
--storage.tsdb.retention=15d \
--web.enable-lifecycle
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target
3. Start the service
systemctl start prometheus
systemctl enable prometheus
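4. Open the web UI
http://ip:9090/targets
In the page, click Status -> Targets. If every target shows the state UP, Prometheus is scraping data normally.
http://ip:9090/metrics
This page shows the metrics Prometheus exposes about itself. The HELP line explains what each metric means, and the TYPE line gives its data type.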
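The HELP and TYPE lines are ordinary comment lines in the text exposition format, so standard tools can pull them out. A minimal sketch, assuming a POSIX shell with awk; the helper name list_metric_types is illustrative and not part of Prometheus:

```shell
# list_metric_types: read Prometheus text-format metrics on stdin and
# print "<metric name> <type>" for every "# TYPE" line.
# (Hypothetical helper name; not shipped with Prometheus.)
list_metric_types() {
  awk '$1 == "#" && $2 == "TYPE" { print $3, $4 }'
}

# Usage against a running server (replace ip with the server address):
#   curl -s http://ip:9090/metrics | list_metric_types
```

Piping the /metrics page through this helper gives a quick inventory of which metrics are counters, gauges, histograms, or summaries.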
Deploying Node Exporter to monitor system metrics
1. Upload the installation package and extract it
cd /opt/
tar xf node_exporter-1.3.1.linux-amd64.tar.gz
mv node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin
vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/node_exporter \
--collector.ntp \
--collector.mountstats \
--collector.systemd \
--collector.tcpstat
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target
systemctl start node_exporter
systemctl enable node_exporter
netstat -natp | grep :9100
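2. Open the web UI
http://ip:9100/metrics
Commonly used metrics:
node_cpu_seconds_total
node_memory_MemTotal_bytes
node_filesystem_size_bytes{mountpoint=PATH}
node_systemd_unit_state{name=}
node_vmstat_pswpin: cumulative number of pages swapped in from disk to memory (apply rate() for a per-second value)
node_vmstat_pswpout: cumulative number of pages swapped out from memory to disk (apply rate() for a per-second value)
More about the available metrics: https://github.com/prometheus/node_exporter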
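Counters such as node_cpu_seconds_total are normally read through rate() rather than as raw values. A sketch of a CPU utilisation query, assuming Prometheus listens on 127.0.0.1:9090 as elsewhere in these notes; the helper name cpu_usage_query and the 5m window are illustrative choices:

```shell
# cpu_usage_query WINDOW: print a PromQL expression for per-instance CPU
# utilisation in percent, derived from the idle-mode counter.
# (Hypothetical helper; the window, e.g. 5m, is an assumption.)
cpu_usage_query() {
  printf '100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[%s])))\n' "$1"
}

# Usage: send the expression to the Prometheus HTTP query API.
#   curl -sG http://127.0.0.1:9090/api/v1/query \
#     --data-urlencode "query=$(cpu_usage_query 5m)"
```

The same expression can be pasted into the Graph page of the Prometheus web UI.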
# Edit the Prometheus configuration file to add the nodes to Prometheus monitoring
vim /usr/local/prometheus/prometheus.yml
# Append the following at the end of the file, under scrape_configs
  - job_name: nodes
    metrics_path: "/metrics"
    static_configs:
      - targets:
          - 192.168.200.141:9100
          - 192.168.200.138:9100
        labels:
          service: test
      - targets:
# Reload the configuration, using either of the following
curl -X POST http://127.0.0.1:9090/-/reload
systemctl reload prometheus
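A reload with a broken configuration file is rejected by Prometheus, so it can help to validate the file first. A minimal sketch, assuming promtool sits next to the prometheus binary in /usr/local/prometheus (it ships in the same tarball) and that --web.enable-lifecycle is set as in the unit file above; the helper name safe_reload is illustrative:

```shell
# Paths follow the install steps in these notes; override them if needed.
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"
PROM_CONFIG="${PROM_CONFIG:-/usr/local/prometheus/prometheus.yml}"
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# safe_reload: validate the config with promtool, and only then ask
# Prometheus to reload it. (Hypothetical helper name.)
safe_reload() {
  if ! "$PROMTOOL" check config "$PROM_CONFIG"; then
    echo "config check failed; not reloading" >&2
    return 1
  fi
  curl -fsS -X POST "$PROM_URL/-/reload"
}

# Usage:
#   safe_reload && echo "reloaded"
```

After a successful reload, the new nodes job should appear under Status -> Targets.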
Monitoring MySQL
Monitoring nginx
Installing Grafana
yum install -y https://dl.grafana.com/enterprise/release/grafana-enterprise-11.1.3-1.x86_64.rpm
systemctl start grafana-server
systemctl enable grafana-server
netstat -natp | grep :3000
http://ip:3000
Configure the data source
Connections -> Data Sources -> Add data source -> select Prometheus
HTTP -> URL: enter http://192.168.200.138:9090
Click Save & Test
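The same data source can also be declared in a provisioning file instead of through the UI. A sketch, assuming the default provisioning directory of the RPM install, /etc/grafana/provisioning/datasources; the helper name write_prom_datasource, the file name, and the data source name "Prometheus" are illustrative:

```shell
# write_prom_datasource FILE URL: write a Grafana data source
# provisioning file that points at a Prometheus server.
# (Hypothetical helper; the data source name is an assumption.)
write_prom_datasource() {
  cat > "$1" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $2
    isDefault: true
EOF
}

# Usage (as root), then restart Grafana so it reads the file:
#   write_prom_datasource /etc/grafana/provisioning/datasources/prometheus.yaml http://192.168.200.138:9090
#   systemctl restart grafana-server
```

Keeping the data source in a file makes the setup repeatable across hosts, at the cost of needing a restart to apply changes.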
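Import a Grafana dashboard template
In a browser, open https://grafana.com/grafana/dashboards , search for node exporter, pick a suitable dashboard, and click Copy ID or Download JSON.
In the Grafana UI, go to + Create -> Import, enter the dashboard ID or upload the JSON file, and click Load to import the dashboard.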