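Prometheus is an open-source monitoring system that collects, stores, and queries time series data so that systems can be monitored and analyzed.
The Prometheus architecture consists of four main components:
1. Prometheus Server: the core component. It scrapes metrics from each target, then stores, aggregates, and serves queries over that data.
2. Client Libraries: Prometheus provides client libraries for several languages so that applications can expose their own metrics.
3. Exporters: components that expose monitoring data from third-party systems in the Prometheus format. Many exporters are available, for example Node Exporter, MySQL Exporter, and HAProxy Exporter.
4. Alertmanager: the alerting component. Prometheus Server evaluates user-defined alerting rules and sends firing alerts to Alertmanager, which groups them and routes them to receivers such as email or webhooks.
Features of Prometheus
1. Flexible data model: each time series is identified by a metric name plus a set of key-value labels, so the same metric can be described along several dimensions.
2. Efficient storage and querying: Prometheus ships its own time series database, which stores and queries large volumes of metric data efficiently.
3. Visualization and alerting: Prometheus provides a web UI and an HTTP API for browsing and querying monitoring data.
4. Extensibility: the architecture is modular, so you deploy only the components you need.
5. CNCF project: as a Cloud Native Computing Foundation project, Prometheus receives broad attention and support and has an active worldwide contributor community.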
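To make the label-based data model concrete, the following minimal Python sketch renders samples in the Prometheus text exposition format, which is what exporters serve on /metrics. The helper name format_sample and the metric and label names are made up for illustration; real applications would normally use a client library instead.

```python
def format_sample(name, labels, value):
    """Render one sample in the Prometheus text exposition format."""
    def escape(v):
        # Label values escape backslash, double quote, and newline.
        return v.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    if labels:
        pairs = ",".join(
            f'{k}="{escape(str(v))}"' for k, v in sorted(labels.items())
        )
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


# Two series of the same metric, distinguished only by their labels.
# The metric name and label names are hypothetical.
print(format_sample("http_requests_total", {"method": "GET", "code": "200"}, 1027))
print(format_sample("http_requests_total", {"method": "POST", "code": "500"}, 3))
```

Each distinct combination of label values is a separate time series, which is why labels allow one metric to be sliced by method, status code, host, and so on.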
Deploying Node Exporter
1. Download
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
2. Extract and deploy
tar -xf node_exporter-1.8.2.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/node_exporter-1.8.2.linux-amd64 /usr/local/node_exporter
3. Create the start script
vim start_noder.sh
/usr/local/node_exporter/node_exporter \
--collector.textfile.directory=/usr/local/node_exporter/tmp/ \
--web.config.file=/usr/local/node_exporter/config.yml \
--web.listen-address=0.0.0.0:19100
4. Appendix: config.yml (username/password admin/123456; the same credentials are used throughout this document)
cat config.yml
basic_auth_users:
  admin: $2y$12$Y9/tZwO8FJC2I.IPt47ufOwFZRNrjSOPk0rUtOhB97cXNdvCikFDW
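The start script above points --collector.textfile.directory at /usr/local/node_exporter/tmp/. Node Exporter's textfile collector reads files ending in .prom from that directory. The sketch below shows one way a cron job could publish a custom metric there; the function name and the example metric are hypothetical, and the write-then-rename step is an assumption about good practice rather than something the original setup prescribes.

```python
import os
import tempfile


def write_textfile_metric(directory, filename, lines):
    """Atomically write metric lines to a .prom file for the textfile collector."""
    if not filename.endswith(".prom"):
        raise ValueError("the textfile collector only reads *.prom files")
    body = "\n".join(lines) + "\n"
    # Write to a temporary file in the same directory, then rename it, so
    # node_exporter never reads a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(body)
        os.chmod(tmp_path, 0o644)
        final_path = os.path.join(directory, filename)
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

For example, write_textfile_metric("/usr/local/node_exporter/tmp", "backup.prom", ["backup_last_success_timestamp_seconds 1700000000"]) would make that (hypothetical) metric appear on the next scrape.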
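Deploying proc_exporter (process monitoring)
1. Python 3 is required. Installing it through Anaconda3 is recommended; the installation itself is not covered here. Download page: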
https://www.anaconda.com/download
2. Install prometheus_client
python3 -m pip install client_python-0.13.1.tar.gz
3. Create the systemd unit so that the service starts at boot
vim /usr/lib/systemd/system/proc_exporter.service
[Unit]
Description=proc_exporter
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/python3 /usr/local/proc_exporter/proc_exporter.py -c /usr/local/proc_exporter/proc_exporter.ini
Restart=on-failure
[Install]
WantedBy=multi-user.target
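4. Edit the configuration file; add or remove business modules using the following format
vim proc_exporter.ini
## Process configuration. Changes take effect without a restart
[node_exporter]
## Process name: a keyword that uniquely identifies the process, e.g. node_exporter
name = node_exporter
## Process module: the subsystem or module the process belongs to, e.g. prometheus
moudle = prometheus
## Process owner: the developer to contact when the process misbehaves
manager = 
## Core file directory as an absolute path; leave empty to disable core file detection
directory =
## Core file name prefix
prefix = 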
5. Start
 systemctl daemon-reload
 systemctl enable proc_exporter.service
 systemctl restart proc_exporter.service
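The source of proc_exporter.py is not shown in this document, so the following is only a sketch of how an INI file in the format from step 4 could be read with Python's standard configparser. The function name load_processes and the check_core field are assumptions for illustration; the key spelling moudle is kept exactly as it appears in the configuration file.

```python
import configparser

SAMPLE = """\
## Process configuration
[node_exporter]
name = node_exporter
moudle = prometheus
manager =
directory =
prefix =
"""


def load_processes(text):
    """Return one dict per section of a proc_exporter.ini-style file."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    processes = []
    for section in parser.sections():
        item = {key: value.strip() for key, value in parser[section].items()}
        item["section"] = section
        # An empty "directory" means core file detection is disabled.
        item["check_core"] = bool(item.get("directory"))
        processes.append(item)
    return processes


for proc in load_processes(SAMPLE):
    print(proc["name"], proc["moudle"], proc["check_core"])
```

Lines starting with # are treated as comments by configparser, so the ## annotations in the file do not interfere with parsing.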
Deploying Alertmanager
1. Download
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
2. Extract and deploy
tar -xf alertmanager-0.27.0.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/alertmanager-0.27.0.linux-amd64 /usr/local/alertmanager
3. Create the systemd unit
vim /usr/lib/systemd/system/alertmanager.service
[Unit]
Description=alertmanager server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
Type=simple
User=root
Group=root
Restart=on-abnormal
ExecStart=/usr/local/alertmanager/alertmanager \
  --config.file=/usr/local/alertmanager/alertmanager.yml \
  --web.listen-address=0.0.0.0:19093 \
  --web.config.file=/usr/local/alertmanager/config.yml
[Install]
WantedBy=multi-user.target
4. Edit the configuration file
vim alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.mail.139.com:25'     # SMTP server
  smtp_from: 'hly12599-alarm@139.com'             # sender address
  smtp_auth_username: 'hly12599-alarm@139.com'    # SMTP login (mailbox address)
  smtp_auth_password: '23bb4dee88805e0fb400'          # SMTP password
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 10s    
  group_interval: 5m
  repeat_interval: 3m
  receiver: 'alert-receiver'
  routes:
  - receiver: 'data'
    continue: true
  
templates:
  - './templates/*.tmpl'
receivers:
- name: 'data'
  webhook_configs:
  - url: 'http://192.168.10.139:5000/alertinfo'
- name: 'alert-receiver'
  email_configs:
  - to: 15901283579@139.com
    send_resolved: true
inhibit_rules:
  - source_match:
      severity: 'warning'
    target_match:
      severity: 'warning'
    equal: ['job', 'instance','severity']
#### Check the configuration: ./amtool check-config alertmanager.yml
5. Start
 systemctl daemon-reload
 systemctl enable alertmanager.service
 systemctl restart alertmanager.service
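The 'data' receiver in alertmanager.yml posts alerts to http://192.168.10.139:5000/alertinfo, but the service behind that URL is not described here. As a rough sketch only, a receiver could look like the following, using Python's standard http.server. The names summarize, AlertHandler, and serve are hypothetical; the payload fields used (alerts, status, labels) are part of Alertmanager's webhook JSON.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload):
    """Turn an Alertmanager webhook payload into one text line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(
            "[{status}] {name} on {instance} (severity={severity})".format(
                status=alert.get("status", "unknown"),
                name=labels.get("alertname", "?"),
                instance=labels.get("instance", "?"),
                severity=labels.get("severity", "?"),
            )
        )
    return lines


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/alertinfo":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except ValueError:
            self.send_error(400, "invalid JSON")
            return
        for line in summarize(payload):
            print(line)
        # A 2xx response tells Alertmanager the notification was delivered.
        self.send_response(200)
        self.end_headers()


def serve(port=5000):
    """Block forever, serving POST /alertinfo on the given port."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Calling serve() on the host 192.168.10.139 would accept the notifications; a production receiver would forward them to a ticketing or chat system instead of printing them.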
Deploying Pushgateway
1. Download
wget https://github.com/prometheus/pushgateway/releases/download/v1.9.0/pushgateway-1.9.0.linux-amd64.tar.gz
2. Extract and deploy
tar -xf pushgateway-1.9.0.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/pushgateway-1.9.0.linux-amd64 /usr/local/pushgateway
3. Create the systemd unit
vim /usr/lib/systemd/system/pushgateway.service
[Unit]
Description=pushgateway
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=root
Group=root
Restart=always
ExecStart=/usr/local/pushgateway/pushgateway \
  --web.listen-address=0.0.0.0:19091 \
  --web.config.file=/usr/local/pushgateway/config.yml
[Install]
WantedBy=multi-user.target
4. Start
 systemctl daemon-reload
 systemctl enable pushgateway.service
 systemctl restart pushgateway.service
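Once Pushgateway is running, short-lived jobs push their metrics to it over HTTP. The sketch below builds such a request with the standard library only, assuming the basic-auth credentials from config.yml and the listen port 19091 configured above. The function name, the job name, and the metric are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_push_request(gateway, job, metrics, username, password, instance=None):
    """Build (but do not send) a PUT request that replaces a job's metrics."""
    path = "/metrics/job/" + urllib.parse.quote(job, safe="")
    if instance:
        path += "/instance/" + urllib.parse.quote(instance, safe="")
    # The body uses the text exposition format and must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        gateway.rstrip("/") + path,
        data=body.encode(),
        method="PUT",
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "text/plain; version=0.0.4",
        },
    )


req = build_push_request(
    "http://192.168.10.139:19091",
    job="nightly_backup",                       # hypothetical job name
    metrics={"backup_duration_seconds": 42.5},  # hypothetical metric
    username="admin",
    password="123456",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

PUT replaces every metric in the job's group, while POST would replace only metrics with the same names; which one fits depends on how the job reports its results.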
Deploying Prometheus Server
1. Download
wget https://github.com/prometheus/prometheus/releases/download/v2.53.2/prometheus-2.53.2.linux-amd64.tar.gz
2. Extract and deploy
tar -xf prometheus-2.53.2.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/prometheus-2.53.2.linux-amd64 /usr/local/prometheus
3. Create the systemd unit
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
Type=simple
User=root
Group=root
Restart=on-abnormal
ExecStart=/usr/local/prometheus/prometheus \
  --config.file=/usr/local/prometheus/prometheus.yml \
  --web.listen-address=0.0.0.0:19090 \
  --web.config.file=/usr/local/prometheus/config.yml \
  --storage.tsdb.path=/usr/local/prometheus/data \
  --storage.tsdb.retention.time=180d \
  --web.console.templates=/usr/local/prometheus/consoles \
  --web.console.libraries=/usr/local/prometheus/console_libraries \
  --web.max-connections=512 \
  --web.enable-lifecycle
[Install]
WantedBy=multi-user.target
4. Start
 systemctl daemon-reload
 systemctl enable prometheus.service
 systemctl restart prometheus.service
5. Edit the configuration file to add host monitoring and process monitoring
vim prometheus.yml
global:
  scrape_interval: 60s # Scrape targets every 60 seconds. The default is every 1 minute.
  evaluation_interval: 30s # Evaluate rules every 30 seconds. The default is every 1 minute.
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
    - "./rules/*.yml"
scrape_configs:
  - job_name: "node_host"
    basic_auth:
      username: admin
      password: 123456
    scrape_interval: 1m
    static_configs:
      - targets: ["192.168.10.139:19100"]
      
  - job_name: "proc_host"
    scrape_interval: 1m
    scrape_timeout: 1m
    metrics_path: /metrics
    static_configs:
      - targets: ["192.168.10.140:19001"]
      
  - job_name: "alertmanager"
    basic_auth:
      username: admin
      password: 123456
    static_configs:
      - targets: ["192.168.10.139:19093"]
      
  - job_name: "pushgateway_server"
    basic_auth:
      username: admin
      password: 123456
    honor_labels: true
    scrape_interval: 1m
    scrape_timeout: 1m
    static_configs:
      - targets: ["192.168.10.139:19091"]
6. Reload the configuration
curl -X POST -u admin:123456 http://192.168.10.139:19090/-/reload
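After a reload it is useful to confirm that the configured targets are actually being scraped. The following sketch queries the built-in up metric through the HTTP API endpoint /api/v1/query, assuming Prometheus listens on port 19090 as set in the systemd unit and uses the same basic-auth credentials. The function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_query_request(base_url, promql, username, password):
    """Build a GET request for the instant-query endpoint /api/v1/query."""
    url = (
        base_url.rstrip("/")
        + "/api/v1/query?"
        + urllib.parse.urlencode({"query": promql})
    )
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def targets_down(response_text):
    """Given the JSON body of an `up` query, list instances whose value is 0."""
    doc = json.loads(response_text)
    if doc.get("status") != "success":
        raise RuntimeError("query failed: %s" % doc.get("error"))
    return [
        series["metric"].get("instance", "?")
        for series in doc["data"]["result"]
        if float(series["value"][1]) == 0
    ]


req = build_query_request("http://192.168.10.139:19090", "up", "admin", "123456")
print(req.full_url)
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(targets_down(resp.read().decode()))
```

An empty list from targets_down means every target in prometheus.yml answered its last scrape.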
Deploying Grafana
1. Download
wget https://dl.grafana.com/oss/release/grafana-10.3.7-1.x86_64.rpm
2. Install
rpm -Uvh grafana-10.3.7-1.x86_64.rpm
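3. Change the HTTP port in the configuration file, then start the service. http_port belongs to the [server] section, so edit the existing line instead of appending to the end of the file
sed -i 's/^;\?http_port = .*/http_port = 13000/' /etc/grafana/grafana.ini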
systemctl daemon-reload
systemctl enable grafana-server.service
systemctl restart grafana-server.service
4. Open the Grafana web UI in a browser on port 13000
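Besides opening the UI in a browser, Grafana's /api/health endpoint can be used for a scripted check. The sketch below only builds the URL and interprets the response body; the helper names and the host name grafana-host are placeholders, since this document does not state which machine runs Grafana.

```python
import json


def health_url(host, port=13000):
    """URL of Grafana's health endpoint on the port configured above."""
    return f"http://{host}:{port}/api/health"


def grafana_is_healthy(body):
    """Interpret the JSON returned by Grafana's /api/health endpoint."""
    try:
        doc = json.loads(body)
    except ValueError:
        return False
    return isinstance(doc, dict) and doc.get("database") == "ok"


print(health_url("grafana-host"))  # "grafana-host" is a placeholder
# import urllib.request
# with urllib.request.urlopen(health_url("grafana-host"), timeout=5) as resp:
#     print(grafana_is_healthy(resp.read().decode()))
```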