117. How to test Alertmanager in Rancher Monitoring
Procedure

This guide demonstrates how to test Alertmanager and PrometheusRule configuration, to validate that Alertmanager sends alerts successfully.

To keep the test self-contained, a webhook receiver is configured in Alertmanager. A webhook-receiver pod is deployed to receive the webhook alert requests and print them to stdout, so that they are visible in the pod logs for verification. All of these resources are created in the cattle-monitoring-system namespace.

Navigate to a Rancher-managed cluster with rancher-monitoring installed.

Apply the following YAMLs:

ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: webhook-receiver-configmap-script
  namespace: cattle-monitoring-system
data:
  # ... (script content is truncated in the source)
```

Pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webhook-receiver
  namespace: cattle-monitoring-system
  labels:
    app: webhook-receiver
spec:
  containers:
  - name: receiver-container
    image: rancherlabs/swiss-army-knife:latest
    command: [/bin/bash, /script/...
# (remainder of the Pod manifest is truncated in the source)
```

Service:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: webhook-receiver-service
  namespace: cattle-monitoring-system
spec:
  selector:
    app: webhook-receiver
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  type: ClusterIP
```

Ensure that the pod is up and tail its log. You should see a couple of lines stating that the netcat listener is ready and waiting for a connection.
The Alertmanager alert configured below will be visible in these logs.

Apply the following AlertmanagerConfig to configure Alertmanager to send any alert with the label severity=critical to the webhook-receiver pod (see the Alertmanager configuration documentation for the available options). Note that the URL used is that of the Service created above:

```yaml
apiVersion: # ... (remainder of the manifest is truncated in the source)
```

Create a PrometheusRule with an alert expression. This example uses vector(1) as the expression, so its value is always 1 and the alert is triggered continuously:

```yaml
apiVersion: # ... (remainder of the manifest is truncated in the source)
```

Wait for the alert to appear in the Alertmanager Alerts UI.

Check the log of the webhook-receiver pod and observe that the test-rule alert is received, similar to the following:

```
Starting nc listener with 1 second timeout...
Waiting for connection...
--- RECEIVED FULL REQUEST ---
POST / HTTP/1.1
Host: webhook-receiver-service
User-Agent: Alertmanager/0.28.1
Content-Length: 1214
Content-Type: application/json

{"receiver":"cattle-monitoring-system/webhook-receiver-am-config/webhook-receiver-pod","status":"firing","alerts":[{"status":"firing","labels":{"alertname":"test-alert","namespace":"cattle-monitoring-system","prometheus":"cattle-monitoring-system/rancher-monitoring-prometheus","severity":"critical"},"annotations":{},"startsAt":"2025-10-14T09:04:11.437Z","endsAt":"0001-01-01T00:00:00Z","generatorURL":... (truncated in the source)
```

Following this method, it is possible to test Alertmanager and PrometheusRule configurations without a third-party app or an external receiver. This is useful for confirming whether alerts arrive as expected or are not being sent at all.

If you are struggling to apply an AlertmanagerConfig correctly, check the rancher-monitoring-operator pod logs, to confirm that the syntax is correct and was accepted; the Alertmanager pod logs; and the value of the PrometheusRule expression in the Prometheus Query UI, to confirm whether the alert should currently trigger.

Environment

A Kubernetes cluster managed by Rancher v2.6 with rancher-monitoring installed.

Visit the Rancher-K8S solutions blogger (enterprise partner): https://blog.csdn.net/lidw2009
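When the webhook-receiver pod log is long, it can help to check the received request body programmatically instead of reading it by eye. The following is a minimal sketch, not part of the original procedure: the function name `find_firing_alerts` and the `SAMPLE_BODY` constant are hypothetical, and the sample body only contains the fields visible in the log excerpt in the procedure (the `generatorURL` field is left out because it is truncated there). It assumes the JSON body has already been copied out of the pod log as a string.

```python
import json


def find_firing_alerts(payload_text, alertname, severity):
    """Return firing alerts in a webhook body that match a name and severity."""
    payload = json.loads(payload_text)
    matches = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        if (
            alert.get("status") == "firing"
            and labels.get("alertname") == alertname
            and labels.get("severity") == severity
        ):
            matches.append(alert)
    return matches


# Hypothetical sample body, reduced to the fields shown in the pod log excerpt.
# generatorURL is omitted because it is truncated in the article.
SAMPLE_BODY = json.dumps({
    "receiver": "cattle-monitoring-system/webhook-receiver-am-config/webhook-receiver-pod",
    "status": "firing",
    "alerts": [{
        "status": "firing",
        "labels": {
            "alertname": "test-alert",
            "namespace": "cattle-monitoring-system",
            "prometheus": "cattle-monitoring-system/rancher-monitoring-prometheus",
            "severity": "critical",
        },
        "annotations": {},
        "startsAt": "2025-10-14T09:04:11.437Z",
        "endsAt": "0001-01-01T00:00:00Z",
    }],
})

if __name__ == "__main__":
    found = find_firing_alerts(SAMPLE_BODY, "test-alert", "critical")
    print(f"matching firing alerts: {len(found)}")
```

An empty result for the expected alert name and severity suggests looking at the AlertmanagerConfig matcher or the PrometheusRule labels, in line with the troubleshooting hints in the procedure.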