References
https://prometheus.io/docs/instrumenting/writing_exporters/
https://grafana.com/docs/grafana/latest/getting-started/build-first-dashboard/
https://grafana.com/docs/grafana/latest/alerting/alerting-rules/create-grafana-managed-rule/
Installation
Download the packages
Prometheus: https://prometheus.io/download/
Grafana: https://grafana.com/grafana/download
Downloaded files: prometheus-2.41.0.windows-amd64.zip, grafana-9.3.2.windows-amd64.zip
Prometheus exporters
windows_exporter: https://github.com/prometheus-community/windows_exporter
blackbox_exporter: https://github.com/prometheus/blackbox_exporter
WinSW service wrapper
Config file:
Configuration
Prometheus
prometheus.yml
```yaml
global:
  scrape_interval: 60s
  evaluation_interval: 60s

rule_files:

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  # Host metrics from windows_exporter (port 9182)
  - job_name: 'ServerNodeStatus_79'
    static_configs:
      - targets: ['10.170.10.79:9182']

  - job_name: 'ServerNodeStatus_78'
    static_configs:
      - targets: ['10.170.10.78:9182']

  - job_name: 'ServerNodeStatus_82[DB]'
    static_configs:
      - targets: ['10.170.10.82:9182']

  # Probe jobs: targets come from files and are probed through
  # blackbox_exporter at 10.170.10.79:9115
  - job_name: "AppStatus_78"
    metrics_path: /probe
    params:
      module: [http_2xx]
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "D:/A_Prometheus/Probe_*_78.yml"
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.170.10.79:9115

  - job_name: "AppStatus_79"
    metrics_path: /probe
    params:
      module: [http_2xx]
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "D:/A_Prometheus/Probe_*_79.yml"
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.170.10.79:9115

  - job_name: "ServiceStatus_LIS"
    metrics_path: /probe
    params:
      module: [http_2xx]
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "D:/A_Prometheus/Probe_ServiceStatus_LIS.yml"
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.170.10.79:9115

  - job_name: "ServiceStatus_LIS_POST"
    metrics_path: /probe
    params:
      module: [http_post_2xx]
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "D:/A_Prometheus/Probe_ServiceStatus_LIS_POST.yml"
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.170.10.79:9115

  - job_name: "ServiceStatus_COLL"
    metrics_path: /probe
    params:
      module: [http_2xx]
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "D:/A_Prometheus/Probe_ServiceStatus_COLL.yml"
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.170.10.79:9115

  - job_name: "DBStatus_LIS"
    metrics_path: /probe
    params:
      module: [tcp_connect]
    file_sd_configs:
      - refresh_interval: 1m
        files:
          - "D:/A_Prometheus/Probe_DBStatus_LIS.yml"
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 10.170.10.79:9115
```
windows_exporter
config.yml
```yaml
collectors:
  enabled: "[defaults],memory,iis,process"
collector:
  service:
    services-where: "Name='nginx' OR Name='Redis'"
  iis:
    site-whitelist: "Plat|PatLis"
    app-whitelist: "Plat|PatLis"
  process:
    whitelist: "(dbprobe-local|afw|Enjoyor|enjoyor|PushDataWorkService|Redis|nginx).*"
log:
  level: warn
telemetry:
  addr: ":9182"
  path: /metrics
  max-requests: 5
```
blackbox_exporter
blackbox.yml
```yaml
modules:
  http_2xx:
    prober: http
    timeout: 60s
    http:
      method: GET
      preferred_ip_protocol: "ip4"
      no_follow_redirects: false
      tls_config:
        insecure_skip_verify: true
  http_post_2xx:
    prober: http
    http:
      method: POST
      preferred_ip_protocol: "ip4"
      no_follow_redirects: false
      tls_config:
        insecure_skip_verify: true
      headers:
        Content-Type: application/json
      body: '{}'
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  grpc:
    prober: grpc
    grpc:
      tls: true
      preferred_ip_protocol: "ip4"
  grpc_plain:
    prober: grpc
    grpc:
      tls: false
      service: "service1"
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^SSH-2.0-"
        - send: "SSH-2.0-blackbox-ssh-check"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
  icmp_ttl5:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
      ttl: 5
```
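The `query_response` lists in the TCP modules (`ssh_banner`, `irc_banner`) are easier to follow with a small model. The Python sketch below approximates how such a list could be evaluated: read lines until `expect` matches, then send the `send` text with `${1}` expanded from the capture group. It is not blackbox_exporter's code; `run_query_response` is a hypothetical helper, and Python's `re` syntax differs from Go's regexp in places.

```python
import re


def run_query_response(steps, incoming):
    """Return the lines a prober would send, or None if an expect never matches.

    steps    -- list of dicts with optional "expect" and "send" keys
    incoming -- iterable of lines received from the server (simulated here)
    """
    lines = iter(incoming)
    sent = []
    for step in steps:
        match = None
        if "expect" in step:
            pattern = re.compile(step["expect"])
            # Consume lines until one matches the expectation.
            match = next((m for m in (pattern.search(l) for l in lines) if m), None)
            if match is None:
                return None  # probe would fail here
        if "send" in step:
            text = step["send"]
            if match is not None:
                # Expand ${N} references to groups captured by this step's expect.
                text = re.sub(r"\$\{(\d+)\}",
                              lambda ref: match.group(int(ref.group(1))), text)
            sent.append(text)
    return sent


irc_steps = [
    {"send": "NICK prober"},
    {"send": "USER prober prober prober :prober"},
    {"expect": "PING :([^ ]+)", "send": "PONG ${1}"},
    {"expect": "^:[^ ]+ 001"},
]
```

Feeding `irc_steps` a simulated server that sends a `PING :token` line followed by a `001` welcome line yields the two registration lines plus `PONG token`; if the welcome line never arrives, the function returns `None`, which corresponds to a failed probe.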
Probe types
Web application:
Probe_AppStatus_78.yml
```yaml
- targets:
    - https://10.170.10.78:8466
```
Database:
Probe_DBStatus_LIS.yml
```yaml
- targets:
    - 10.110.10.110:1433
```
Web service:
Probe_ServiceStatus_COLL.yml
```yaml
- targets:
    - http://192.168.33.99:8060/WebService/WebServce.asmx
```
HTTP POST Web API:
Probe_ServiceStatus_LIS_POST.yml
```yaml
- targets:
    - http://192.134.24.120:8902/Interface/GetIndex
```
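How a target from these files ends up at the blackbox_exporter is determined by the three `relabel_configs` rules in prometheus.yml. The Python sketch below walks through those rules for a single target and builds the resulting scrape URL. It only mimics the label rewriting for illustration; `relabel_for_blackbox` and `scrape_url` are hypothetical helper names, and plain `http` to the exporter is assumed because the jobs set no `scheme`.

```python
from urllib.parse import urlencode

# Taken from the `replacement` value in the probe jobs of prometheus.yml.
BLACKBOX_ADDR = "10.170.10.79:9115"


def relabel_for_blackbox(target: str, module: str) -> dict:
    """Apply the three relabel rules to one discovered target."""
    # A file_sd target starts out as __address__; `params` become __param_* labels.
    labels = {"__address__": target, "__param_module": module}
    # Rule 1: copy the discovered address into the `target` URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: keep the probed address visible as the `instance` label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: send the actual scrape to the blackbox_exporter.
    labels["__address__"] = BLACKBOX_ADDR
    return labels


def scrape_url(labels: dict, metrics_path: str = "/probe") -> str:
    """Build the URL Prometheus would request for the relabeled target."""
    query = urlencode({
        "module": labels["__param_module"],
        "target": labels["__param_target"],
    })
    return f"http://{labels['__address__']}{metrics_path}?{query}"


labels = relabel_for_blackbox("https://10.170.10.78:8466", "http_2xx")
print(scrape_url(labels))
```

The `instance` label keeps the probed address, so dashboards and alerts show the application URL rather than the exporter's address.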
Grafana
Configure the access protocol and port
custom.ini
```ini
[server]
protocol = https
http_port = 7070
```
Configure the default Grafana theme
custom.ini (in Grafana 9.x these keys belong to the [users] section)
```ini
default_theme = light
default_locale = zh-CN
```
Grafana SSL configuration
Grafana config:
custom.ini
```ini
[server]
protocol = https
; ......
cert_file = D:\A_Prometheus\grafana-oss\cert\Grafana.crt
cert_key = D:\A_Prometheus\grafana-oss\cert\Grafana.key
```
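A quick sanity check can catch a missing certificate setting before Grafana is restarted. The sketch below parses the ini text with Python's `configparser` and reports which of `cert_file` and `cert_key` are absent when `protocol = https`. `check_https_settings` is a hypothetical helper, and the rule it enforces is an assumption based on the settings shown above, not Grafana's own validation.

```python
import configparser


def check_https_settings(ini_text: str) -> list:
    """Return a list of problems found in the [server] section."""
    parser = configparser.ConfigParser(interpolation=None)
    # Grafana allows keys before the first section; configparser does not,
    # so a dummy section header is prepended.
    parser.read_string("[__top__]\n" + ini_text)
    server = parser["server"] if parser.has_section("server") else {}
    problems = []
    if server.get("protocol", "http") == "https":
        for key in ("cert_file", "cert_key"):
            if not server.get(key):
                problems.append(f"[server] {key} is not set but protocol = https")
    return problems


sample = r"""
[server]
protocol = https
http_port = 7070
cert_file = D:\A_Prometheus\grafana-oss\cert\Grafana.crt
cert_key = D:\A_Prometheus\grafana-oss\cert\Grafana.key
"""
print(check_https_settings(sample))
```

The check only looks at whether the keys are present; it does not verify that the files exist or that the key matches the certificate.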
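If the Grafana certificate was signed with a self-built root CA, the root certificate's CRT content must be merged into Grafana's CRT so that the file contains the full chain. Append with `>>`; a single `>` would overwrite the server certificate with the root certificate.
```shell
cat ROOTCA.crt >> Grafana.crt
```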
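Since `cat` is not available in every Windows shell, the same merge can be done with a few lines of Python. This is a minimal sketch assuming both files are PEM text; `build_chain` and `append_root_ca` are hypothetical helper names. It keeps the server certificate first and the root CA last, and does nothing if the root certificate is already present.

```python
from pathlib import Path


def build_chain(cert_pem: str, root_pem: str) -> str:
    """Return cert_pem followed by root_pem, unless the root is already included."""
    if root_pem.strip() in cert_pem:
        return cert_pem  # already merged, leave untouched
    return cert_pem.strip() + "\n" + root_pem.strip() + "\n"


def append_root_ca(cert_path: str, root_path: str) -> None:
    """Rewrite cert_path so that it ends with the root CA certificate."""
    cert_file = Path(cert_path)
    merged = build_chain(cert_file.read_text(), Path(root_path).read_text())
    cert_file.write_text(merged)


# Example call, using the file names from the command above:
# append_root_ca("Grafana.crt", "ROOTCA.crt")
```

Running it twice is harmless, which is not true of the shell append.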
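Grafana notification template configuration
Custom Chinese-language alert templates
The label strings in the templates are intentionally Chinese. For reference: 告警状态 = alert status, 告警类型 = alert type, 告警级别 = severity, 告警描述 = description, 告警详情 = details, 故障时间 = failure time, 恢复时间 = recovery time, 实例信息 = instance, 查看告警面板 = view alert dashboard, 更多信息 = more information, 暂停告警通知 = silence alert notifications, 警报摘要 = alert summary, 警报触发中 = firing, 已恢复 = resolved, 异常告警 = fault alerts, 异常恢复 = fault recoveries. In "apiconvert", "[采]xxxxx医院" is a display name for a (redacted) hospital's collection service.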
```
{{ define "apiconvert" }}
{{ reReplaceAll "http://xxxxxxxx:8060/WebService/WebServceQZ.asmx" "[采]xxxxx医院【http://xxxxxxxx:8060/WebService/WebServceQZ.asmx】" .
| reReplaceAll "http://xxxxxxxxx:8060/WebService/WebServceQZ.asmx " "[采]xxxx医院【http://xxxxxxxx:8060/WebService/WebServceQZ.asmx】"
}}
{{ end }}
```
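The "apiconvert" template chains `reReplaceAll` calls to turn a probed URL into a readable name. The Python sketch below shows the same idea with `re.sub`, as an illustration only: the `FRIENDLY_NAMES` table, its host name, and the display name are hypothetical placeholders. One detail worth noting is that `reReplaceAll` treats its first argument as a regular expression, so an unescaped `.` in a URL matches any character; the sketch escapes the URL to match it literally.

```python
import re

# Hypothetical endpoint -> display-name table (placeholder values).
FRIENDLY_NAMES = {
    "http://host-a:8060/WebService/WebServceQZ.asmx": "[Coll] Hospital A",
}


def api_convert(instance: str) -> str:
    """Replace known endpoint URLs with 'name【url】', leave others unchanged."""
    for url, name in FRIENDLY_NAMES.items():
        replacement = f"{name}【{url}】"
        # re.escape keeps '.' and other metacharacters literal; the lambda
        # returns the replacement verbatim, without backslash processing.
        instance = re.sub(re.escape(url),
                          lambda m, text=replacement: text,
                          instance)
    return instance


print(api_convert("http://host-a:8060/WebService/WebServceQZ.asmx"))
```

An instance that is not in the table passes through unchanged, which matches how the template behaves when no pattern matches.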
```
{{ define "custom_alert" }}
**告警状态:** {{.Status }}
**告警类型:** {{ .Labels.alertname }}
**告警级别:** {{ .Labels.severity }}
**告警描述:**
{{ .Annotations.description }}
**告警详情:**
{{ .Annotations.summary }}
**故障时间:** {{ (.StartsAt).Format "2006-01-02 15:04:05" }}
{{ if eq .Status "resolved" }}
**恢复时间:** {{ (.EndsAt).Format "2006-01-02 15:04:05" }}
{{end }}
{{if gt (len .Labels.instance) 0 }}
**实例信息:**
{{/* {{ .Labels.instance }} */}}
{{ template "apiconvert" .Labels.instance}}
{{end }}
{{ if gt (len .DashboardURL ) 0 }}
**查看告警面板:** {{ .DashboardURL }}
{{ end }}
{{/** if gt (len .ValueString ) 0 **/}}
{{/**ValueString: {{ .ValueString }}**/}}
{{/** end **/}}
**更多信息:**
{{ if gt (len .SilenceURL ) 0 }}
暂停告警通知: {{ .SilenceURL }}
{{ end }}
{{ end }}
```
```
{{ define "custom_message" }}
警报摘要:
{{ if gt (len .Alerts.Firing) 0 }}
{{ len .Alerts.Firing }} firing【警报触发中】:
==========异常告警==========
{{ range .Alerts.Firing}}{{ template "custom_alert" .}}{{end }}
============END============
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
{{ len .Alerts.Resolved }} resolved【已恢复】:
==========异常恢复==========
{{ range .Alerts.Resolved}}{{ template "custom_alert" .}}{{ end }}
============END============
{{ end }}
{{ end }}
```
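To preview the structure that "custom_message" and "custom_alert" produce without triggering a real alert, the logic can be mirrored in a few lines of Python. This is a simplified sketch, not Grafana's rendering: the alert dictionaries are a made-up shape, the labels are in English, and only a subset of fields is shown. Go's reference layout `"2006-01-02 15:04:05"` corresponds to the `strftime` pattern used here.

```python
from datetime import datetime

# strftime equivalent of the Go layout "2006-01-02 15:04:05".
LAYOUT = "%Y-%m-%d %H:%M:%S"


def render_alert(alert: dict) -> str:
    """Render one alert, adding the end time only when it is resolved."""
    lines = [
        f"Status: {alert['status']}",
        f"Alert: {alert['labels'].get('alertname', '')}",
        f"Started: {alert['starts_at'].strftime(LAYOUT)}",
    ]
    if alert["status"] == "resolved":
        lines.append(f"Ended: {alert['ends_at'].strftime(LAYOUT)}")
    return "\n".join(lines)


def render_message(alerts: list) -> str:
    """Group alerts into firing and resolved sections, skipping empty groups."""
    out = []
    for status in ("firing", "resolved"):
        group = [a for a in alerts if a["status"] == status]
        if group:
            out.append(f"{len(group)} {status}:")
            out.extend(render_alert(a) for a in group)
    return "\n".join(out)


example_alerts = [
    {"status": "firing", "labels": {"alertname": "DBDown"},
     "starts_at": datetime(2023, 1, 5, 8, 30, 0)},
    {"status": "resolved", "labels": {"alertname": "ApiDown"},
     "starts_at": datetime(2023, 1, 5, 7, 0, 0),
     "ends_at": datetime(2023, 1, 5, 7, 45, 10)},
]
print(render_message(example_alerts))
```

As in the templates, a section header appears only when its group is non-empty, and the recovery time is printed only for resolved alerts.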
Grafana dashboards
Repository: https://github.com/YuanjianZhang/Grafana_Deploy