给非法公司做网站维护百度权重批量查询
1.问题背景
监控告警某套pg的两个备库延迟大于300s
2.pg备库延迟告警策略
通过postgres_exporter(version 0.13.2)监控生产pg,延迟告警策略如下:
pg_replication_lag_seconds{instance!~"host_ip_18801"} > 300 and ON(instance) pg_replication_is_replica == 1
3.问题分析
1)查看备库上pg_replication_lag_seconds指标值(确实大于告警值300s)
curl http://127.0.0.1:19187/metrics|grep pg_replication_lag_seconds% Total % Received % Xferd Average Speed Time Time Time CurrentDload Upload Total Spent Left Speed0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0# HELP pg_replication_lag_seconds Replication lag behind master in seconds
# TYPE pg_replication_lag_seconds gauge
pg_replication_lag_seconds 2059.643283
100 88926 0 88926 0 0 3503k 0 --:--:-- --:--:-- --:--:-- 3618k
2)查看主备库日志
pg主备库日志无任何异常
3)检查主库当前lsn及备库应用到的lsn
多次查看主库当前lsn不变,从库应用lsn无延迟,说明主库无数据写入
postgres=# SELECT * FROM pg_stat_replication \gx
-[ RECORD 1 ]----+------------------------------
pid | 26226
usesysid | 16384
usename | repl
application_name | xxx.xx.xx.59_18801
client_addr | xxx.xx.xx.59
client_hostname |
client_port | 60049
backend_start | 2023-08-04 09:39:42.059505+08
backend_xmin |
state | streaming
sent_lsn | 3633/F10891E8
write_lsn | 3633/F10891E8
flush_lsn | 3633/F10891E8
replay_lsn | 3633/F10891E8
write_lag |
flush_lag |
replay_lag |
sync_priority | 1
sync_state | quorum
reply_time | 2024-05-23 16:16:03.274172+08
-[ RECORD 2 ]----+------------------------------
pid | 15343
usesysid | 16384
usename | repl
application_name | xxx_xx_xx_65_18801
client_addr | xxx_xx_xx_65
client_hostname |
client_port | 53700
backend_start | 2023-08-05 04:30:26.862464+08
backend_xmin |
state | streaming
sent_lsn | 3633/F10891E8
write_lsn | 3633/F10891E8
flush_lsn | 3633/F10891E8
replay_lsn | 3633/F10891E8
write_lag |
flush_lag |
replay_lag |
sync_priority | 1
sync_state | quorum
reply_time | 2024-05-23 16:16:03.273835+08postgres=#
postgres=#
postgres=# select pg_current_wal_lsn();pg_current_wal_lsn
--------------------3633/F10891E8
(1 row)
4.查看对应版本postgres_exporter源码中如何监控从库延迟
查看对应版本postgres_exporter源码中如何监控从库延迟
https://github.com/prometheus-community/postgres_exporter/commit/0b6d9860abb2340412a15be1fc2b4d25e154b8c3
5.问题解决
升级postgres_exporter到0.14或以上版本