PostgreSQL VACUUM 之深入浅出 (二)

PostgreSQL VACUUM 之深入浅出 (二)AUTOVACUUM AUTOVACUUM 简介 PostgreSQL 提供了 AUTOVACUUM 的机制。 autovacuum 不仅会自动进行 VACUUM,也会自动进行 ANALYZE,以分析

PostgreSQL VACUUM 之深入浅出 (二)

AUTOVACUUM

AUTOVACUUM 简介

PostgreSQL 提供了 AUTOVACUUM 的机制。

autovacuum 不仅会自动进行 VACUUM,也会自动进行 ANALYZE,以分析统计信息用于执行计划。

在 postgresql.conf 中,autovacuum 参数已默认打开。

autovacuum = on

autovacuum 打开后,会有一个 autovacuum launcher 进程

$ ps -ef|grep postgres|grep autovacuum|grep -v grep
postgres 28398 28392  0 Nov13 ?        00:00:19 postgres: autovacuum launcher  

pg_stat_activity 也可以看到 backend_type 为 autovacuum launcher 的连接:

psql -d alvindb -U postgres
alvindb=# x
Expanded display is on.
alvindb=# SELECT * FROM pg_stat_activity WHERE backend_type = "autovacuum launcher";
-[ RECORD 1 ]----+------------------------------
datid            | 
datname          | 
pid              | 28398
usesysid         | 
usename          | 
application_name | 
client_addr      | 
client_hostname  | 
client_port      | 
backend_start    | 2021-11-13 23:18:00.406618+08
xact_start       | 
query_start      | 
state_change     | 
wait_event_type  | Activity
wait_event       | AutoVacuumMain
state            | 
backend_xid      | 
backend_xmin     | 
query            | 
backend_type     | autovacuum launcher

那么 AUTOVACUUM 多久运行一次?

autovacuum launcher 会每隔 autovacuum_naptime ,创建 autovacuum worker,检查是否需要做 autovacuum。

psql -d alvindb -U postgres
alvindb=# SELECT * FROM pg_stat_activity WHERE backend_type = "autovacuum worker";
-[ RECORD 1 ]----+------------------------------
datid            | 13220
datname          | postgres
pid              | 32457
usesysid         | 
usename          | 
application_name | 
client_addr      | 
client_hostname  | 
client_port      | 
backend_start    | 2021-11-06 23:32:53.880281+08
xact_start       | 
query_start      | 
state_change     | 
wait_event_type  | 
wait_event       | 
state            | 
backend_xid      | 
backend_xmin     | 
query            | 
backend_type     | autovacuum worker

autovacuum_naptime 默认为 1min:

#autovacuum_naptime = 1min		# time between autovacuum runs

autovacuum 又是根据什么标准决定是否进行 VACUUM 和 ANALYZE 呢?

当 autovacuum worker 检查到,

dead tuples 大于 vacuum threshold 时,会自动进行 VACUUM。

vacuum threshold 公式如下:

vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples

增删改的行数据大于 analyze threshold 时,会自动进行 ANALYZE。

analyze threshold 公式如下:

analyze threshold = analyze base threshold + analyze scale factor * number of tuples

对应 postgresql.conf 中相关参数如下:

#autovacuum_vacuum_threshold = 50       # min number of row updates before vacuum
#autovacuum_analyze_threshold = 50      # min number of row updates before analyze
#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze

dead tuples 为 pg_stat_user_tables.n_dead_tup(Estimated number of dead rows)

alvindb=> SELECT * FROM pg_stat_user_tables WHERE schemaname = "alvin" AND relname = "tb_test_vacuum";
-[ RECORD 1 ]-------+---------------
relid               | 37409
schemaname          | alvin
relname             | tb_test_vacuum
seq_scan            | 2
seq_tup_read        | 0
idx_scan            | 0
idx_tup_fetch       | 0
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         | 
last_autovacuum     | 
last_analyze        | 
last_autoanalyze    | 
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0

那么 number of tuples 是哪个列的值?是 pg_stat_user_tables.n_live_tup(Estimate number of live rows)?还是实际的 count 值?

其实是 pg_class.reltuples (Estimate number of live rows in the table used by the planner)。

alvindb=> SELECT u.schemaname,u.relname,c.reltuples,u.n_live_tup,u.n_mod_since_analyze,u.n_dead_tup,u.last_autoanalyze,u.last_autovacuum
FROM
    pg_stat_user_tables u, pg_class c, pg_namespace n
WHERE n.oid = c.relnamespace
    AND c.relname = u.relname
    AND n.nspname = u.schemaname
    AND u.schemaname = "alvin"
    AND u.relname = "tb_test_vacuum"
-[ RECORD 1 ]-------+---------------
schemaname          | alvin
relname             | tb_test_vacuum
reltuples           | 0
n_live_tup          | 0
n_mod_since_analyze | 0
n_dead_tup          | 0
last_autoanalyze    | 
last_autovacuum     | 

所以 AUTO VACUUM 具体公式如下:

pg_stat_user_tables.n_dead_tup > autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * pg_class.reltuples

同理,AUTO ANALYZE 具体公式如下:

pg_stat_user_tables.n_mod_since_analyze > autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * pg_class.reltuples

精准触发 AUTOVACUUM

下面实测一下 autovacuum。为了测试方便,autovacuum_naptime 临时修改为 5s,这样触发了临界条件,只需要等 5s 就能看到效果,而不是等 1min。

修改参数如下:

autovacuum_naptime = 5s
autovacuum_vacuum_threshold = 100       # min number of row updates before vacuum
autovacuum_analyze_threshold = 100      # min number of row updates before analyze
autovacuum_vacuum_scale_factor = 0.2    # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.1   # fraction of table size before analyze

接下来通过一步一步测试,精准触发 autovacuum。

为了方便测试,通过如下 AUTOVACUUM 计算 SQL 计算需要删除或修改的数据行数。

alvindb=> WITH v AS (
  SELECT * FROM
    (SELECT setting AS autovacuum_vacuum_scale_factor FROM pg_settings WHERE name = "autovacuum_vacuum_scale_factor") vsf,
    (SELECT setting AS autovacuum_vacuum_threshold FROM pg_settings WHERE name = "autovacuum_vacuum_threshold") vth,
    (SELECT setting AS autovacuum_analyze_scale_factor FROM pg_settings WHERE name = "autovacuum_analyze_scale_factor") asf,
    (SELECT setting AS autovacuum_analyze_threshold FROM pg_settings WHERE name = "autovacuum_analyze_threshold") ath
),
t AS (
    SELECT
        c.reltuples,u.*
    FROM
        pg_stat_user_tables u, pg_class c, pg_namespace n
    WHERE n.oid = c.relnamespace
        AND c.relname = u.relname
        AND n.nspname = u.schemaname
        AND u.schemaname = "alvin"
        AND u.relname = "tb_test_vacuum"
)
SELECT
    schemaname,
    relname,
    autovacuum_vacuum_scale_factor,
    autovacuum_vacuum_threshold,
    autovacuum_analyze_scale_factor,
    autovacuum_analyze_threshold,
    n_live_tup,
    reltuples,
    autovacuum_analyze_trigger,
    n_mod_since_analyze,
    autovacuum_analyze_trigger - n_mod_since_analyze AS rows_to_mod_before_auto_analyze,
    last_autoanalyze,
    autovacuum_vacuum_trigger,
    n_dead_tup,
    autovacuum_vacuum_trigger - n_dead_tup AS rows_to_delete_before_auto_vacuum,
    last_autovacuum
FROM (
    SELECT
        schemaname,
        relname,
        autovacuum_vacuum_scale_factor,
        autovacuum_vacuum_threshold,
        autovacuum_analyze_scale_factor,
        autovacuum_analyze_threshold,
        floor(autovacuum_analyze_scale_factor::numeric * reltuples) + 1 + autovacuum_analyze_threshold::int AS autovacuum_analyze_trigger,
        floor(autovacuum_vacuum_scale_factor::numeric * reltuples) + 1 + autovacuum_vacuum_threshold::int AS autovacuum_vacuum_trigger,
        reltuples,
        n_live_tup,
        n_dead_tup,
        n_mod_since_analyze,
        last_autoanalyze,
        last_autovacuum
    FROM
        v,
        t) a;
-[ RECORD 1 ]---------------------+---------------
schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 0
reltuples                         | 0
autovacuum_analyze_trigger        | 101
n_mod_since_analyze               | 0
rows_to_mod_before_auto_analyze   | 101
last_autoanalyze                  | 
autovacuum_vacuum_trigger         | 101
n_dead_tup                        | 0
rows_to_delete_before_auto_vacuum | 101
last_autovacuum                   | 

根据计算公式,

pg_stat_user_tables.n_mod_since_analyze > 100 + 0.1 * 0

即当修改的行数大于 100,即为 101 时,将触发 AUTO ANALYZE。

先插入 100 行数据,

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:45:57.669183+08
(1 row)
alvindb=> INSERT INTO tb_test_vacuum(test_num) SELECT gid FROM generate_series(1,100,1) gid;
INSERT 0 100

此时,通过如下计算可以看到,再更新 1 行,将触发 AUTO ANALYZE。

schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 100
reltuples                         | 0
autovacuum_analyze_trigger        | 101
n_mod_since_analyze               | 100
rows_to_mod_before_auto_analyze   | 1
last_autoanalyze                  | 
autovacuum_vacuum_trigger         | 101
n_dead_tup                        | 0
rows_to_delete_before_auto_vacuum | 101
last_autovacuum                   | 

此时,统计信息为空:

alvindb=> SELECT * FROM pg_stats WHERE schemaname = "alvin" AND tablename = "tb_test_vacuum";
(0 rows)

现在插入最后一条数据,

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:46:31.034422+08
(1 row)
alvindb=> INSERT INTO tb_test_vacuum(test_num) SELECT gid FROM generate_series(101,101,1) gid;
INSERT 0 1

执行 AUTOVACUUM 计算 SQL, 可以看到,已触发 AUTO ANALYZE:

schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 101
reltuples                         | 101
autovacuum_analyze_trigger        | 111
n_mod_since_analyze               | 0
rows_to_mod_before_auto_analyze   | 111
last_autoanalyze                  | 2021-11-06 20:46:39.88796+08
autovacuum_vacuum_trigger         | 121
n_dead_tup                        | 0
rows_to_delete_before_auto_vacuum | 121
last_autovacuum                   | 

可以看到表 tb_test_vacuum 统计信息已更新:

alvindb=> SELECT * FROM pg_stats WHERE schemaname = "alvin" AND tablename = "tb_test_vacuum";

PostgreSQL VACUUM 之深入浅出 (二)

查看 PostgreSQL 日志,可以看到

[    2021-11-06 20:46:39.887 CST 6816 6186792f.1aa0 1 3/173948 13179359]LOG:  automatic analyze of table "alvindb.alvin.tb_test_vacuum" system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s

PostgreSQL 日志中是否记录 AUTOVACUUM 由参数 log_autovacuum_min_duration 控制,默认关闭。

#log_autovacuum_min_duration = -1	# -1 disables, 0 logs all actions and
					# their durations, > 0 logs only
					# actions running at least this number
					# of milliseconds.

可将该参数改为 0,即记录所有的 AUTOVACUUM 操作。

log_autovacuum_min_duration = 0

AUTOVACUUM 计算 SQL 的执行结果得知,再修改 111 行将触发 AUTO ANALYZE。

rows_to_mod_before_auto_analyze   | 111
rows_to_delete_before_auto_vacuum | 121

先修改 110 行,并 sleep 6s。

alvindb=> SELECT clock_timestamp();
       clock_timestamp        
------------------------------
 2021-11-06 20:47:30.75553+08
(1 row)
alvindb=> INSERT INTO tb_test_vacuum(test_num) SELECT gid FROM generate_series(102,111,1) gid;
INSERT 0 10
alvindb=> UPDATE tb_test_vacuum SET test_num = test_num WHERE test_num <= 100;
UPDATE 100
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:47:43.465651+08
(1 row)

AUTOVACUUM 计算 SQL 的执行结果得知,修改后 110 行并 sleep 6s (前面已将 autovacuum_naptime 设置成了 5s)后,AUTO ANALYZE 并未触发。

schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 111
reltuples                         | 101
autovacuum_analyze_trigger        | 111
n_mod_since_analyze               | 110
rows_to_mod_before_auto_analyze   | 1
last_autoanalyze                  | 2021-11-06 20:46:39.88796+08
autovacuum_vacuum_trigger         | 121
n_dead_tup                        | 100
rows_to_delete_before_auto_vacuum | 21
last_autovacuum                   | 

再修改 1 行预计将触发 AUTO ANALYZE。此时删除一行:

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:47:55.746411+08
(1 row)
alvindb=> DELETE FROM tb_test_vacuum WHERE test_id = 111;
DELETE 1
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:48:01.796389+08
(1 row)

AUTOVACUUM 计算 SQL 的查询结果中的 last_autoanalyze 得知,已精准触发 AUTO ANALYZE。

并且从 rows_to_delete_before_auto_vacuum 得知,预计删除 22 行后,将触发 AUTO VACUUM。

schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 110
reltuples                         | 110
autovacuum_analyze_trigger        | 112
n_mod_since_analyze               | 0
rows_to_mod_before_auto_analyze   | 112
last_autoanalyze                  | 2021-11-06 20:48:04.928899+08
autovacuum_vacuum_trigger         | 123
n_dead_tup                        | 101
rows_to_delete_before_auto_vacuum | 22
last_autovacuum                   | 

先删除 (UPDATE = DELETE + INSERT) 21 行:

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:48:32.313706+08
(1 row)

alvindb=> UPDATE tb_test_vacuum SET test_num = test_num WHERE test_num <= 21;
UPDATE 21
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:48:38.454997+08
(1 row)

AUTOVACUUM 计算 SQL 的查询结果中的 last_autovacuum 得知,还未触发 AUTO VACUUM。

并且从 rows_to_delete_before_auto_vacuum 得知,预计删除 1 行后,将触发 AUTO VACUUM。

schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 110
reltuples                         | 110
autovacuum_analyze_trigger        | 112
n_mod_since_analyze               | 21
rows_to_mod_before_auto_analyze   | 91
last_autoanalyze                  | 2021-11-06 20:48:04.928899+08
autovacuum_vacuum_trigger         | 123
n_dead_tup                        | 122
rows_to_delete_before_auto_vacuum | 1
last_autovacuum                   | 

此时删除一行

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:48:39.174009+08
(1 row)

alvindb=> DELETE FROM tb_test_vacuum WHERE test_id = 110;
DELETE 1
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 20:48:45.213537+08
(1 row)

AUTOVACUUM 计算 SQL 的查询结果中的 last_autovacuum 得知,已精准触发 AUTO VACUUM!

schemaname                        | alvin
relname                           | tb_test_vacuum
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 100
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 100
n_live_tup                        | 109
reltuples                         | 109
autovacuum_analyze_trigger        | 111
n_mod_since_analyze               | 22
rows_to_mod_before_auto_analyze   | 89
last_autoanalyze                  | 2021-11-06 20:48:04.928899+08
autovacuum_vacuum_trigger         | 122
n_dead_tup                        | 0
rows_to_delete_before_auto_vacuum | 122
last_autovacuum                   | 2021-11-06 20:48:49.914345+08

查看 PostgreSQL 日志,可以看到

[    2021-11-06 20:48:49.914 CST 7207 618679b1.1c27 1 3/174162 0]LOG:  automatic vacuum of table "alvindb.alvin.tb_test_vacuum": index scans: 1
	pages: 0 removed, 1 remain, 0 skipped due to pins, 0 skipped frozen
	tuples: 123 removed, 109 remain, 0 are dead but not yet removable, oldest xmin: 13179371
	buffer usage: 59 hits, 4 misses, 4 dirtied
	avg read rate: 121.832 MB/s, avg write rate: 121.832 MB/s
	system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
	buffer usage: 59 hits, 4 misses, 4 dirtied
	avg read rate: 121.832 MB/s, avg write rate: 121.832 MB/s
	system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s

那么问题来了,autovacuum_vacuum_scale_factor 为 0.2 对于所有的表都合适吗?1 亿数据量的表有 2000 万 dead tuples 以上才会触发 AUTO VACUUM,这意味着表越大越不容易触发 AUTO VACUUM。怎么可以解决这个问题呢?

精准触发表级 AUTOVACUUM

可以根据需要,在表上设置合理的 autovacuum_vacuum_scale_factor。对于大表,可以设置小点的 autovacuum_vacuum_scale_factor,如 0.1。

下面带你一步一步设置并精确触发表级的 AUTO ANALYZE 和 AUTO VACUUM。

这次将采用大一点的数据量进行测试。考虑到手动创建表,插入数据等比较麻烦,接下来测试利用 PostgreSQL 自带的工具 pgbench。

使用 pgbench 创建 10 万行数据的测试表:

$ pgbench -i alvindb
dropping old tables...
creating tables...
generating data...
100000 of 100000 tuples (100%) done (elapsed 0.38 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done.

修改表级参数:

alvindb=> ALTER TABLE pgbench_accounts SET (autovacuum_vacuum_scale_factor = 0.1, autovacuum_vacuum_threshold = 2000);
ALTER TABLE
alvindb=> ALTER TABLE pgbench_accounts SET (autovacuum_analyze_scale_factor = 0.05, autovacuum_analyze_threshold = 2000);
ALTER TABLE

按照之前 AUTOVACUUM 计算 SQL ,可知要修改 11001 行才会触发 AUTO ANALYZE, 要有约 21001 个 dead tuples 才会触发 AUTO VACUUM。

schemaname                        | public
relname                           | pgbench_accounts
autovacuum_vacuum_scale_factor    | 0.2
autovacuum_vacuum_threshold       | 1000
autovacuum_analyze_scale_factor   | 0.1
autovacuum_analyze_threshold      | 1000
n_live_tup                        | 100000
reltuples                         | 100000
autovacuum_analyze_trigger        | 11001
n_mod_since_analyze               | 0
rows_to_mod_before_auto_analyze   | 11001
last_autoanalyze                  | 
autovacuum_vacuum_trigger         | 21001
n_dead_tup                        | 0
rows_to_delete_before_auto_vacuum | 21001
last_autovacuum                   | 

现在设置了表级的参数以后,从如下 表级 AUTOVACUUM 计算 SQL ,可知修改 7001 行就可以触发 AUTO ANALYZE, 有约 12001 个 dead tuples 就可以触发 AUTO VACUUM。更重要的是,表级的 AUTOVACUUM 参数不会对其他表产生影响,只对已设置的表有效,也可以对不同大小的表设置不同的参数,还可以随时调整!

表级 AUTOVACUUM 计算 SQL

alvindb=> WITH v AS (
SELECT (SELECT split_part(x, "=", 2) FROM unnest(c.reloptions) q (x) WHERE x ~ "^autovacuum_vacuum_scale_factor=" ) as autovacuum_vacuum_
scale_factor,
    (SELECT split_part(x, "=", 2) FROM unnest(c.reloptions) q (x) WHERE x ~ "^autovacuum_vacuum_threshold=" ) as autovacuum_vacuum_thresh
old,
    (SELECT split_part(x, "=", 2) FROM unnest(c.reloptions) q (x) WHERE x ~ "^autovacuum_analyze_scale_factor=" ) as autovacuum_analyze_s
cale_factor,
    (SELECT split_part(x, "=", 2) FROM unnest(c.reloptions) q (x) WHERE x ~ "^autovacuum_analyze_threshold=" ) as autovacuum_analyze_thre
shold
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname IN ("public")
AND c.relname = "pgbench_accounts"
),
t AS (
    SELECT
        c.reltuples,u.*
    FROM
        pg_stat_user_tables u, pg_class c, pg_namespace n
    WHERE n.oid = c.relnamespace
        AND c.relname = u.relname
        AND n.nspname = u.schemaname
        AND u.schemaname = "public"
        AND u.relname = "pgbench_accounts"
)
SELECT
    schemaname,
    relname,
    autovacuum_vacuum_scale_factor,
    autovacuum_vacuum_threshold,
    autovacuum_analyze_scale_factor,
    autovacuum_analyze_threshold,
    n_live_tup,
    reltuples,
    autovacuum_analyze_trigger,
    n_mod_since_analyze,
    autovacuum_analyze_trigger - n_mod_since_analyze AS rows_to_mod_before_analyze,
    last_autoanalyze,
    autovacuum_vacuum_trigger,
    n_dead_tup,
    autovacuum_vacuum_trigger - n_dead_tup AS rows_to_delete_before_vacuum,
    last_autovacuum
FROM (
    SELECT
        schemaname,
        relname,
        autovacuum_vacuum_scale_factor,
        autovacuum_vacuum_threshold,
        autovacuum_analyze_scale_factor,
        autovacuum_analyze_threshold,
        floor(autovacuum_analyze_scale_factor::numeric * reltuples) + 1 + autovacuum_analyze_threshold::int AS autovacuum_analyze_trigger,
        floor(autovacuum_vacuum_scale_factor::numeric * reltuples) + 1 + autovacuum_vacuum_threshold::int AS autovacuum_vacuum_trigger,
        reltuples,
        n_live_tup,
        n_dead_tup,
        n_mod_since_analyze,
        last_autoanalyze,
        last_autovacuum
    FROM
        v,
        t) a;
-[ RECORD 1 ]-------------------+-----------------
schemaname                      | public
relname                         | pgbench_accounts
autovacuum_vacuum_scale_factor  | 0.1
autovacuum_vacuum_threshold     | 2000
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold    | 2000
n_live_tup                      | 100000
reltuples                       | 100000
autovacuum_analyze_trigger      | 7001
n_mod_since_analyze             | 0
rows_to_mod_before_analyze      | 7001
last_autoanalyze                | 
autovacuum_vacuum_trigger       | 12001
n_dead_tup                      | 0
rows_to_delete_before_vacuum    | 12001
last_autovacuum                 | 

现在已预测到要修改的行数,接下来一步一步来触发一下表级的 AUTO ANALYZE 和 AUTO VACUUM。

先删除 7000 行数据:

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:03.252622+08
(1 row)
alvindb=> DELETE FROM pgbench_accounts WHERE aid<=7000;
DELETE 7000
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:09.363536+08
(1 row)

根据表级 AUTOVACUUM 计算 SQL 执行结果的 rows_to_mod_before_analyze 得知,再修改 1 行将触发 AUTO ANALYZE:

schemaname                      | public
relname                         | pgbench_accounts
autovacuum_vacuum_scale_factor  | 0.1
autovacuum_vacuum_threshold     | 2000
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold    | 2000
n_live_tup                      | 93000
reltuples                       | 100000
autovacuum_analyze_trigger      | 7001
n_mod_since_analyze             | 7000
rows_to_mod_before_analyze      | 1
last_autoanalyze                | 
autovacuum_vacuum_trigger       | 12001
n_dead_tup                      | 7000
rows_to_delete_before_vacuum    | 5001
last_autovacuum                 | 

再修改 1 行:

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:30.649717+08
(1 row)
alvindb=> UPDATE pgbench_accounts SET bid = bid WHERE aid=7001;
UPDATE 1
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:36.705928+08
(1 row)

根据表级 AUTOVACUUM 计算 SQL 执行结果的 last_autoanalyze 得知,已精准触发 AUTO ANALYZE!

schemaname                      | public
relname                         | pgbench_accounts
autovacuum_vacuum_scale_factor  | 0.1
autovacuum_vacuum_threshold     | 2000
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold    | 2000
n_live_tup                      | 93000
reltuples                       | 93000
autovacuum_analyze_trigger      | 6651
n_mod_since_analyze             | 0
rows_to_mod_before_analyze      | 6651
last_autoanalyze                | 2021-11-06 23:33:40.87317+08
autovacuum_vacuum_trigger       | 11301
n_dead_tup                      | 7001
rows_to_delete_before_vacuum    | 4300
last_autovacuum                 | 

从 PostgreSQL 日志中也可以看到 AUTO ANALYZE 被触发了:

[    2021-11-06 23:33:40.873 CST 32646 6186a054.7f86 1 6/1393 13179750]LOG:  automatic analyze of table "alvindb.public.pgbench_accounts" syst
em usage: CPU: user: 0.04 s, system: 0.03 s, elapsed: 0.11 s

并且,根据 rows_to_delete_before_vacuum 得知,再删除 4300 行就可以触发 AUTO VACUUM。

接下来先删除 4299 行,以测试临界值:

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:43.867176+08
(1 row)
alvindb=> UPDATE pgbench_accounts SET bid = bid WHERE aid>=95702;
UPDATE 4299
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:50.016447+08
(1 row)

autovacuum_naptime 为 5s,此时并未触发 AUTO VACUUM。

schemaname                      | public
relname                         | pgbench_accounts
autovacuum_vacuum_scale_factor  | 0.1
autovacuum_vacuum_threshold     | 2000
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold    | 2000
n_live_tup                      | 93000
reltuples                       | 93000
autovacuum_analyze_trigger      | 6651
n_mod_since_analyze             | 4299
rows_to_mod_before_analyze      | 2352
last_autoanalyze                | 2021-11-06 23:33:40.87317+08
autovacuum_vacuum_trigger       | 11301
n_dead_tup                      | 11300
rows_to_delete_before_vacuum    | 1
last_autovacuum                 | 

再删除 (UPDATE = DELETE + INSERT) 1 行 :

alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:53.326483+08
(1 row)
alvindb=> UPDATE pgbench_accounts SET bid = bid WHERE aid=7002;
UPDATE 1
alvindb=> SELECT pg_sleep(6);
 pg_sleep 
----------
 
(1 row)
alvindb=> SELECT clock_timestamp();
        clock_timestamp        
-------------------------------
 2021-11-06 23:33:59.439375+08
(1 row)

从如下结果中的 last_autovacuum 得知,此时已精确触发 AUTO VACUUM!

schemaname                      | public
relname                         | pgbench_accounts
autovacuum_vacuum_scale_factor  | 0.1
autovacuum_vacuum_threshold     | 2000
autovacuum_analyze_scale_factor | 0.05
autovacuum_analyze_threshold    | 2000
n_live_tup                      | 93000
reltuples                       | 93000
autovacuum_analyze_trigger      | 6651
n_mod_since_analyze             | 4300
rows_to_mod_before_analyze      | 2351
last_autoanalyze                | 2021-11-06 23:33:40.87317+08
autovacuum_vacuum_trigger       | 11301
n_dead_tup                      | 0
rows_to_delete_before_vacuum    | 11301
last_autovacuum                 | 2021-11-06 23:34:00.956936+08

从 PostgreSQL 日志中也可以看到 AUTO VACUUM 被触发了:

[    2021-11-06 23:34:00.956 CST 32710 6186a068.7fc6 1 6/1455 0]LOG:  automatic vacuum of table "alvindb.public.pgbench_accounts": index scans
: 1
        pages: 0 removed, 421 remain, 0 skipped due to pins, 0 skipped frozen
        tuples: 2 removed, 93000 remain, 0 are dead but not yet removable, oldest xmin: 13179755
        buffer usage: 967 hits, 60 misses, 7 dirtied
        avg read rate: 10.067 MB/s, avg write rate: 1.174 MB/s
        system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.18 s

公众号

关注 DBA Daily 公众号,第一时间收到文章的更新。
通过一线 DBA 的日常工作,学习实用数据库技术干货!
PostgreSQL VACUUM 之深入浅出 (二)

公众号优质文章推荐

PostgreSQL VACUUM 之深入浅出

华山论剑之 PostgreSQL sequence

[PG Upgrade Series] Extract Epoch Trap

[PG Upgrade Series] Toast Dump Error

GitLab supports only PostgreSQL now

MySQL or PostgreSQL?

PostgreSQL hstore Insight

ReIndex 失败原因调查

PG 数据导入 Hive 乱码问题调查

PostGIS 扩展创建失败原因调查

原文地址:https://www.cnblogs.com/dbadaily/archive/2022/02/25/vacuum2.html

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。
转载请注明出处: https://daima100.com/5484.html

(0)
上一篇 2023-05-07
下一篇 2023-05-07

相关推荐

  • clickhouse可视化_clickhouse hdfs

    clickhouse可视化_clickhouse hdfs行业动态 ClickHouse Cloud正式GA,同时,B轮融资得到进一步增加 这是由ClickHouse官方推出云服务,启用新域名:clickhouse.cloud ,类似于MongoDB的Atl

    2023-06-19
    178
  • Python数值处理:快速高效地进行数字计算

    Python数值处理:快速高效地进行数字计算Python作为一种高级编程语言,在数据科学领域表现出了强大的计算能力。Python作为一种动态的解释型语言,有很多便于快速开发的库可供使用,例如NumPy、SciPy和Pandas等。这些库用Python编写,主要用于进行快速的矢量化计算,以及对大规模数据进行高效处理。

    2024-04-04
    77
  • Hbase架构剖析「建议收藏」

    Hbase架构剖析「建议收藏」HBase隶属于hadoop生态系统,它参考了谷歌的BigTable建模,实现的编程语言为 Java, 建立在hdfs之上,提供高可靠性、高性能、列存储、可伸缩、实时读写的数据库系统。它仅能通过主键(

    2022-12-28
    152
  • mysql零基础入门教程完整_MySQL入门

    mysql零基础入门教程完整_MySQL入门最开始入门学习MySQL的时候,连数据库是什么都不知道,后来在网上各种搜MySQL数据库的学习教程,才开始慢慢了解它。 MySQL 是最流行的关系型数据库管理系统,在 WEB 应用方面 MySQL 是

    2023-04-21
    143
  • 用Python绘制散点图

    用Python绘制散点图散点图是一种常见的数据可视化方式,在数据分析、机器学习等领域都有着广泛的应用。Python作为一种强大的编程语言,本身就集成了各种绘图库,可以轻松地绘制各种类型的图表,包括散点图。

    2024-07-07
    43
  • mysql安装包安装教程_安装MySQL

    mysql安装包安装教程_安装MySQLLinux使用MySQL Yum存储库上安装MySQL 5.7,适用于Oracle Linux,Red Hat Enterprise Linux和CentOS系统。 1、添加MySQL Yum存储库

    2022-12-26
    146
  • LeetCode Python练习题

    LeetCode Python练习题LeetCode是一个程序员学习、提高算法能力的网站。网站包含了各种难度和类型的编程题目,涉及各种数据结构和算法,并提供在线代码编辑器供用户提交代码测试。通过掌握LeetCode上的编程题目,能够帮助程序员提高算法思维和编程能力。

    2024-07-12
    43
  • 麒麟985处理器怎么样?[通俗易懂]

    麒麟985处理器怎么样?[通俗易懂]  最近上网时,发现很多朋友对于荣耀30系列首发的麒麟985处理器非常感兴趣,作为一个从事手机行业很多年的人士,今天就在这里和大家聊一聊,荣耀30系列首发的麒麟985处理器究竟怎么样?   首先在5…

    2023-02-26
    141

发表回复

您的电子邮箱地址不会被公开。 必填项已用*标注