大家好,我是考100分的小小码 ,祝大家学习进步,加薪顺利呀。今天说一说递归查询两种写法与性能差异「建议收藏」,希望您对编程的造诣更进一步.
对于递归查询,KINGBASE用户可以选择使用connect by ,或者使用 with recursive 。下面,我们以例子来看下二者的差别。
一、构造数据
create table test_recursive(id integer,pid integer,name varchar,description text); insert into test_recursive(id,name,description) select generate_series(1,100000),"a"||generate_series(1,100000),repeat("desc",500); update test_recursive set pid=1 where id between 2 and 10; update test_recursive set pid=mod(id,9)+2 where id between 11 and 100; update test_recursive set pid=mod(id,90)+11 where id between 101 and 1000; update test_recursive set pid=mod(id,900)+101 where id between 1001 and 10000; update test_recursive set pid=mod(id,9000)+1001 where id between 10001 and 100000; create table test_recursive_random(id integer,pid integer,name varchar,description text); insert into test_recursive_random select * from test_recursive order by random; create index ind_test_recursive_random_id on test_recursive_random(id); create index ind_test_recursive_random_pid on test_recursive_random(pid); vacuum full test_recursive_random; analyze test_recursive_random; create index ind_test_recursive_id on test_recursive(id); create index ind_test_recursive_pid on test_recursive(pid); vacuum full test_recursive; analyze test_recursive;
本例子构造了5层的数据,有排序与非排序两种数据。
二、使用Connect By
connect by的查询性能:用时 746ms
test=# explain analyze select id,pid,name from test_recursive start with id=1 connect by prior id = pid ; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Recursive Union (cost=0.29..422.37 rows=101 width=14) (actual time=0.038..728.281 rows=100000 loops=1) -> Index Scan using ind_test_recursive_id on test_recursive (cost=0.29..8.31 rows=1 width=14) (actual time=0.015..0.017 rows=1 loops=1) Index Cond: (id = 1) -> Nested Loop (cost=0.42..41.30 rows=10 width=14) (actual time=0.002..0.003 rows=1 loops=100000) -> WorkTable Scan on "connect" (cost=0.00..0.02 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=100000) -> Index Scan using ind_test_recursive_pid on test_recursive (cost=0.42..41.18 rows=10 width=14) (actual time=0.002..0.002 rows=1 loops=100000) Index Cond: (pid = (PRIOR test_recursive.id)) Planning Time: 0.185 ms Execution Time: 746.102 ms (9 rows)
三、KINGBASE With Recursive 查询
1、排序数据:用时302ms
explain analyze with recursive tmp1 as ( select id,pid,name from test_recursive where id=1 union all select a.id,a.pid,a.name from test_recursive a inner join tmp1 b on a.pid=b.id ) select * from tmp1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on tmp1 (cost=4013.94..4033.96 rows=1001 width=40) (actual time=0.020..297.856 rows=100000 loops=1) CTE tmp1 -> Recursive Union (cost=0.29..4013.94 rows=1001 width=14) (actual time=0.018..257.298 rows=100000 loops=1) -> Index Scan using ind_test_recursive_id on test_recursive (cost=0.29..8.31 rows=1 width=14) (actual time=0.016..0.018 rows=1 loops=1) Index Cond: (id = 1) -> Nested Loop (cost=0.42..398.56 rows=100 width=14) (actual time=20.529..38.777 rows=16666 loops=6) -> WorkTable Scan on tmp1 b (cost=0.00..0.20 rows=10 width=4) (actual time=0.003..2.150 rows=16667 loops=6) -> Index Scan using ind_test_recursive_pid on test_recursive a (cost=0.42..39.74 rows=10 width=14) (actual time=0.001..0.002 rows=1 loops=100000) Index Cond: (pid = b.id) Planning Time: 0.207 ms Execution Time: 302.244 ms (11 rows)
2、非排序数据:440ms
test=# explain analyze with recursive tmp1 as ( test(# select id,pid,name from test_recursive_random where id=1 test(# union all test(# select a.id,a.pid,a.name from test_recursive_random a inner join tmp1 b on a.pid=b.id ) test-# select * from tmp1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on tmp1 (cost=4206.87..4226.89 rows=1001 width=40) (actual time=0.020..434.721 rows=100000 loops=1) CTE tmp1 -> Recursive Union (cost=0.29..4206.87 rows=1001 width=14) (actual time=0.018..397.456 rows=100000 loops=1) -> Index Scan using ind_test_recursive_random_id on test_recursive_random (cost=0.29..8.31 rows=1 width=14) (actual time=0.017..0.018 rows=1 loops=1) Index Cond: (id = 1) -> Nested Loop (cost=4.50..417.85 rows=100 width=14) (actual time=33.080..62.311 rows=16666 loops=6) -> WorkTable Scan on tmp1 b (cost=0.00..0.20 rows=10 width=4) (actual time=0.007..2.412 rows=16667 loops=6) -> Bitmap Heap Scan on test_recursive_random a (cost=4.50..41.67 rows=10 width=14) (actual time=0.002..0.003 rows=1 loops=100000) Recheck Cond: (pid = b.id) Heap Blocks: exact=99557 -> Bitmap Index Scan on ind_test_recursive_random_pid (cost=0.00..4.49 rows=10 width=0) (actual time=0.001..0.001 rows=1 loops=100000) Index Cond: (pid = b.id) Planning Time: 0.304 ms Execution Time: 439.563 ms (14 rows)
3、使用hash join:260ms
test=# set enable_nestloop=off; SET test=# explain analyze with recursive tmp1 as ( test(# select id,pid,name from test_recursive where id=1 test(# union all test(# select a.id,a.pid,a.name from test_recursive a inner join tmp1 b on a.pid=b.id ) test-# select * from tmp1; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on tmp1 (cost=24101.58..24121.60 rows=1001 width=40) (actual time=0.018..255.766 rows=100000 loops=1) CTE tmp1 -> Recursive Union (cost=0.29..24101.58 rows=1001 width=14) (actual time=0.016..218.427 rows=100000 loops=1) -> Index Scan using ind_test_recursive_id on test_recursive (cost=0.29..8.31 rows=1 width=14) (actual time=0.015..0.017 rows=1 loops=1) Index Cond: (id = 1) -> Hash Join (cost=0.33..2407.32 rows=100 width=14) (actual time=13.828..32.571 rows=16666 loops=6) Hash Cond: (a.pid = b.id) -> Seq Scan on test_recursive a (cost=0.00..2031.00 rows=100000 width=14) (actual time=0.005..8.240 rows=100000 loops=6) -> Hash (cost=0.20..0.20 rows=10 width=4) (actual time=5.114..5.114 rows=16667 loops=6) Buckets: 131072 (originally 1024) Batches: 2 (originally 1) Memory Usage: 3073kB -> WorkTable Scan on tmp1 b (cost=0.00..0.20 rows=10 width=4) (actual time=0.004..2.068 rows=16667 loops=6) Planning Time: 0.196 ms Execution Time: 260.360 ms (13 rows)
四、执行计划差异分析
- Connect by 查询执行逻辑:查询是通过 pid = prior id ,也就是将前条记录的 id 作为值,传给 pid 进行索引扫描。逻辑上可以看做是逐个分支查询,上个分支查询结束,再进行下个分支扫描。loop = 100000,就是表示针对每条记录,都要访问一次索引。
- With recursive 查询逻辑:是按层次查询,上层结果都返回后,再执行下层查询。每层可以根据所有ctid进行排序,也就是 Bitmap Index Scan,将所有ctid都返回,排序,再访问表,效率提高。另外,由于是每层数据返回后,再去关联查找下层数据,可以使用hash join,提升访问效率。 rows=16666 loop = 6,表示需要访问6个批次,每次平均 16666 条记录。
五、Oracle Connect By 查询性能
以下是同样数据量的情况下,Oracle connect by 查询的性能:
SQL> select id,pid,name from test_recursive start with id=1 connect by prior id = pid ; 100000 rows selected. Elapsed: 00:00:00.98 Execution Plan ---------------------------------------------------------- Plan hash value: 2099392185 ---------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 12 | 384 | 18 (12)| 00:00:01 | |* 1 | CONNECT BY WITH FILTERING | | | | | | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED | TEST_RECURSIVE | 1 | 32 | 2 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IND_TEST_RECURSIVE_ID | 1 | | 1 (0)| 00:00:01 | | 4 | NESTED LOOPS | | 11 | 495 | 14 (0)| 00:00:01 | | 5 | CONNECT BY PUMP | | | | | | | 6 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST_RECURSIVE | 11 | 352 | 12 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | IND_TEST_RECURSIVE_PID | 11 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("PID"=PRIOR "ID") 3 - access("ID"=1) 7 - access("connect$_by$_pump$_002"."prior id "="PID") Note ----- - dynamic statistics used: dynamic sampling (level=2) - this is an adaptive plan Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 101983 consistent gets 0 physical reads 0 redo size 2337649 bytes sent via SQL*Net to client 73769 bytes received via SQL*Net from client 6668 SQL*Net roundtrips to/from client 8 sorts (memory) 0 sorts (disk) 100000 rows processed
版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。
转载请注明出处: https://daima100.com/6081.html