Parallel Execution – 2

作者: 2548d1d6a965 | 来源:发表于2015-11-22 09:55 被阅读38次

Parallel Execution – 2
Parallel Execution
Parallel Execution – 1
Parallel Execution – 4
Parallel Execution – 5
Parallel Execution – 3
Flink总结-Parallel配置
appium server1.7 testNG 并行运行and
聊聊flink的Parallel Execution
parallel并行处理

Since I’m going to write a couple of articles dissecting parallel execution plans, I thought I’d put up a reference post describing the set of tables I used to generate the plan from the previous post, and the query (with serial execution plan) that produced it. The setup is a simple star schema arrangement – which I’ve generated by created by creating three identical tables and then doing a Cartesian join across the three of them.

create table t1
as
select
    rownum      id,
    to_char(rownum) small_vc,
    rpad('x',100)   padding
from
    all_objects
where
    rownum <= 70
;
 
alter table t1
    add constraint t1_pk primary key(id)
;
 
create table t2
as
select
    rownum      id,
    to_char(rownum) small_vc,
    rpad('x',100)   padding
from
    all_objects
where
    rownum <= 70
;
 
alter table t2
    add constraint t2_pk primary key(id)
;
 
create table t3
as
select
    rownum      id,
    to_char(rownum) small_vc,
    rpad('x',100)   padding
from
    all_objects
where
    rownum <= 70
;
 
alter table t3
    add constraint t3_pk primary key(id)
;
 
create table t4
nologging
as
select
    t1.id           id1,
    t2.id           id2,
    t3.id           id3,
    rpad(rownum,10)     small_vc,
    rpad('x',100)       padding
from
    t1, t2, t3
;
 
begin
    dbms_stats.gather_table_stats(
        user,
        't1',
        method_opt => 'for all columns size 1'
    );
 
    dbms_stats.gather_table_stats(
        user,
        't2',
        method_opt => 'for all columns size 1'
    );
 
    dbms_stats.gather_table_stats(
        user,
        't3',
        method_opt => 'for all columns size 1'
    );
 
    dbms_stats.gather_table_stats(
        user,
        't4',
        method_opt => 'for all columns size 1'
    );
end;
/

Since there are 70 rows in each of the “dimension” tables, there are 343,000 rows in the “fact” table.
Here’s the query, with serial execution plan – you may find that you have to apply a few hints to reproduce this plan, so I’ve included the minimum necessary set.

select
    /*+
        gather_plan_statistics
        leading(t4 t1 t2 t3)
        full(t4)
        use_hash(t1) full(t1) swap_join_inputs(t1)
        use_hash(t2) full(t2) swap_join_inputs(t2)
        use_hash(t3) full(t3) swap_join_inputs(t3)
    */
    count(t1.small_vc),
    count(t2.small_vc),
    count(t3.small_vc),
    count(t4.small_vc)
from
    t4,
    t1,
    t2,
    t3
where
    t1.id = t4.id1
and t2.id = t4.id2
and t3.id = t4.id3
and t1.small_vc in (1,2,3)
and t2.small_vc in (1,2,3)
and t3.small_vc in (1,2,3)
;
 
----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |      1 |00:00:00.19 |   10731 |  10729 |       |       |          |
|   1 |  SORT AGGREGATE       |      |      1 |      1 |      1 |00:00:00.19 |   10731 |  10729 |       |       |          |
|*  2 |   HASH JOIN           |      |      1 |     26 |     27 |00:00:00.01 |   10731 |  10729 |  2061K|  2061K|  832K (0)|
|*  3 |    TABLE ACCESS FULL  | T3   |      1 |      3 |      3 |00:00:00.01 |       3 |      3 |       |       |          |
|*  4 |    HASH JOIN          |      |      1 |    612 |    630 |00:00:00.01 |   10728 |  10726 |  2061K|  2061K|  959K (0)|
|*  5 |     TABLE ACCESS FULL | T2   |      1 |      3 |      3 |00:00:00.01 |       3 |      3 |       |       |          |
|*  6 |     HASH JOIN         |      |      1 |  14491 |  14700 |00:00:00.03 |   10725 |  10723 |  2061K|  2061K|  980K (0)|
|*  7 |      TABLE ACCESS FULL| T1   |      1 |      3 |      3 |00:00:00.01 |       3 |      3 |       |       |          |
|   8 |      TABLE ACCESS FULL| T4   |      1 |    343K|    343K|00:00:00.32 |   10722 |  10720 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T3"."ID"="T4"."ID3")
   3 - filter((TO_NUMBER("T3"."SMALL_VC")=1 OR TO_NUMBER("T3"."SMALL_VC")=2 OR TO_NUMBER("T3"."SMALL_VC")=3))
   4 - access("T2"."ID"="T4"."ID2")
   5 - filter((TO_NUMBER("T2"."SMALL_VC")=1 OR TO_NUMBER("T2"."SMALL_VC")=2 OR TO_NUMBER("T2"."SMALL_VC")=3))
   6 - access("T1"."ID"="T4"."ID1")
   7 - filter((TO_NUMBER("T1"."SMALL_VC")=1 OR TO_NUMBER("T1"."SMALL_VC")=2 OR TO_NUMBER("T1"."SMALL_VC")=3))

As you can see I’ve pulled this plan from memory after enabling rowsource execution statistics – and the optimizer’s estimates are very good. (Given the way I generated the data and the predicates involved, each hash join retains 3/70ths of the incoming intermediate row set.)

The shape of the plan tells me that Oracle hashed the required rows from t3 into memory, then did the same with t2 and t1 before scanning t4 and probing t1, t2, and t3 in that order to see if a row should be allowed to survive and be counted.

There are various ways I could add weight to this claim, of course – for example flushing the buffer cache and enabling extended trace so that I could see the order of physical disk access; and then enabled event 10104 so that I could see the hash join traces and the order in which they took place. I hope you’ll take my word for it, though, that this is what happens in this serial path.

As I’ve pointed out a few times in the past, when you read the plan carefully you realise that it is following the specified join order (t4, t1, t2, t3) – even though to the casual glance the plan might suggest a join order of (t3, t2, t1, t4). This is an example of “what happens first” and “the join order” being very different from each other.

网友评论

本文标题：Parallel Execution – 2

本文链接：https://www.haomeiwen.com/subject/kkfrhttx.html

延伸阅读

深度阅读

您也可以注册成为美文阅读网的作者，发表您的原创作品、分享您的心情！

Parallel Execution – 2

相关文章