king

[Oracle/SQL] Finding the highest-paid employee in each department of an employee table: which SQL is best?

king | Batch Processing | 2023-02-27

Before getting to the main topic, here is my database environment:

#  Item              Version
1  Operating system  Win10
2  Database          Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
3  Hardware          T440p

Now for the main topic.

Suppose there is an employee table emp, defined as follows:

CREATE TABLE emp
(
    id NUMBER NOT NULL PRIMARY KEY,
    name NVARCHAR2(60) NOT NULL,
    salary NUMBER(6,0) NOT NULL,
    deptid NUMBER(2,0) NOT NULL
);

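The following SQL can be used to populate it:

-- Generate 10,000 rows with random names, salaries (0..50000) and department ids (0..10).
insert into emp
 select rownum,dbms_random.string('*',dbms_random.value(6,20)),dbms_random.value(0,50000),dbms_random.value(0,10) from dual
 connect by level<=10000
 order by dbms_random.random;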
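The 11g execution plans later in this post all carry the note "dynamic sampling used for this statement (level=2)", which usually means the table had no optimizer statistics when the plans were produced. A minimal sketch, assuming you want the optimizer to cost the queries from gathered statistics rather than from a sample, is to collect them once after loading the data. This step was not part of the original test, so the cost figures you get afterwards will likely differ from the ones shown here.

```sql
-- Optional step, not in the original test: gather optimizer statistics
-- for EMP in the current schema, including its primary-key index.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'EMP',
    cascade => TRUE
  );
END;
/
```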

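The following SQL returns the highest salary in each department, for use in the analysis below (the figures are from my machine; because the data is random, yours will almost certainly differ):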

SQL> select max(salary),deptid from emp
  2  group by deptid
  3  order by deptid;

MAX(SALARY)     DEPTID
----------- ----------
      49944          0
      49991          1
      49988          2
      49993          3
      49927          4
      49988          5
      49924          6
      49923          7
      49848          8
      49934          9
      49894         10

11 rows selected.

Elapsed: 00:00:00.01

All three of the following SQL statements return the highest-paid employee in each department:

1.
select a.id,a.name,a.salary,a.deptid from emp a
where salary=(select max(salary) from emp b where a.deptid=b.deptid)
order by a.id

2.
select e1.id,e1.name,e1.salary,e1.deptid from emp e1,(select max(salary) max_sal,deptid from emp
                group by deptid
                ) e2
where e1.deptid=e2.deptid and e1.salary=e2.max_sal
order by e1.id

3.
select id,name,salary,deptid from (select e.*,max(salary) over (partition by deptid) max_sal from emp e)
where salary=max_sal
order by id
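One way to confirm that the statements really are interchangeable on your own data is to compare their result sets with MINUS. The sketch below checks query 1 against query 3 in one direction only. It is an illustration added here, not part of the original test, and it assumes the emp table defined above.

```sql
-- Rows returned by query 1 that query 3 does not return.
-- "no rows selected" means query 3 covers everything query 1 finds.
select a.id,a.name,a.salary,a.deptid from emp a
where a.salary=(select max(b.salary) from emp b where a.deptid=b.deptid)
minus
select id,name,salary,deptid from (select e.*,max(salary) over (partition by deptid) max_sal from emp e)
where salary=max_sal;
```

Swapping the two halves checks the other direction, and the same pattern works for query 2.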

 

I compared the three statements by elapsed time (taken from the second run of each) and by execution-plan cost:

#  SQL (numbered as above)                      Time elapsed  Cost
1  Correlated subquery on max(salary)           00:00:00.03     41
2  Join to an inline view grouped by deptid     00:00:07.92    641
3  Analytic max(salary) over (partition by)     00:00:00.01    471

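By elapsed time, query 3 wins, query 1 follows closely, and query 2 is far behind. By cost, query 1 wins, while queries 3 and 2 come out roughly an order of magnitude more expensive. Subjectively, queries 1 and 3 are the fastest and I cannot tell them apart, while query 2 has a noticeable pause.

My conclusion: elapsed time and subjective impression corroborate each other, so they are credible. The execution plan's verdict on query 3, however, clearly contradicts what actually happens, so I have to set it aside.

This example shows that the cost in an execution plan cannot be used on its own to decide which SQL is better. Even an order-of-magnitude difference should not be accepted at face value; it has to be confirmed by elapsed time and by how the query behaves when actually run. Elapsed time, by contrast, is highly reliable, and in my experience it can be trusted on its own.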
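The elapsed time printed by SQL*Plus also includes fetching and displaying the rows, and the cost in a plan is only the optimizer's estimate. A hedged way to see estimates and reality side by side is to have Oracle record actual row counts and timings for each plan step, using the gather_plan_statistics hint together with DBMS_XPLAN.DISPLAY_CURSOR. The sketch below does this for query 3. It was not part of the original measurements, and it assumes your account is allowed to read the V$ views that DISPLAY_CURSOR relies on.

```sql
-- SQL*Plus: with serveroutput on, the "last statement" seen by
-- DISPLAY_CURSOR may be an internal call instead of the query.
set serveroutput off

select /*+ gather_plan_statistics */ id,name,salary,deptid
from (select e.*,max(salary) over (partition by deptid) max_sal from emp e)
where salary=max_sal
order by id;

-- Plan of the last statement run in this session, showing estimated
-- rows (E-Rows) next to actual rows and time (A-Rows, A-Time).
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Large gaps between E-Rows and A-Rows on a step are a hint that the cost for that plan rests on a poor estimate.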

Appendix: elapsed-time comparison:

SQL> select a.id,a.name,a.salary,a.deptid from emp a
  2  where salary=(select max(salary) from emp b where a.deptid=b.deptid)
  3  order by a.id;

        ID NAME
        SALARY     DEPTID
---------- ------------------------------------------------------------------------------------------------------------------------ ---------- ----------
      1073 UGJURPQV
         49993          3
      1356 UPHXQELWTDBLFYRBSHSF
         49991          1
      2946 SGSJBCABNNQXGORWPO
         49924          6
      3111 PQMATSYLQNZR
         49848          8
      3516 CBXGAVDIHITQ
         49944          0
      6218 LPZAQPOKQSJNAMNTOT
         49923          7
      7874 LBQPRRDVXUQS
         49988          5
      9032 OPVFSDKNZ
         49988          2
      9329 XRNKOKCCUORV
         49934          9
      9437 WQDWBTNEKJJYFL
         49894         10
      9979 YLXJXJPRKKBXAQIE
         49927          4

11 rows selected.

Elapsed: 00:00:00.03

SQL> select e1.id,e1.name,e1.salary,e1.deptid from emp e1,(select max(salary) max_sal,deptid from emp
  2  group by deptid
  3  ) e2
  4  where e1.deptid=e2.deptid and e1.salary=e2.max_sal
  5  order by e1.id;

        ID NAME
        SALARY     DEPTID
---------- ------------------------------------------------------------------------------------------------------------------------ ---------- ----------
      1073 UGJURPQV
         49993          3
      1356 UPHXQELWTDBLFYRBSHSF
         49991          1
      2946 SGSJBCABNNQXGORWPO
         49924          6
      3111 PQMATSYLQNZR
         49848          8
      3516 CBXGAVDIHITQ
         49944          0
      6218 LPZAQPOKQSJNAMNTOT
         49923          7
      7874 LBQPRRDVXUQS
         49988          5
      9032 OPVFSDKNZ
         49988          2
      9329 XRNKOKCCUORV
         49934          9
      9437 WQDWBTNEKJJYFL
         49894         10
      9979 YLXJXJPRKKBXAQIE
         49927          4

11 rows selected.

Elapsed: 00:00:07.92

SQL> select id,name,salary,deptid from (select e.*,max(salary) over (partition by deptid) max_sal from emp e)
  2  where salary=max_sal
  3  order by id;

        ID NAME
        SALARY     DEPTID
---------- ------------------------------------------------------------------------------------------------------------------------ ---------- ----------
      1073 UGJURPQV
         49993          3
      1356 UPHXQELWTDBLFYRBSHSF
         49991          1
      2946 SGSJBCABNNQXGORWPO
         49924          6
      3111 PQMATSYLQNZR
         49848          8
      3516 CBXGAVDIHITQ
         49944          0
      6218 LPZAQPOKQSJNAMNTOT
         49923          7
      7874 LBQPRRDVXUQS
         49988          5
      9032 OPVFSDKNZ
         49988          2
      9329 XRNKOKCCUORV
         49934          9
      9437 WQDWBTNEKJJYFL
         49894         10
      9979 YLXJXJPRKKBXAQIE
         49927          4

11 rows selected.

Elapsed: 00:00:00.01

Execution plan comparison:

SQL> select a.id,a.name,a.salary,a.deptid from emp a
  2  where salary=(select max(salary) from emp b where a.deptid=b.deptid)
  3  order by a.id;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 1231226589

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |   127 |    41   (8)| 00:00:01 |
|   1 |  SORT ORDER BY        |         |     1 |   127 |    41   (8)| 00:00:01 |
|*  2 |   HASH JOIN           |         |     1 |   127 |    40   (5)| 00:00:01 |
|   3 |    VIEW               | VW_SQ_1 |  9121 |   231K|    21  (10)| 00:00:01 |
|   4 |     HASH GROUP BY     |         |  9121 |   231K|    21  (10)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| EMP     |  9121 |   231K|    19   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL  | EMP     |  9121 |   899K|    19   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SALARY"="MAX(SALARY)" AND "A"."DEPTID"="ITEM_1")

Note
-----
   - dynamic sampling used for this statement (level=2)
   
SQL> select e1.id,e1.name,e1.salary,e1.deptid from emp e1,(select max(salary) max_sal,deptid from emp
  2  group by deptid
  3  ) e2
  4  where e1.deptid=e2.deptid and e1.salary=e2.max_sal
  5  order by e1.id;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 962461943

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  7562K|  1002M|   641  (95)| 00:00:08 |
|*  1 |  FILTER              |      |       |       |            |          |
|   2 |   SORT GROUP BY      |      |  7562K|  1002M|   641  (95)| 00:00:08 |
|*  3 |    HASH JOIN         |      |  7562K|  1002M|    92  (59)| 00:00:02 |
|   4 |     TABLE ACCESS FULL| EMP  |  9121 |   231K|    19   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| EMP  |  9121 |  1006K|    19   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("E1"."SALARY"=MAX("SALARY"))
   3 - access("E1"."DEPTID"="DEPTID")

Note
-----
   - dynamic sampling used for this statement (level=2)
   
SQL> select id,name,salary,deptid from (select e.*,max(salary) over (partition by deptid) max_sal from emp e)
  2  where salary=max_sal
  3  order by id;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3418936035

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  9121 |  1015K|       |   471   (1)| 00:00:06 |
|   1 |  SORT ORDER BY       |      |  9121 |  1015K|  1168K|   471   (1)| 00:00:06 |
|*  2 |   VIEW               |      |  9121 |  1015K|       |   234   (1)| 00:00:03 |
|   3 |    WINDOW SORT       |      |  9121 |   899K|  1056K|   234   (1)| 00:00:03 |
|   4 |     TABLE ACCESS FULL| EMP  |  9121 |   899K|       |    19   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("SALARY"="MAX_SAL")

Note
-----
   - dynamic sampling used for this statement (level=2)

January 19, 2020

Reference: https://blog.csdn.net/paul_wei2008/article/details/19565509

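Addendum, 2020-01-20: below are the explain plans obtained on Oracle 12c, taken from the second run. The result is even more confusing: queries 1 and 2 now get the same plan shape and the same cost (40), while query 3 is still costed much higher (365). This again suggests that the explain plan should not be trusted on its own.

Oracle version: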
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
PL/SQL Release 12.2.0.1.0 - Production
"CORE 12.2.0.1.0 Production"
TNS for Linux: Version 12.2.0.1.0 - Production
NLSRTL Version 12.2.0.1.0 - Production

#1
EXPLAIN PLAN FOR
select a.id,a.name,a.salary,a.deptid from emp a
where salary=(select max(salary) from emp b where a.deptid=b.deptid)
order by a.id

Plan hash value: 1231226589

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |    11 |   605 |    40   (5)| 00:00:01 |
|   1 |  SORT ORDER BY        |         |    11 |   605 |    40   (5)| 00:00:01 |
|*  2 |   HASH JOIN           |         |    11 |   605 |    39   (3)| 00:00:01 |
|   3 |    VIEW               | VW_SQ_1 |    11 |   176 |    20   (5)| 00:00:01 |
|   4 |     HASH GROUP BY     |         |    11 |    88 |    20   (5)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| EMP     | 10000 | 80000 |    19   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL  | EMP     | 10000 |   380K|    19   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SALARY"="MAX(SALARY)" AND "A"."DEPTID"="ITEM_1")

#2
select e1.id,e1.name,e1.salary,e1.deptid from emp e1,(select max(salary) max_sal,deptid from emp
                group by deptid
                ) e2
where e1.deptid=e2.deptid and e1.salary=e2.max_sal
order by e1.id

Plan hash value: 2003893481

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |    11 |   605 |    40   (5)| 00:00:01 |
|   1 |  SORT ORDER BY        |      |    11 |   605 |    40   (5)| 00:00:01 |
|*  2 |   HASH JOIN           |      |    11 |   605 |    39   (3)| 00:00:01 |
|   3 |    VIEW               |      |    11 |   176 |    20   (5)| 00:00:01 |
|   4 |     HASH GROUP BY     |      |    11 |    88 |    20   (5)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| EMP  | 10000 | 80000 |    19   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL  | EMP  | 10000 |   380K|    19   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("E1"."DEPTID"="E2"."DEPTID" AND "E1"."SALARY"="E2"."MAX_SAL")

#3
select id,name,salary,deptid from (select e.*,max(salary) over (partition by deptid) max_sal from emp e)
where salary=max_sal
order by id

Plan hash value: 3418936035

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      | 10000 |  1035K|       |   365   (1)| 00:00:01 |
|   1 |  SORT ORDER BY       |      | 10000 |  1035K|  1192K|   365   (1)| 00:00:01 |
|*  2 |   VIEW               |      | 10000 |  1035K|       |   121   (1)| 00:00:01 |
|   3 |    WINDOW SORT       |      | 10000 |   380K|   520K|   121   (1)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP  | 10000 |   380K|       |    19   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("SALARY"="MAX_SAL")
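For readers who want to reproduce plans like these, a minimal sketch follows. Statement #1 in the listing is prefixed with EXPLAIN PLAN FOR, and DBMS_XPLAN.DISPLAY can then read the plan back from the plan table. How the author actually displayed the plans is not stated, so the second statement here is an assumption.

```sql
-- Ask the optimizer for a plan without executing the query.
explain plan for
select a.id,a.name,a.salary,a.deptid from emp a
where salary=(select max(salary) from emp b where a.deptid=b.deptid)
order by a.id;

-- Format and print the most recently explained statement from PLAN_TABLE.
select * from table(dbms_xplan.display);
```

Because the statement is never run, everything this prints is an estimate, which is the limitation the post is pointing at.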

 
