用了那么多年MySQL不知道Explain？怪不得性能那么差！

liang1234_ 2019-06-21

展开全文

加个“星标”，不忘签到哦

转载自公众号：yangyidba

关注我，回复口令获取可获取独家整理的学习资料：
- 001 ：领取《Spring Boot基础教程》
- 002 ：领取《Spring Cloud基础教程》
- 华山：领取最新阿里巴巴《Java开发规范1.5》

一简介

性能优化是一个开发或者dba不可少的工作内容，工欲善其事必先利其器，本文介绍一个辅助我们查看sql执行计划是否优化的工具，通过explain的结果，我们可以确定sql是否利用正确的索引。

二介绍

2.1 测试环境

MySQL 5.7

create table a(
id bigint(20) NOT NULL AUTO_INCREMENT,
name varchar(50) NOT NULL DEFAULT '',
age INT(11) DEFAULT 0,
primary key (id),
key idx_name (name)
) engine = innodb default charset= utf8;
insert into a (name, age) values('yy', 11);
insert into a (name, age) values('xx', 25);
insert into a (name, age) values('yz', 23);
insert into a (name, age) values('zhangcan', 32);
insert into a (name, age) values('lisi', 18);
insert into a (name, age) values('boshi', 62);
insert into a (name, age) values('taisen', 52);
insert into a (name, age) values('liuxiang', 32);
insert into a (name, age) values('malong', 23);
insert into a (name, age) values('jingtian', 28);
create table b(
id bigint(20) NOT NULL AUTO_INCREMENT,
sid int not null default 0,
name varchar(50) NOT NULL DEFAULT '',
score INT(11) DEFAULT 0,
primary key (id),
key idx_sid(sid),
key idx_name (name)
) engine = innodb default charset= utf8;
insert into b (sid,name, score) values(1,'yy', 99);
insert into b (sid,name, score) values(1,'yy', 99);
insert into b (sid,name, score) values(1,'yy', 99);
insert into b (sid,name, score) values(2,'xx', 95);
insert into b (sid,name, score) values(2,'xx', 95);
insert into b (sid,name, score) values(3,'yz', 93);
insert into b (sid,name, score) values(3,'yz', 93);
insert into b (sid,name, score) values(4,'zhangcan', 90);
insert into b (sid,name, score) values(5,'lisi', 88);
insert into b (sid,name, score) values(5,'lisi', 80);
insert into b (sid,name, score) values(5,'lisi', 78);
insert into b (sid,name, score) values(6,'boshi', 83);
insert into b (sid,name, score) values(6,'boshi', 80);
insert into b (sid,name, score) values(6,'boshi', 92);
insert into b (sid,name, score) values(7,'taisen', 85);
insert into b (sid,name, score) values(8,'liuxiang', 81);
insert into b (sid,name, score) values(9,'malong', 92);
insert into b (sid,name, score) values(10,'jingtian', 78);
insert into b (sid,name, score) values(10,'jingtian', 90);
insert into b (sid,name, score) values(10,'jingtian', 88);
insert into b (sid,name, score) values(10,'jingtian', 93);

2.2 结果介绍

执行 explian 的结果如下:

test >explain select * from  a where id=3 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

从上面的输出我们可以看到，分别是id、type、tabl、selecttype、possiblekeys、key、key_len、ref、rows、Extra。本文主要以select语句为例讲解 explian的输出。

三解读

3.1 id

查询语句的序号或者说是标识符，每个查询语句包括子查询都会分配一个id,表示查询中执行select子句或者操作表的顺序，可能有如下几种情况

1 id值相同

id 值相同一般出现在多表关联的场景，访问表的顺序是从上到下。

两个id 都为1，先访问b表然后访问a表。

2 id值不同

id 值不同的情况，从大到小执行，值越大越先开始执行或者被访问。

从结果来看，id为2 那一行的子查询先被执行。然后再去访问id=1 中a表。

思考题 如果 a.id in (select sid from b where id=10); explai的结果会是什么样呢？

3 id 包含了相同和不同的情况。

该情况一般是现有2个表或者子查询和表join ，然后在和第三个表关联查询。比如

EXPLAIN SELECT t2.* FROM(SELECT t3.id FROM t3 WHERE t3.other_column = '') s1,t2 WHERE s1.id = t2.id;
---- ------------- ------------ -------- --------------- --------- --------- ------- ------ -------------
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---- ------------- ------------ -------- --------------- --------- --------- ------- ------ -------------
| 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | t2 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 2 | DERIVED | t3 | ALL | NULL | NULL | NULL | NULL | 5 | Using where |
---- ------------- ------------ -------- --------------- --------- --------- ------- ------ -------------

分析结果可看出，先走id最大的2，也就是先走括号里面的查t3表的语句。走完查t3后，顺序执行，有一个，derived是衍生的意思，意思是在执行完t3查询后的s1虚表基础上，中的2，就是id为2的。最后执行的查t2表。

5.7的优化器针对子查询做了很多优化，我自己没有模拟出来场景3，故使用网上的例子。

3.2 select_type（数据读取操作的类型）

常见的有如下6种:SIMPLE、PRIMARY、SUBQUERY、DERIVED、UNION、UNION RESULT，主要是告诉我们查询的类型是普通查询、联合查询、子查询等复杂的查询。

SIMPLE:最简单的查询，查询中不包含子查询或者UNION。

PRIMARY:查询中若包含任何复杂的子查询，最外层查询则被标记为PRIMARY，也就是最后被执行的语句。

SUBQUERY:在SELECT from 或者WHERE列表中包含了子查询

DERIVED:在FROM列表中包含的子查询被标记为DERIVED（衍生）MySQL会递归执行这些子查询，把结果放在临时表里。

UNION:若第二个SELECT出现在UNION之后，则被标记为UNION；若UNION包含在FROM子句的子查询中，外层SELECT将被标记为DERIVED

UNION RESULT:两种UNION语句的合并。

DEPENDENT SUBQUERY: 子查询中的第一个 SELECT, 取决于外面的查询. 即子查询依赖于外层查询的结果. 出现该值的时候一定要特别注意，可能需要使用join的方式优化子查询。

3.3 table（查询涉及的表或衍生表）

其值为表名或者表的别名，表示访问哪一个表，

当from中有子查询的时候，表名是derivedN的形式，其中 N 指向子查询，也就是explain结果中的下一列
当有union result的时候，表名是union 1,2等的形式，1,2表示参与union的query id
注意 MySQL对待这些表和普通表一样，但是这些临时表是没有任何索引的。数据量大的情况下可能会有性能问题。

3.4 type (表示访问表的方式)

从最好到最差的结果依次如下:

system > const > eq_ref > ref > range > index > ALL

system: 表示结果集仅有一行。这是const联接类型的一个特例，表须是myisam或者memory存储引擎。如果是innodb存储引擎，type 显示为 const 。

const: 表示通过主键或者唯一键键查找数据时只匹配最多一行数据。

eq_ref: 该类型多出现在多表join场景，通过主键或者唯一键访问表.

对于前表b的每行记录, 都只能匹配到后表a的一行记录并且查询的比较操作通常是 =,查询效率较高.

ref: 此类型通常出现在sql使用非唯一或非主键索引, 或者是使用最左前缀规则索引的查询. 例如下面这个例子中, 就使用到了 ref 类型的查询:

range: 表示where条件使用索引范围查询, 通过索引字段范围获取表中部分数据记录. 这个类型通常出现在 <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, IN() 操作中.当 type 是 range 时,ref 字段为 NULL。

index: 表示全索引扫描(full index scan)，和 ALL 类型类似，只不过 ALL 类型是全表扫描, 而 index 类型则是扫描所有的索引记录, 而不扫描数据。

index 类型通常会出现在覆盖索引中,所要查询的数据直接在索引中就可以访问，而不用回表扫描数据. 此时Extra 字段会显示 Using index。
还有一种是全表扫描时通过索引顺序访问数据。此时并不会在Extra提示 using index。

ALL: 表示执行计划选择全表扫描，除非数据量极少比如100以内(别抬杠问'101可以吗',遇到过高并发count 1000行数据把数据库堵住的)，当执行计划出现type 为all 时，我们尽量通过修改索引的方式让查询利用索引。

3.5 possible_keys

possible_keys 表示 MySQL 在查询时，能够使用到的索引. 注意，即使有些索引在 possible_keys 中出现，但是并不表示此索引会真正地被 MySQL 使用到. MySQL 在查询时具体使用了哪些索引, 由 key 字段决定。

3.6 key

此字段是 MySQL 在当前查询时所真正使用到的索引。

3.7 key_len

key_len表示执行计划所选择的索引长度有多少字节，通常我们可借此判断联合索引有多少列被选择

在这里 key_len 大小的计算规则是:

一般地，key_len 等于索引列类型字节长度，例如int类型为4 bytes，bigint为8 bytes；
如果是字符串类型，还需要同时考虑字符集因素，例如:CHAR(30) UTF8则key_len至少是90 bytes；
若该列类型定义时允许NULL，其key_len还需要再加 1 bytes;
若该列类型为变长类型，例如 VARCHAR（TEXT\BLOB不允许整列创建索引，如果创建部分索引也被视为动态列类型），其key_len还需要再加 2 bytes;