MySQL中慢sql问题排查记录

最近线上报警提示一个update操作出现慢sql[2秒左右]，所以对该sql进行分析解决问题：
sql语句为：UPDATE test_order SET nstate = 5 WHERE (nstate = 2 and stime < 1589355360)

一、实验环境

1、MySQL版本

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.18    |
+-----------+
1 row in set (0.01 sec)

2、实验表建表语句

CREATE TABLE `test_order` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'xx',
  `type` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `cid` tinyint(1) unsigned NOT NULL COMMENT 'xx',
  `gid` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `isp` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `isv` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `uid` int(10) unsigned DEFAULT '0' COMMENT '用户UID',
  `ordersn` varchar(20) NOT NULL DEFAULT '' COMMENT 'xx',
  `address` varchar(1200) NOT NULL DEFAULT '' COMMENT 'xx',
  `aid` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `price` decimal(10,2) DEFAULT '0.00' COMMENT 'xx',
  `tprice` decimal(10,2) DEFAULT '0.00' COMMENT 'xx',
  `cprice` decimal(10,2) DEFAULT '0.00' COMMENT 'xx',
  `couid` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `inteid` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `postage` decimal(10,2) DEFAULT '0.00' COMMENT 'xx',
  `intro` varchar(120) DEFAULT '' COMMENT 'xx',
  `state` tinyint(1) unsigned NOT NULL COMMENT 'xx',
  `isold` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `ptime` int(10) unsigned DEFAULT '0' COMMENT 'xx',
  `ptype` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `paysn` varchar(45) NOT NULL COMMENT 'xx',
  `restate` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `issend` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `stime` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `extra` varchar(1024) NOT NULL COMMENT 'xx',
  `ip` varchar(32) NOT NULL DEFAULT '' COMMENT 'xx',
  `srcid` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'xx',
  `isvo` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `utime` int(10) unsigned DEFAULT '0' COMMENT '更新时间',
  `ctime` int(10) unsigned DEFAULT '0' COMMENT '创建时间',
  `ise` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `isch` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `pins` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `trsn` varchar(30) DEFAULT NULL COMMENT 'xx',
  `channel` varchar(16) DEFAULT NULL COMMENT 'xx',
  `ntype` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `nissend` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `nstate` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `ispay` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'xx',
  `scores` int(11) NOT NULL DEFAULT '0' COMMENT 'xx',
  `ispush` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'xx',
  `supid` int(11) NOT NULL DEFAULT '0' COMMENT 'xx',
  `specid` int(11) NOT NULL DEFAULT '0' COMMENT 'xx',
  `class` tinyint(1) unsigned NOT NULL DEFAULT '1' COMMENT 'xx',
  PRIMARY KEY (`id`),
  UNIQUE KEY `ordersn_unique` (`ordersn`),
  KEY `uid` (`uid`),
  KEY `ctime_index` (`ctime`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='订单表';

二、分析

一般分析我们遇到的一些慢sql都和索引不合理有关，但是在处理具体问题时，应该根据具体情况具体对待，并不是说你加个索引就ok了。
你需要分析业务，分析表数据，然后制定合理的索引去处理问题。要知道，在线上业务，牵一发而动全身，一定要慎之又慎。
1、表数据
拿到这个报警时，首先我去线上只读备库查询表数据，查到线上表数据大概有157万条，也不算少了。
然后，查看该表结构，看到对stime字段和nstate字段并未建立相关索引，此时可以基本确定，update时由于未使用到索引而进行了全表扫描导致慢查询了。
2、业务代码
这个时候，还是不能着急的去创建索引，我们应该去业务代码查看具体业务问题。
找到相应的业务代码，发现在业务上是一个定时脚本在跑订单状态数据，以更新相应订单状态。
经过分析，对该业务进行相应的优化，对于mysql的update操作将会更加的合理。
将该业务由stime<xxx 改为 stimexxx 即：由时间点前全取，改为时间段内取，每次取当前时间前40分钟内的数据。
并将脚本执行频率改为半小时执行一次，这样就保证业务的准确性，也保证了每次update不会更新大量数据，出现长时间锁表问题。
3、索引
分析完业务，此时就要真正的去关心索引问题了。将update语句改为相应的select结构语句为：
select * from test_order WHERE nstate = 2 and stime >= 1577664001 and stime < 1577665001
优化分为两种：
a、单索引
b、组合索引
我们知道，MySQL中的索引也是一个一个的B+树，虽然索引能极大的提升查询效率，但是每次的更新插入操作，在MySQL底层还需要维护索引树的相应更新，这是耗时耗力的。
所以对业务以及数据量进行分析，能建立单索引解决问题的，我们就不必要建立组合索引了。
对业务来说，是一个脚本更新状态的操作，所以业务相对简单，接下来就根据数据量进行一些实验，以验证索引的创建对性能的影响大小。

三、实验开始

1、数据量为50万左右

mysql> select count(*) from test_order;
+----------+
| count(*) |
+----------+
|   514147 |
+----------+
1 row in set (0.11 sec)

无索引时进行explain：

mysql> explain select * from test_order  WHERE nstate = 2 and stime >= 1577664001 and stime < 1577665001;
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 493543 |     1.11 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

-- 查询耗时：
991 rows in set (0.87 sec)

创建nsate单索引，并explain：

mysql> ALTER TABLE `test_order` ADD INDEX `idx_nstate`(`nstate`) USING BTREE;
Query OK, 0 rows affected (0.91 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select * from test_order  WHERE nstate = 2 and stime >= 1577664001 and stime < 1577665001;
+----+-------------+----------+------------+------+---------------+------------+---------+-------+--------+----------+-------------+
| id | select_type | table    | partitions | type | possible_keys | key        | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+----------+------------+------+---------------+------------+---------+-------+--------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ref  | idx_nstate    | idx_nstate | 1       | const | 127872 |    11.11 | Using where |
+----+-------------+----------+------------+------+---------------+------------+---------+-------+--------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

-- 查询耗时：
991 rows in set (0.57 sec)

创建stime单索引，并explain：【此时删除掉nstate单索引】

mysql> ALTER TABLE `test_order` ADD INDEX `idx_stime`(`stime`) USING BTREE;

mysql> explain select * from test_order  WHERE nstate = 2 and stime >= 1577664001 and stime < 1577665001;
+----+-------------+----------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------------------------+
| id | select_type | table    | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                                         |
+----+-------------+----------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------------------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_stime     | idx_stime | 4       | NULL | 5967 |    10.00 | Using index condition; Using where; Using MRR |
+----+-------------+----------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------------------------+
1 row in set, 1 warning (0.01 sec)

-- 查询耗时：
991 rows in set (0.04 sec)

创建stime_nstate组合索引，并explain：【删除掉nstaet和stime单索引】

mysql> ALTER TABLE `test_order` ADD INDEX `idx_stime`(`stime`) USING BTREE;

mysql> explain select * from test_order  WHERE nstate = 2 and stime >= 1577664001 and stime < 1577665001;
+----+-------------+----------+------------+-------+------------------+------------------+---------+------+------+----------+----------------------------------+
| id | select_type | table    | partitions | type  | possible_keys    | key              | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+----------+------------+-------+------------------+------------------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_stime_nstate | idx_stime_nstate | 5       | NULL | 5967 |    10.00 | Using index condition; Using MRR |
+----+-------------+----------+------------+-------+------------------+------------------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)

-- 查询耗时：
991 rows in set (0.01 sec)

分析：
1、naste不适合单独创建索引，因为他的离散度太小，建立索引意义不大，很容易导致全表扫描。
2、stime离散度很好，建立单索引能有效提升查询效率。
3、stime_nstate组合索引效率最高，比单stime效率能提升大概3-4倍查询效率。

额外说下Extra列中出现的Using MRR：
MRR全称为：Multi-Range Read【组合范围读】
这是MySQL对回表查询的一种优化方式，简单来说就是将二级索引中查到的主键进行排序，然后到聚簇索引上进行顺序读，也就是说将随机IO变为顺序IO，这样会极大的提高查询效率。

再额外说下key_len列中的数字：
看下字段设置：
nstate为tinyint不为null，占1字节
stime为int不为null，占4字节
其中tinyint和int如果允许为null，则会额外加1字节。
key_len对组合索引的使用情况有很大的参考意义。

接下来继续增加表数据量为100万，160万，220万进行相应的测试：
结果显示
无索引的情况下，where查询越来越慢；
nstate单索引，where查询效率虽有提升，但提升不大，每次扫描的行数据相当大；
stime单索引，where查询效率稳定在0.03秒左右
stime_nstate组合索引，where查询效率稳定在0.01秒左右

当然了，测试的数据量不够多，但是可以简单的得出一个结论就是：
stime单索引和stime_nstate组合索引效率提升明显，且组合索引是单索引效率的3倍左右。

最终我选择了建立单stime索引。单索引已经可以提升效率，且业务中只在定时脚本使用该索引，而且该表结构比较复杂，已有一些索引结构，不再建立组合索引影响更新删除。

四、总结

总的来说，优化一条sql语句需要考虑数据量，具体业务以及具体表结构。
需要注意的一点是，建立索引要充分考虑字段的离散度，建立组合索引要把离散度高的尽量放在前面。
对于业务来说，也可能要配合进行相应的调整以便更好的执行sql语句。
当然了，这个处理还是比较简单的一个sql，没有很大的参考性，就记录下来供大家看看吧。