問題現象
一天周末,我還在家happy呢,突然值班群里傳來消息,說一個比較重要的系統出現批量慢的情況,而且聽說之前并不慢,今天開始慢了。當時心里一咯噔,一個是這系統之前出現過數據庫切換影響業務了,還被投訴了。另一個就是這個系統之前沒出現過性能問題,大多是集群類的問題,感覺沒有啥經驗可循。當時就有點方,趕緊讓值班的同事把信息發過來。
問題分析
一、看processlist
大多數mysql dba的習慣應該都是上來先看processlist吧,我也不例外,甭管啥問題或者不知道咋查的時候,上來先一通操作猛如虎,而且不停的刷processlist,旁邊的人就感覺你一直在操作,覺得你很專業。
看下這次的這個processlist,里面的SQL就是一個聯合子查詢的delete,看起來沒啥特殊,按照平常的思路接下來應該看執行計劃了。我突然注意到state的值是preparing,這個還真是沒怎么見過。

簡單查了下這個狀態的意思,大概是說這是sql語句處于查詢優化過程,持續時間較長。
二、看執行計劃
我們看下執行計劃,按照我以往淺薄的思路,key字段都用了主鍵了,肯定沒問題啊。但是仔細看id=2的dependent subquery,感覺挺奇怪,仔細度娘了一下,發現這玩意威力無窮啊。

子查詢的類型是DEPENDENT SUBQUERY,表示這個查詢是子查詢的第一個查詢,外部的查詢會反復去進行這個操作。即在這條語句中,外部查詢結果集(數據量為1771579)的每一條結果都將執行一次子查詢,進行1771597次匹配,若數據量較大的情況,即使加了索引也會使效率低下。
我又從慢日志里查了下以前的執行情況,以前這個語句也執行了,但是沒這么慢,而且慢日志里記錄的exam rows是逐漸增長的,看樣子應該是隨著表的數據量增長,SQL性能是逐漸下降的,可能是之前值班同事并沒有注意,這次快跑不完了才注意到。
三、改寫SQL
對于這種子查詢,可以改寫成聯合delete,優化后通過執行計劃看到查詢變為普通查詢,掃描數據量大幅度降低,且都應用了索引,效率有很大提升。
explain delete test1 from test1 ,test2 where test1.asso_code=test2.prd_code and test2.date_type='2' and test1.real_prd_code='HWYXCYQZQUSD1YB' and (test2.trans_date>20991231 or test2.trans_date <20211020);
+----+-------------+------------+------------+------+-----------------------------------------+-----------------+---------+------------------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+-----------------------------------------+-----------------+---------+------------------------------+------+----------+-------------+
| 1 | SIMPLE | test1 | NULL | ref | PRIMARY,idx_2,idx_1 | idx_1 | 98 | const | 2 | 100.00 | Using index |
| 1 | DELETE | test2 | NULL | ref | PRIMARY,idx_transday | PRIMARY | 67 | const,tbproduct.prd_code | 388 | 36.12 | Using where |
+----+-------------+------------+------------+------+-----------------------------------------+-----------------+---------+------------------------------+------+----------+-------------+
5.7 VS 8.0
這個問題到這算是解決了,我往下做了些實驗,發現同樣的子查詢在做select和delete的時候MySQL的優化方式是不一樣的,而且在5.7和8.0上的表現也是不一樣的。
我們可以看到在5.7上,同樣的子查詢語句,delete走的是dependent subquery,而select被改寫成了join。
mysql5.7:
mysql> explain delete from test1 where id in (select id from test2 where date>'2010-10-10');
+----+--------------------+-------+------------+-----------------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+-----------------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | DELETE | test1 | NULL | ALL | NULL | NULL | NULL | NULL | 523328 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | test2 | NULL | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | 33.33 | Using where |
+----+--------------------+-------+------------+-----------------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set (0.01 sec)
mysql> explain select * from test1 where id in (select id from test2 where date>'2010-10-10');
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
| 1 | SIMPLE | test2 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 73932 | 33.33 | Using where |
| 1 | SIMPLE | test1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.test2.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
mysql> show warnings;
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `test`.`test1`.`id` AS `id`,`test`.`test1`.`name` AS `name`,`test`.`test1`.`address` AS `address` from `test`.`test2` join `test`.`test1` where ((`test`.`test1`.`id` = `test`.`test2`.`id`) and (`test`.`test2`.`date` > '2010-10-10')) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select count(*) from test1;
+----------+
| count(*) |
+----------+
| 524288 |
+----------+
1 row in set (0.07 sec)
mysql> select count(*) from test2;
+----------+
| count(*) |
+----------+
| 73728 |
+----------+
1 row in set (0.01 sec)
我們看到在8.0上不論delete還是select都是走了join的方式,這種方式效率就要高的多,看起來8.0相比于5.7在執行計劃選擇上還是高效了不少。
MySQL 8.0:
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test1 |
| test2 |
+----------------+
2 rows in set (0.00 sec)
mysql> select count(*) from test1;
+----------+
| count(*) |
+----------+
| 524288 |
+----------+
1 row in set (0.03 sec)
mysql> select count(*) from test2;
+----------+
| count(*) |
+----------+
| 73728 |
+----------+
1 row in set (0.00 sec)
mysql> explain delete from test1 where id in (select id from test2 where date>'2010-10-10');
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
| 1 | SIMPLE | test2 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 73932 | 33.33 | Using where |
| 1 | DELETE | test1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.test2.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
mysql> show warnings;
+-------+------+---------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+---------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | delete from `test`.`test1` where ((`test`.`test1`.`id` = `test`.`test2`.`id`) and (`test`.`test2`.`date` > '2010-10-10')) |
+-------+------+---------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> explain select * from test1 where id in (select id from test2 where date>'2010-10-10');
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
| 1 | SIMPLE | test2 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 73932 | 33.33 | Using where |
| 1 | SIMPLE | test1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.test2.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+-------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings;
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `test`.`test1`.`id` AS `id`,`test`.`test1`.`name` AS `name`,`test`.`test1`.`address` AS `address` from `test`.`test2` join `test`.`test1` where ((`test`.`test1`.`id` = `test`.`test2`.`id`) and (`test`.`test2`.`date` > '2010-10-10')) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
總結
1、在MySQL中,尤其是8.0以下,這個dependent subquery還是危害挺大,如果有性能問題,而且SQL的執行計劃中有這個,那么大概率就是他的原因。
2、至于為什么delete和select在優化上有不一樣,這塊可能需要源碼的大佬幫忙解釋下了。




