当前位置:网站首页>【线上死锁分析】由index_merge引发的死锁事件
【线上死锁分析】由index_merge引发的死锁事件
2022-07-26 09:13:00 【new hilbert()】
1.事情背景
背景由于更换新的短信供应商,同事之前可能对这块业务不太熟,原本是回执ID(recordId)一个手机号一个,但是同事接的时候将这个批量发送接口只设置了一个recordId,导致了多个手机号共用了一个recordId。
2.线上deadLock 事件发生
### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
- 先执行sql 语句查看执行计划,为什么存在锁竞争
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'xxx' AND record_id = 'xxxx';
- 找DBA同学查找相关的死锁日志
死锁概念: 死锁是指两个或者多个事务在同一资源上相互占用,并请求锁定对方占用的资源,从而导致恶性循环的现象。当多个事务试图以不同的顺序锁定资源时,就可能会产生死锁
线上死锁日志如下: 平常很少开启长事务,并且为了减少死锁概率,线上的隔离级别为RC。(间隙锁只有在rr级别下才有,rc 级别是行锁)
------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-03-23 10:19:11 0x7f8f9bf38700
*** (1) TRANSACTION:
TRANSACTION 2912167906, ACTIVE 0 sec fetching rows
mysql tables in use 3, locked 3
LOCK WAIT 48 lock struct(s), heap size 8400, 4 row lock(s)
MySQL thread id 16079756, OS thread handle 140261855196928, query id 2624577147
sms_user updating
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'aaaaa' AND record_id = 'xxxxx'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 56 page no 1943341 n bits 96 index PRIMARY of table `sms`.`sms_send` trx id 2912167906 lock_mode X locks rec but not gap waiting
Record lock, heap no 12 PHYSICAL RECORD: n_fields 48; compact format; info bits 0
0: len 8; hex 800000003d238639; asc =# 9;;
1: len 6; hex 0000ad941aeb; asc ;;
2: len 7; hex d8000900200165; asc e;;
3: len 30; hex 36386534343764632d353437372d343339302d613133362d653731363234; asc 68e447dc-5477-4390-a136-e71624; (total 36 bytes);
4: len 30; hex 323637353535633865363164653361383935613735663735633434663732; asc 267555c8e61de3a895a75f75c44f72; (total 32 bytes);
5: len 30; hex 30383733636431302d303166382d346131342d613837342d373634353535; asc 0873cd10-01f8-4a14-a874-764555; (total 45 bytes);
6: len 18; hex 6f70732d776f726b2d6f726465722d737663; asc ops-work-order-svc;;
7: len 16; hex 41454a50385355504b41524f39595850; asc AEJP8SUPKARO9YXP;;
8: len 9; hex e8b4a7e68b89e68b89; asc ;;
9: len 19; hex 54435f32303232303332323134343332335f37; asc TC_20220322144323_7;;
10: len 30; hex 7b22e8b5a0e59381223a22e5a5bde7a4bc222c22e695b0e9878f223a2232; asc {" ":" "," ":"2; (total 56 bytes);
11: len 30; hex e38090e8b4a7e68b89e68b89e38091e5a5bde6b688e681afefbc81e68ea8; asc ; (total 245 bytes);
12: len 11; hex 3135333731373935303939; asc 15371795099;;
13: len 6; hex 6a696e6c6f75; asc jinlou;;
14: len 4; hex 80000000; asc ;;
15: len 0; hex ; asc ;;
16: len 0; hex ; asc ;;
17: len 9; hex e88081e5b8a6e696b0; asc ;;
18: len 2; hex 3336; asc 36;;
19: len 0; hex ; asc ;;
20: len 0; hex ; asc ;;
21: len 6; hex e890a5e99480; asc ;;
22: len 1; hex 34; asc 4;;
23: len 1; hex 33; asc 3;;
24: len 1; hex 32; asc 2;;
25: len 6; hex 686c6c797831; asc hllyx1;;
26: len 4; hex 80000002; asc ;;
27: len 4; hex 80000001; asc ;;
28: len 4; hex 80000000; asc ;;
29: len 0; hex ; asc ;;
30: len 18; hex 6f70732d776f726b2d6f726465722d737663; asc ops-work-order-svc;;
31: len 0; hex ; asc ;;
32: len 0; hex ; asc ;;
33: len 0; hex ; asc ;;
34: len 0; hex ; asc ;;
35: len 0; hex ; asc ;;
36: len 0; hex ; asc ;;
37: len 30; hex e69caae79fa5efbc8ce99c80e8a1a5e58585e99499e8afafe68f8fe8bfb0; asc ; (total 33 bytes);
38: len 6; hex e5908ce59f8e; asc ;;
39: len 6; hex e58fb8e69cba; asc ;;
40: len 0; hex ; asc ;;
41: len 4; hex 813489a3; asc 4 ;;
42: len 5; hex 99ac6ea4c4; asc n ;;
43: len 5; hex 99ac6ea4c4; asc n ;;
44: len 0; hex ; asc ;;
45: len 14; hex 3137373234343734353135343536; asc 17724474515456;;
46: len 0; hex ; asc ;;
47: len 24; hex 373261623566393839373439663331393839333861333236; asc 72ab5f989749f3198938a326;;
*** (2) TRANSACTION:
TRANSACTION 2912167907, ACTIVE 0 sec fetching rows
mysql tables in use 3, locked 3
6 lock struct(s), heap size 1136, 3 row lock(s)
MySQL thread id 16079884, OS thread handle 140254773479168, query id 2624577148
sms_user updating
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'bbbbb' AND record_id = 'xxxxx'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 56 page no 1943341 n bits 96 index PRIMARY of table `sms`.`sms_send` trx id 2912167907 lock_mode X locks rec but not gap
Record lock, heap no 12 PHYSICAL RECORD: n_fields 48; compact format; info bits 0
0: len 8; hex 800000003d238639; asc =# 9;;
1: len 6; hex 0000ad941aeb; asc ;;
2: len 7; hex d8000900200165; asc e;;
3: len 30; hex 36386534343764632d353437372d343339302d613133362d653731363234; asc 68e447dc-5477-4390-a136-e71624; (total 36 bytes);
4: len 30; hex 323637353535633865363164653361383935613735663735633434663732; asc 267555c8e61de3a895a75f75c44f72; (total 32 bytes);
5: len 30; hex 30383733636431302d303166382d346131342d613837342d373634353535; asc 0873cd10-01f8-4a14-a874-764555; (total 45 bytes);
6: len 18; hex 6f70732d776f726b2d6f726465722d737663; asc ops-work-order-svc;;
7: len 16; hex 41454a50385355504b41524f39595850; asc AEJP8SUPKARO9YXP;;
8: len 9; hex e8b4a7e68b89e68b89; asc ;;
9: len 19; hex 54435f32303232303332323134343332335f37; asc TC_20220322144323_7;;
10: len 30; hex 7b22e8b5a0e59381223a22e5a5bde7a4bc222c22e695b0e9878f223a2232; asc {" ":" "," ":"2; (total 56 bytes);
11: len 30; hex e38090e8b4a7e68b89e68b89e38091e5a5bde6b688e681afefbc81e68ea8; asc ; (total 245 bytes);
12: len 11; hex 3135333731373935303939; asc 15371795099;;
13: len 6; hex 6a696e6c6f75; asc jinlou;;
14: len 4; hex 80000000; asc ;;
15: len 0; hex ; asc ;;
16: len 0; hex ; asc ;;
17: len 9; hex e88081e5b8a6e696b0; asc ;;
18: len 2; hex 3336; asc 36;;
19: len 0; hex ; asc ;;
20: len 0; hex ; asc ;;
21: len 6; hex e890a5e99480; asc ;;
22: len 1; hex 34; asc 4;;
23: len 1; hex 33; asc 3;;
24: len 1; hex 32; asc 2;;
25: len 6; hex 686c6c797831; asc hllyx1;;
26: len 4; hex 80000002; asc ;;
27: len 4; hex 80000001; asc ;;
28: len 4; hex 80000000; asc ;;
29: len 0; hex ; asc ;;
30: len 18; hex 6f70732d776f726b2d6f726465722d737663; asc ops-work-order-svc;;
31: len 0; hex ; asc ;;
32: len 0; hex ; asc ;;
33: len 0; hex ; asc ;;
34: len 0; hex ; asc ;;
35: len 0; hex ; asc ;;
36: len 0; hex ; asc ;;
37: len 30; hex e69caae79fa5efbc8ce99c80e8a1a5e58585e99499e8afafe68f8fe8bfb0; asc ; (total 33 bytes);
38: len 6; hex e5908ce59f8e; asc ;;
39: len 6; hex e58fb8e69cba; asc ;;
40: len 0; hex ; asc ;;
41: len 4; hex 813489a3; asc 4 ;;
42: len 5; hex 99ac6ea4c4; asc n ;;
43: len 5; hex 99ac6ea4c4; asc n ;;
44: len 0; hex ; asc ;;
45: len 14; hex 3137373234343734353135343536; asc 17724474515456;;
46: len 0; hex ; asc ;;
47: len 24; hex 373261623566393839373439663331393839333861333236; asc 72ab5f989749f3198938a326;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 56 page no 1307492 n bits 344 index ix_record_id of table `sms`.`sms_send` trx id 2912167907 lock_mode X locks rec but not gap waiting
Record lock, heap no 18 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 30; hex 30383733636431302d303166382d346131342d613837342d373634353535; asc 0873cd10-01f8-4a14-a874-764555; (total 45 bytes);
1: len 8; hex 800000003d238639; asc =# 9;;
*** WE ROLL BACK TRANSACTION (2)
3. 日志分析
事务1: 执行语句
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'aaaaa' AND record_id = 'xxxxx'
关键信息:
- locked 3 : 表示当前事务持有的行记录锁/gap 锁的个数为3
- WAITING FOR THIS LOCK TO BE GRANTED: 表示事务1 正在等待 record lock的授权,不是间隙锁,因为事务隔离级别已经是RC。
RECORD LOCKS space id 56 page no 1943341 n bits 96 index PRIMARY of table
sms
.sms_send
trx id 2912167906 lock_mode X locks rec but not gap waiting
事务2: 执行语句
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'bbbbb' AND record_id = 'xxxxx'
关键信息:
- locked 3 : 表示当前事务持有的行记录锁/gap 锁的个数为3
- HOLDS THE LOCK(S): 持有事务1现在所需要的锁
RECORD LOCKS space id 56 page no 1943341 n bits 96 index PRIMARY of table
sms
.sms_send
trx id 2912167907 lock_mode X locks rec but not gap
- WAITING FOR THIS LOCK TO BE GRANTED:等待ix_record_id 这个索引的释放。
RECORD LOCKS space id 56 page no 1307492 n bits 344 index ix_record_id of table
sms
.sms_send
trx id 2912167907 lock_mode X locks rec but not gap waiting
- 执行计划分析
重点关注:
- type:index_merge (对同一个表可以使用多个索引分别进行条件扫描)
- key: ix_phone_number, ix_record_id
- extra: Using intersect(ix_phone_number,ix_record_id); Using where (index intersect merge就是多个索引条件扫描得到的结果进行交集运算)
因为有多个索引条件,两个事务会根据索引的区分度高的优先执行。
事务1
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'aaaaa' AND record_id = 'xxxxx'
事务1 优先使用recordId的索引,它的区分度比phoneNumber 的高一些
事务2
UPDATE sms_send SET req_success = 2,err_msg = '状态成功' WHERE send_day = '20220323' AND phone_number = 'bbbbb' AND record_id = 'xxxxx'
事务2 优先使用phoneNumber的索引,它的区分度比recordId 的高一些
InnoDB行锁是通过给索引上的索引项加锁来实现的,非聚簇索引会找到主键索引块,然后对主键索引块进行加锁操作。
- 事务1给主键4 加锁成功之后,准备给主键 1加锁,发现被事务2占有
- 事务2给主键1 2 3 加锁成功之后,准备给xxxx 索引加锁,发现被事务1占有
- 事务1和事务2互相等待造成死锁。MySQL默认开启了死锁检测机制,当检测到死锁后会选择一个最小(锁定资源最少得事务)的事务进行回滚.
*** WE ROLL BACK TRANSACTION (2)
4.解决方案
- 将recordId变成唯一的,减少锁的范围
- 可以先查询出来是否存在,然后根据主键id 去 更新
死锁优化建议
- 尽可能让所有数据检索都通过索引来完成,避免无索引行锁升级为表锁。
- 合理设计索引,尽量缩小锁的范围 尽可能较少检索条件,避免间隙锁
- 尽量控制事务大小,减少锁定资源量和时间长度
- 尽可能低级别事务隔离(因为间隙锁只存在RR,RC不存在)
边栏推荐
- CF1481C Fence Painting
- 李沐d2l(四)---Softmax回归
- 2022年上海市安全员C证考试试题及模拟考试
- Error: cannot find module 'UMI' problem handling
- js闭包:函数和其词法环境的绑定
- Elastic APM安装和使用
- JVM command induction
- 公告 | FISCO BCOS v3.0-rc4发布,新增Max版,可支撑海量交易上链
- JDBC database connection pool (Druid Technology)
- [leetcode database 1050] actors and directors who have cooperated at least three times (simple question)
猜你喜欢
网络安全漫山遍野的高大上名词之后的攻防策略本质
Sending and receiving of C serialport
CF1481C Fence Painting
服务器内存故障预测居然可以这样做!
The Child and Binary Tree-多项式开根求逆
Li Mu D2L (V) -- multilayer perceptron
Li Mu D2L (IV) -- softmax regression
Grain College of all learning source code
CSDN Top1 "how does a Virgo procedural ape" become a blogger with millions of fans through writing?
李沐d2l(四)---Softmax回归
随机推荐
Error: cannot find module 'UMI' problem handling
HBuilderX 运行微信开发者工具 “Fail to open IDE“报错解决
聪明的美食家 C语言
《Datawhale熊猫书》出版了!
Uploading pictures on Alibaba cloud OSS
js闭包:函数和其词法环境的绑定
Pat grade a a1076 forwards on Weibo
语音聊天app源码——钠斯直播系统源码
[leetcode database 1050] actors and directors who have cooperated at least three times (simple question)
Grain College of all learning source code
jvm命令归纳
布隆过滤器
[use of final keyword]
JVM触发minor gc的条件
What is the difference between NFT and digital collections?
C# Serialport的发送和接收
zsh: command not found: nvm
JS closure: binding of functions to their lexical environment
(1) CTS tradefed test framework environment construction
PAT 甲级 A1034 Head of a Gang