Description:
Hello, I found a hang bug in 8.0.35-cluster version of MYSQL cluster.
The detail is as follow.
OS version and name:
Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Linux eb1f47b08982 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 x86_64 x86_64 GNU/Linux
I am executing the following sql statement on a node:
Poc 1:
```
select
ref_0.column4 as c0,
(select Proxied_user from mysql.proxies_priv limit 1 offset 1)
as c1,
ref_0.column2 as c2,
ref_0.column4 as c3,
ref_0.column2 as c4,
ref_0.column1 as c5,
ref_0.column4 as c6,
ref_0.column3 as c7,
ref_0.column2 as c8,
case when (ref_0.column2 is NULL)
or ((EXISTS (
select
ref_1.column1 as c0,
ref_1.column3 as c1
from
mytest90.test8 as ref_1
where ((((76 is NULL)
or (true))
and (true))
or (ref_0.column3 is NULL))
or (EXISTS (
select
ref_1.column4 as c0,
ref_0.column4 as c1,
ref_2.column4 as c2,
ref_1.column6 as c3,
(select first_seen from sys.statements_with_sorting limit 1 offset 4)
as c4,
ref_2.column1 as c5,
ref_0.column3 as c6
from
mytest90.test7 as ref_2
where false
limit 182))
limit 67)) or ((EXISTS (
select
ref_3.column5 as c0,
ref_3.column60 as c1
from
mytest90.test9 as ref_3
where EXISTS (
select
ref_4.column1 as c0,
ref_4.column3 as c1,
ref_4.column3 as c2,
ref_3.column5 as c3,
ref_0.column2 as c4,
(select latency from sys.user_summary_by_file_io_type limit 1 offset 4)
as c5,
ref_3.column4 as c6,
ref_0.column4 as c7
from
mytest90.test1 as ref_4
where ref_3.column60 is not NULL
limit 99)
limit 99))
and (ref_0.column1 is not NULL))) then ref_0.column4 else ref_0.column4 end
as c9
from
mytest90.test1 as ref_0
where ref_0.column2 is NULL
limit 75;
```
At the same time, I'm executing this sql statement at the same time on another node, note that the two sql statements are almost simultaneous:
Poc 2:
```
ALTER TABLE mytest90.test2 ALTER COLUMN column2 DROP DEFAULT;
```
At this point I realized that my Poc 1 finished executing normally, but Poc 2 got stuck and would keep the current mysql window stuck in the main until I forced it to exit, and mysql's own timeout mechanism didn't work to terminate Poc 2.
By this point the HANG has occurred, but it's not over yet. When I try to execute the following sql statement after the node that just executed Poc 1 has finished executing, the stuck master also occurs and the mysql window gets stuck, and I can't deal with the problem on my own until I force quit.
Poc 3:
```
delete from mytest90.test2
where
EXISTS (
select
ref_0.column8 as c0,
mytest90.test2.column66 as c1
from
mytest90.test9 as ref_0 right join mytest90.test8 as ref_1 on (true) left join mytest90.test5 as ref_2
right join mytest90.test7 as ref_3
on (ref_2.column2 = ref_3.column1 )
right join mytest90.test3 as ref_4
on (ref_3.column1 is not NULL)
inner join mytest90.test5 as ref_5
inner join mytest90.test6 as ref_6
on (true)
on (ref_2.column2 = ref_5.column2 )
inner join mytest90.test7 as ref_7
inner join mytest90.test0 as ref_8
on (ref_7.column5 = ref_8.column3 )
on ((mytest90.test2.column2 is NULL)
and (((true)
and (ref_2.column2 is NULL))
or (false)))
on ((ref_0.column8 is not NULL) or (mytest90.test2.column3 is not NULL))
where (true)
or (((mytest90.test2.column3 is NULL)
and (((EXISTS (
select
mytest90.test2.column66 as c0
from
mytest90.test8 as ref_9
where (select HOST from mysql.global_grants limit 1 offset 4)
is NULL))
or (true))
or (true)))
and ((false)
and ((EXISTS (
select
ref_6.column1 as c0,
ref_4.column2 as c1,
ref_0.column6 as c2,
ref_1.column3 as c3,
mytest90.test2.column1 as c4
from
mytest90.test5 as ref_10
where (false)
and (true)))
and (false))))
limit 97);
```
Architecture Information:
'''
[NDBD DEFAULT]
NoOfReplicas =2
DataMemory = 512M
IndexMemory = 64M
[NDB_MGMD]
NodeId=1
hostname =192.172.10.8
datadir =/var/lib/mysql-cluster
[NDBD]
NodeId =2
hostname =192.172.10.9
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =3
hostname =192.172.10.10
datadir =/usr/local/mysql-cluster/data
NodeGroup=1
[NDBD]
NodeId =4
hostname =192.172.10.11
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =5
hostname =192.172.10.12
datadir =/usr/local/mysql-cluster/data
NodeGroup=1
[mysqld]
NodeId =6
hostname =192.172.10.9
[mysqld]
NodeId =7
hostname =192.172.10.10
[mysqld]
NodeId =8
hostname =192.172.10.11
[mysqld]
NodeId =9
hostname =192.172.10.12
'''
Attempted and successfully reproduced!
How to repeat:
Simply execute that PoC to repeat it.
Suggested fix:
seems like a dead loop