Bug #114758 hang bug in mysql NDB cluster
Submitted: 24 Apr 2024 8:24 Modified: 25 Apr 2024 1:41
Reporter: CunDi Fang Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:8.0.35-cluster MySQL Cluster Community S OS:Any (20.04)
Assigned to: MySQL Verification Team CPU Architecture:Any

[24 Apr 2024 8:24] CunDi Fang
Description:
Hello, I found a hang bug in 8.0.35-cluster version of MYSQL cluster. 
The detail is as follow.

OS version and name:
Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Linux eb1f47b08982 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 x86_64 x86_64 GNU/Linux

I am executing the following sql statement on a node:

Poc 1:
```
select
  ref_0.column4 as c0,
  (select Proxied_user from mysql.proxies_priv limit 1 offset 1)
     as c1,
  ref_0.column2 as c2,
  ref_0.column4 as c3,
  ref_0.column2 as c4,
  ref_0.column1 as c5,
  ref_0.column4 as c6,
  ref_0.column3 as c7,
  ref_0.column2 as c8,
  case when (ref_0.column2 is NULL)
      or ((EXISTS (
          select
              ref_1.column1 as c0,
              ref_1.column3 as c1
            from
              mytest90.test8 as ref_1
            where ((((76 is NULL)
                    or (true))
                  and (true))
                or (ref_0.column3 is NULL))
              or (EXISTS (
                select
                    ref_1.column4 as c0,
                    ref_0.column4 as c1,
                    ref_2.column4 as c2,
                    ref_1.column6 as c3,
                    (select first_seen from sys.statements_with_sorting limit 1 offset 4)
                       as c4,
                    ref_2.column1 as c5,
                    ref_0.column3 as c6
                  from
                    mytest90.test7 as ref_2
                  where false
                  limit 182))
            limit 67))        or ((EXISTS (
            select  
                ref_3.column5 as c0, 
                ref_3.column60 as c1
              from
                mytest90.test9 as ref_3
              where EXISTS (
                select
                    ref_4.column1 as c0,
                    ref_4.column3 as c1,
                    ref_4.column3 as c2,
                    ref_3.column5 as c3,
                    ref_0.column2 as c4,
                    (select latency from sys.user_summary_by_file_io_type limit 1 offset 4)
                       as c5,
                    ref_3.column4 as c6,
                    ref_0.column4 as c7
                  from
                    mytest90.test1 as ref_4
                  where ref_3.column60 is not NULL
                  limit 99)
              limit 99))
          and (ref_0.column1 is not NULL))) then ref_0.column4 else ref_0.column4 end
     as c9
from
  mytest90.test1 as ref_0
where ref_0.column2 is NULL
limit 75;
```

At the same time, I'm executing this sql statement at the same time on another node, note that the two sql statements are almost simultaneous:

Poc 2:
```
ALTER TABLE mytest90.test2 ALTER COLUMN column2 DROP DEFAULT;
```

At this point I realized that my Poc 1 finished executing normally, but Poc 2 got stuck and would keep the current mysql window stuck in the main until I forced it to exit, and mysql's own timeout mechanism didn't work to terminate Poc 2.

By this point the HANG has occurred, but it's not over yet. When I try to execute the following sql statement after the node that just executed Poc 1 has finished executing, the stuck master also occurs and the mysql window gets stuck, and I can't deal with the problem on my own until I force quit.

Poc 3:
```
delete from mytest90.test2
where
EXISTS (
  select
      ref_0.column8 as c0,
      mytest90.test2.column66 as c1
    from
      mytest90.test9 as ref_0          right join mytest90.test8 as ref_1          on (true)        left join mytest90.test5 as ref_2
                right join mytest90.test7 as ref_3
                on (ref_2.column2 = ref_3.column1 )
              right join mytest90.test3 as ref_4
              on (ref_3.column1 is not NULL)
            inner join mytest90.test5 as ref_5
              inner join mytest90.test6 as ref_6
              on (true)
            on (ref_2.column2 = ref_5.column2 )
          inner join mytest90.test7 as ref_7
            inner join mytest90.test0 as ref_8
            on (ref_7.column5 = ref_8.column3 )
          on ((mytest90.test2.column2 is NULL)
              and (((true)
                  and (ref_2.column2 is NULL))
                or (false)))
        on ((ref_0.column8 is not NULL)            or (mytest90.test2.column3 is not NULL))
    where (true)
      or (((mytest90.test2.column3 is NULL)
          and (((EXISTS (
                select
                    mytest90.test2.column66 as c0
                  from
                    mytest90.test8 as ref_9
                  where (select HOST from mysql.global_grants limit 1 offset 4)
                       is NULL))
              or (true))
            or (true)))
        and ((false)
          and ((EXISTS (
              select
                  ref_6.column1 as c0,
                  ref_4.column2 as c1,
                  ref_0.column6 as c2,
                  ref_1.column3 as c3,
                  mytest90.test2.column1 as c4
                from
                  mytest90.test5 as ref_10
                where (false)
                  and (true)))
            and (false))))
    limit 97);
```

Architecture Information:
'''
[NDBD DEFAULT]
NoOfReplicas =2
DataMemory = 512M
IndexMemory = 64M

[NDB_MGMD]
NodeId=1
hostname =192.172.10.8
datadir =/var/lib/mysql-cluster

[NDBD]
NodeId =2
hostname =192.172.10.9
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =3
hostname =192.172.10.10
datadir =/usr/local/mysql-cluster/data
NodeGroup=1
[NDBD]
NodeId =4
hostname =192.172.10.11
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =5
hostname =192.172.10.12
datadir =/usr/local/mysql-cluster/data
NodeGroup=1

[mysqld]
NodeId =6
hostname =192.172.10.9
[mysqld]
NodeId =7
hostname =192.172.10.10
[mysqld]
NodeId =8
hostname =192.172.10.11
[mysqld]
NodeId =9
hostname =192.172.10.12
'''

Attempted and successfully reproduced!

How to repeat:
Simply execute that PoC to repeat it.

Suggested fix:
seems like a dead loop
[24 Apr 2024 11:31] MySQL Verification Team
Duplicate of Bug #114464
[25 Apr 2024 1:41] CunDi Fang
Thanks for your reply. But I feel that he is not the same as bug #114464? Because he is not stuck because of the select statement, his select statement actually executes normally, it's the alter statement and the deletion in the back order that are stuck. is the root cause of this bug the same as bug #114464?