Bug #114780 lost connection in MySQL NDB cluster
Submitted: 25 Apr 2024 8:48 Modified: 25 Apr 2024 12:15
Reporter: CunDi Fang Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:8.0.35-cluster MySQL Cluster Community S OS:Any (20.04)
Assigned to: MySQL Verification Team CPU Architecture:Any

[25 Apr 2024 8:48] CunDi Fang
Description:
Hello, I found a bug in 8.0.35-cluster version of MYSQL cluster. It will cause the current mysql service to crash.

It's very similar to bug #114773, and I doubt it's the same piece of code that's at fault.This bug looks like a similar bug to bug #114779.

The detail is as follow.

OS version and name:
Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Linux eb1f47b08982 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 x86_64 x86_64 GNU/Linux

Poc:

```
select
  subq_1.c3 as c0,
  subq_1.c0 as c1,
  case when false then subq_1.c0 else subq_1.c0 end
     as c2
from
  (select
        subq_0.c11 as c0,
        subq_0.c11 as c1,
        subq_0.c6 as c2,
        subq_0.c1 as c3
      from
        mytest90.test0 as ref_0
          right join (select
                ref_1.column8 as c0,
                ref_1.column1 as c1,
                ref_1.column8 as c2,
                ref_1.column2 as c3,
                ref_2.column1 as c4,
                ref_2.column1 as c5,
                ref_1.column8 as c6,
                ref_1.column1 as c7,
                ref_2.column1 as c8,
                ref_2.column1 as c9,
                ref_2.column1 as c10,
                ref_1.column8 as c11,
                ref_2.column1 as c12,
                ref_2.column1 as c13,
                27 as c14,
                ref_1.column6 as c15,
                ref_2.column1 as c16,
                ref_2.column1 as c17,
                ref_1.column8 as c18
              from
                mytest90.test3 as ref_1
                  right join mytest90.test6 as ref_2
                  on (ref_1.column8 = ref_2.column1 )
              where true
              limit 93) as subq_0
          on (false)
      where subq_0.c7 is NULL) as subq_1
where subq_1.c3 is NULL
limit 115;
```
It manifests itself in this way:

```
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/usr/local/mysql-cluster/sock/mysql.sock' (111)
ERROR: 
Can't connect to the server
```

Architecture Information:
'''
[NDBD DEFAULT]
NoOfReplicas =2
DataMemory = 512M
IndexMemory = 64M

[NDB_MGMD]
NodeId=1
hostname =192.172.10.8
datadir =/var/lib/mysql-cluster

[NDBD]
NodeId =2
hostname =192.172.10.9
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =3
hostname =192.172.10.10
datadir =/usr/local/mysql-cluster/data
NodeGroup=1
[NDBD]
NodeId =4
hostname =192.172.10.11
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =5
hostname =192.172.10.12
datadir =/usr/local/mysql-cluster/data
NodeGroup=1

[mysqld]
NodeId =6
hostname =192.172.10.9
[mysqld]
NodeId =7
hostname =192.172.10.10
[mysqld]
NodeId =8
hostname =192.172.10.11
[mysqld]
NodeId =9
hostname =192.172.10.12
'''

Attempted and successfully reproduced!

How to repeat:
Importing the database file I give later, and then executing Poc

Suggested fix:
This bug looks like a similar bug to bug #114779.
[25 Apr 2024 8:56] CunDi Fang
There are a surprising number of such similar SELECT statements, and I haven't found anything in common. The shortest one I've found so far is this SELECT statement:

Poc:
```
select  
  subq_0.c0 as c0, 
  ref_2.column4 as c1,
  ref_2.column5 as c2,
  subq_0.c0 as c3,
  (select exec_count from sys.x$statements_with_full_table_scans limit 1 offset 2)
     as c4,
  subq_0.c0 as c5
from
  (select
          ref_1.column7 as c0
        from
          mytest90.test3 as ref_0
            inner join mytest90.test7 as ref_1
            on ((ref_1.column6 is not NULL)
                or (true))
        where cast(coalesce(ref_0.column2,
            cast(null as float)) as float) is NULL
        limit 113) as subq_0
    left join mytest90.test7 as ref_2
    on (false)
where subq_0.c0 is not NULL
limit 97;
```

Because the errors behave so similarly, I'll just write them directly in this comment, instead of opening a separate bug list.
[25 Apr 2024 9:32] CunDi Fang
And this one:

Poc:
```
select
  subq_0.c1 as c0,
  ref_0.column8 as c1,
  subq_0.c0 as c2
from
  mytest90.test3 as ref_0
    right join (select
          3 as c0,
          ref_2.column2 as c1
        from
          mytest90.test8 as ref_1
            inner join mytest90.test7 as ref_2
              left join mytest90.test8 as ref_3
              on (false)
            on (ref_1.column6 is NULL)
        where true
        limit 65) as subq_0
    on (subq_0.c0 is NULL)
where subq_0.c1 is NULL;
```
[25 Apr 2024 12:15] MySQL Verification Team
Please use ndb_error_reporter to collect all the required log files https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-programs-ndb-error-reporter.html 

Duplicate of Bug #114464