Description:
Hello, I found a timeout bug in MySQL Cluster, version 8.0.35-cluster. The details are as follows.
OS version and name:
Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Linux eb1f47b08982 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 x86_64 x86_64 GNU/Linux
PoC:
'''
I will attach the PoC files to this bug report once it has been opened.
'''
GDB Trace:
'''
#0 Join_nest::get_inner_nest () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_query_plan.cc:971
#1 0x0000000002670031 in pushed_table::get_full_inner_nest () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_query_plan.cc:1221
#2 0x000000000263fad5 in ndb_pushed_builder_ctx::is_pushable_with_root ()
at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster_push.cc:884
#3 ndb_pushed_builder_ctx::is_pushable_with_root () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster_push.cc:790
#4 0x0000000002640ac1 in ndb_pushed_builder_ctx::make_pushed_join ()
at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster_push.cc:622
#5 0x0000000002640bcb in ndb_pushed_builder_ctx::make_pushed_join ()
at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster_push.cc:660
#6 0x0000000002623fb4 in ndbcluster_push_to_engine () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster.cc:14361
#7 0x0000000000e5ec01 in JOIN::push_to_engines () at ../../mysql-cluster-gpl-8.0.35/sql/sql_optimizer.cc:1148
#8 0x0000000000e78978 in JOIN::optimize () at ../../mysql-cluster-gpl-8.0.35/sql/sql_optimizer.cc:1062
#9 0x0000000000edac91 in Query_block::optimize () at ../../mysql-cluster-gpl-8.0.35/sql/sql_select.cc:2013
#10 0x0000000000f5990d in Query_expression::optimize () at ../../mysql-cluster-gpl-8.0.35/sql/sql_union.cc:1006
#11 0x0000000000ed9bb4 in Sql_cmd_dml::execute_inner () at ../../mysql-cluster-gpl-8.0.35/sql/sql_select.cc:1007
#12 0x0000000000ee4ef4 in Sql_cmd_dml::execute () at ../../mysql-cluster-gpl-8.0.35/sql/sql_select.cc:793
#13 0x0000000000e80bc7 in mysql_execute_command () at ../../mysql-cluster-gpl-8.0.35/sql/sql_parse.cc:4719
#14 0x0000000000e843bb in dispatch_sql_command () at ../../mysql-cluster-gpl-8.0.35/sql/sql_parse.cc:5368
#15 0x0000000000e86d01 in dispatch_command () at ../../mysql-cluster-gpl-8.0.35/sql/sql_parse.cc:2054
#16 0x0000000000e8787b in do_command () at ../../mysql-cluster-gpl-8.0.35/sql/sql_parse.cc:1439
#17 0x0000000000fe09b8 in handle_connection () at ../../mysql-cluster-gpl-8.0.35/sql/conn_handler/connection_handler_per_thread.cc:302
#18 0x0000000002848944 in pfs_spawn_thread () at ../../../mysql-cluster-gpl-8.0.35/storage/perfschema/pfs.cc:3042
#19 0x00007f52a9c40ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#20 0x00007f52a9cd1bf4 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
'''
Architecture Information:
'''
[NDBD DEFAULT]
NoOfReplicas =2
DataMemory = 512M
IndexMemory = 64M
[NDB_MGMD]
NodeId=1
hostname =192.172.10.8
datadir =/var/lib/mysql-cluster
[NDBD]
NodeId =2
hostname =192.172.10.9
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =3
hostname =192.172.10.10
datadir =/usr/local/mysql-cluster/data
NodeGroup=1
[NDBD]
NodeId =4
hostname =192.172.10.11
datadir =/usr/local/mysql-cluster/data
NodeGroup=0
[NDBD]
NodeId =5
hostname =192.172.10.12
datadir =/usr/local/mysql-cluster/data
NodeGroup=1
[mysqld]
NodeId =6
hostname =192.172.10.9
[mysqld]
NodeId =7
hostname =192.172.10.10
[mysqld]
NodeId =8
hostname =192.172.10.11
[mysqld]
NodeId =9
hostname =192.172.10.12
'''
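For reference, the layout above works out to two node groups with two data nodes each, which matches NoOfReplicas=2. The following is a minimal Python sketch that checks this from the config text. The parser is hand-rolled for illustration (the name data_nodes_by_group is hypothetical), because Python's configparser rejects repeated section names such as [NDBD] by default. Only the keys relevant to the check are included in the sample.

```python
from collections import defaultdict

# Trimmed copy of the config above: only the keys needed for the check.
CONFIG = """\
[NDBD DEFAULT]
NoOfReplicas =2
[NDBD]
NodeId =2
NodeGroup=0
[NDBD]
NodeId =3
NodeGroup=1
[NDBD]
NodeId =4
NodeGroup=0
[NDBD]
NodeId =5
NodeGroup=1
"""


def data_nodes_by_group(text):
    """Map NodeGroup -> list of NodeIds, one entry per [NDBD] section."""
    groups = defaultdict(list)
    section, current = None, {}

    def flush():
        # Record the section just finished, if it is a data node section.
        if section == "NDBD" and "nodeid" in current and "nodegroup" in current:
            groups[int(current["nodegroup"])].append(int(current["nodeid"]))

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            flush()
            section, current = line[1:-1].strip().upper(), {}
        elif "=" in line:
            key, value = line.split("=", 1)
            current[key.strip().lower()] = value.strip()
    flush()  # the last section has no following header to trigger a flush
    return dict(groups)


if __name__ == "__main__":
    print(data_nodes_by_group(CONFIG))
```

Each node group holding exactly NoOfReplicas data nodes is the expected shape for this setup, so the topology itself does not look misconfigured.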
I attempted to reproduce the issue and succeeded.
How to repeat:
Execute the PoC to trigger the hang.
The stuck query (Id 12) remains in the "preparing" state even after being killed, as the output of "show processlist" shows:
'''
MySQL root@(none):(none)> show processlist
+----+-----------------+-----------+--------+---------+-------+-----------------------------------+-----------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-----------+--------+---------+-------+-----------------------------------+-----------------------------------------------------------------------------------------------------------+
| 2 | system user | | | Daemon | 0 | Waiting for event from ndbcluster | <null> |
| 6 | event_scheduler | localhost | <null> | Daemon | 19680 | Waiting on empty queue | <null> |
| 12 | root | localhost | <null> | Killed | 3907 | preparing | select\n ref_2.column4 as c0,\n ref_8.column3 as c1,\n case when EXISTS (\n select\n sub |
| 17 | root | localhost | <null> | Sleep | 14 | | <null> |
| 18 | root | localhost | <null> | Query | 0 | init | show processlist |
+----+-----------------+-----------+--------+---------+-------+-----------------------------------+-----------------------------------------------------------------------------------------------------------+
5 rows in set
Time: 0.018s
'''
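The stuck thread can also be picked out of that output mechanically. The following is a minimal Python sketch, assuming the table is available as text in the pipe-delimited form shown above. The function name stuck_threads and the 600-second threshold are arbitrary choices for illustration, and the Info column of thread 12 is abbreviated in the sample.

```python
# Rows copied from the processlist output above (border lines omitted,
# Info of thread 12 abbreviated).
ROWS = """\
| Id | User | Host | db | Command | Time | State | Info |
| 2 | system user | | | Daemon | 0 | Waiting for event from ndbcluster | <null> |
| 6 | event_scheduler | localhost | <null> | Daemon | 19680 | Waiting on empty queue | <null> |
| 12 | root | localhost | <null> | Killed | 3907 | preparing | select ... |
| 17 | root | localhost | <null> | Sleep | 14 | | <null> |
| 18 | root | localhost | <null> | Query | 0 | init | show processlist |
"""


def stuck_threads(table_text, min_seconds=600):
    """Return (Id, State, Time) for non-idle threads running >= min_seconds.

    Assumes no cell contains a literal '|' character.
    """
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table_text.splitlines()
        if line.strip().startswith("|")
    ]
    header, body = rows[0], rows[1:]
    records = [dict(zip(header, row)) for row in body]
    return [
        (int(r["Id"]), r["State"], int(r["Time"]))
        for r in records
        # Daemon and Sleep threads are expected to show large Time values.
        if r["Command"] not in ("Daemon", "Sleep") and int(r["Time"]) >= min_seconds
    ]


if __name__ == "__main__":
    print(stuck_threads(ROWS))
```

On the rows above this reports only thread 12, which has been in "preparing" for 3907 seconds.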
Suggested fix:
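The problem appears to be in the optimizer and looks like an infinite loop: the GDB trace shows the thread inside Join_nest::get_inner_nest(), reached from the NDB join pushdown code (ha_ndbcluster_push.cc) during JOIN::optimize(). I suggest reviewing that code. I can provide the database as it was at that time if needed.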
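One way to tell a spinning loop from a blocked wait is to take several backtraces a few seconds apart and compare which functions stay on the stack. The following is a minimal Python sketch of the extraction step, assuming GDB output in the format shown in the trace above. The function name frame_functions is hypothetical, and only the single sample from this report is used here.

```python
import re

# First four frames of the GDB trace above, copied as-is (frame #2 wraps
# onto a second line, exactly as GDB printed it).
TRACE = """\
#0 Join_nest::get_inner_nest () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_query_plan.cc:971
#1 0x0000000002670031 in pushed_table::get_full_inner_nest () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_query_plan.cc:1221
#2 0x000000000263fad5 in ndb_pushed_builder_ctx::is_pushable_with_root ()
at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster_push.cc:884
#3 ndb_pushed_builder_ctx::is_pushable_with_root () at ../../../mysql-cluster-gpl-8.0.35/storage/ndb/plugin/ha_ndbcluster_push.cc:790
"""

# "#N", an optional "0xADDR in ", then the function name up to " (".
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(", re.MULTILINE)


def frame_functions(trace_text):
    """Return the function names of a GDB backtrace, innermost frame first."""
    return FRAME_RE.findall(trace_text)


if __name__ == "__main__":
    for name in frame_functions(TRACE):
        print(name)
```

If repeated samples keep showing the same pushdown and optimizer functions in the innermost frames, and never a wait or sleep primitive, that would support the infinite-loop reading.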