Bug #44760 Cardinality is wrong on key in ndbcluster table
Submitted: 9 May 2009 9:06
Reporter: Bogdan Kecman Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:mysql-5.1-telco-6.3 OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Triage: Triaged: D2 (Serious) / R6 (Needs Assessment) / E6 (Needs Assessment)

[9 May 2009 9:06] Bogdan Kecman
Description:
Cardinality is wrong on the NDBCLUSTER table.

mysql> select distinct(a) from t1;
+------+
| a    |
+------+
|    1 | 
|    3 | 
|    2 | 
+------+
3 rows in set (0.09 sec)

mysql> select distinct(a) from t2;
+------+
| a    |
+------+
|    1 | 
|    2 | 
|    3 | 
+------+
3 rows in set (0.02 sec)

mysql> 

mysql> select distinct(b) from t1;
+------+
| b    |
+------+
| XXX  | 
| YYY  | 
+------+
2 rows in set (0.11 sec)

mysql> select distinct(b) from t2;
+------+
| b    |
+------+
| XXX  | 
| YYY  | 
+------+
2 rows in set (0.00 sec)

mysql> analyze table t1;
+------------+---------+----------+----------+
| Table      | Op      | Msg_type | Msg_text |
+------------+---------+----------+----------+
| TEST_DB.t1 | analyze | status   | OK       | 
+------------+---------+----------+----------+
1 row in set (0.00 sec)

mysql> analyze table t2;
+------------+---------+----------+----------+
| Table      | Op      | Msg_type | Msg_text |
+------------+---------+----------+----------+
| TEST_DB.t2 | analyze | status   | OK       | 
+------------+---------+----------+----------+
1 row in set (0.02 sec)

mysql> optimize table t1;
+------------+----------+----------+----------+
| Table      | Op       | Msg_type | Msg_text |
+------------+----------+----------+----------+
| TEST_DB.t1 | optimize | status   | OK       | 
+------------+----------+----------+----------+
1 row in set (0.00 sec)

mysql> optimize table t2;
+------------+----------+----------+----------+
| Table      | Op       | Msg_type | Msg_text |
+------------+----------+----------+----------+
| TEST_DB.t2 | optimize | status   | OK       | 
+------------+----------+----------+----------+
1 row in set (0.01 sec)

mysql> show indexes from t1;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| t1    |          0 | PRIMARY  |            1 | id          | A         |       43784 |     NULL | NULL   |      | BTREE      |         | 
| t1    |          1 | i_a      |            1 | a           | A         |       43784 |     NULL | NULL   | YES  | BTREE      |         | 
| t1    |          1 | i_b      |            1 | b           | A         |       43784 |     NULL | NULL   | YES  | BTREE      |         | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.00 sec)

mysql> show indexes from t2;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| t2    |          0 | PRIMARY  |            1 | id          | A         |       70844 |     NULL | NULL   |      | BTREE      |         | 
| t2    |          1 | i_a      |            1 | a           | A         |           2 |     NULL | NULL   | YES  | BTREE      |         | 
| t2    |          1 | i_b      |            1 | b           | A         |           2 |     NULL | NULL   | YES  | BTREE      |         | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.00 sec)

mysql> 

mysql> explain select distinct(a) from t1;
+----+-------------+-------+------+---------------+------+---------+------+-------+-----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra           |
+----+-------------+-------+------+---------------+------+---------+------+-------+-----------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 43784 | Using temporary | 
+----+-------------+-------+------+---------------+------+---------+------+-------+-----------------+
1 row in set (0.00 sec)

mysql> explain select distinct(a) from t2;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|  1 | SIMPLE      | t2    | range | NULL          | i_a  | 5       | NULL |    3 | Using index for group-by | 
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> show global variables like 'version%';
+-------------------------+-----------------------------+
| Variable_name           | Value                       |
+-------------------------+-----------------------------+
| version                 | 5.1.32-ndb-7.0.4-innodb-log | 
| version_comment         | Source distribution         | 
| version_compile_machine | x86_64                      | 
| version_compile_os      | redhat-linux-gnu            | 
+-------------------------+-----------------------------+
4 rows in set (0.00 sec)

same table - innodb:
mysql> show indexes from t2;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| t2    |          0 | PRIMARY  |            1 | id          | A         |       70957 |     NULL | NULL   |      | BTREE      |         | 
| t2    |          1 | i_a      |            1 | a           | A         |           2 |     NULL | NULL   | YES  | BTREE      |         | 
| t2    |          1 | i_b      |            1 | b           | A         |           8 |     NULL | NULL   | YES  | BTREE      |         | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.00 sec)

How to repeat:
create table t1 (id int primary key auto_increment, a int, b char(50), key i_a (a), key i_b (b)) engine=ndbcluster;
create table t2 (id int primary key auto_increment, a int, b char(50), key i_a (a), key i_b (b)) engine=myisam;
insert into t1 (a,b) values (1, 'XXX'), (2, 'XXX'), (3, 'YYY'), (3, 'YYY');
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;
insert into t1 (a,b) select a,b from t2;
insert into t2 (a,b) select a,b from t1;

select count(*) from t1;
select count(*) from t2;
select distinct(a) from t1;
select distinct(a) from t2;
select distinct(b) from t1;
select distinct(b) from t2;

analyze table t1;
analyze table t2;

optimize table t1; -- not really needed
optimize table t2; -- not really needed

show indexes from t1;
show indexes from t2;

explain select distinct(a) from t1;
explain select distinct(a) from t2;

Suggested fix:
.
[11 May 2009 13:11] Jonathan Miller
& 7.0
[14 May 2009 20:45] Ian Rogers
The behaviour in the bug is also in this version:

mysql> show variables like '%version%';
+-------------------------+----------------------------------+
| Variable_name           | Value                            |
+-------------------------+----------------------------------+
| protocol_version        | 10                               |
| version                 | 5.1.32-ndb-7.0.5-cluster-gpl-log |
| version_comment         | MySQL Cluster Server (GPL)       |
| version_compile_machine | x86_64                           |
| version_compile_os      | unknown-linux-gnu                |
+-------------------------+----------------------------------+

and means that indexes aren't used:

mysql> explain select t2.b from t1, t2 where t1.b = 'XXX' and t1.a = t2.a limit 1;
+----+-------------+-------+------+---------------+------+---------+-----------+-------+-----------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref       | rows  | Extra                             |
+----+-------------+-------+------+---------------+------+---------+-----------+-------+-----------------------------------+
|  1 | SIMPLE      | t1    | ref  | i_a,i_b       | i_b  | 51      | const     |    10 | Using where with pushed condition |
|  1 | SIMPLE      | t2    | ref  | i_a           | i_a  | 5       | test.t1.a | 23615 | Using where                       |
+----+-------------+-------+------+---------------+------+---------+-----------+-------+-----------------------------------+

(the Extra column should say "Using index" not "Using where"). This is a disaster performance wise!
[3 Apr 2010 12:36] Sveta Smirnova
Bug #52549 was marked as duplicate of this one.
[3 Apr 2010 12:38] Sveta Smirnova
See also bug #8725
[9 Apr 2010 22:52] Tom Farvour
We have the same bug. Why hasn't there been a fix released for this? It's been almost a year!

How can people be using NDBCLUSTER with such a nasty query optimizer bug?

See bug#52597
[12 Apr 2010 5:28] Sveta Smirnova
Bug #52597 was marked as duplicate of this one.
[14 Apr 2010 19:39] Tom Farvour
Anyone going to assess this bug? 7.1 was released and it's still not fixed there either. This is really a bad bug that needs to be addressed.

It's almost an entire year since this bug has had any type of promising updates made to it. It's not like it takes a whole lot to test-case it, it happens on even a freshly setup cluster with brand new tables. Clearly the cardinality is not being calculated correctly which causes random query paths in the optimizer.