Description:
EXPLAIN shows incorrect rows estimation for federated table (if "ref" access path is used).
Let's create table as follows:
mysql> CREATE TABLE `test` (
-> `id` int(11) NOT NULL auto_increment,
-> `j` int(11) NOT NULL,
-> PRIMARY KEY (`id`),
-> KEY `j` (`j`)
-> ) ENGINE=MyISAM AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.23 sec)
mysql> insert into test values(null, 1);
Query OK, 1 row affected (0.20 sec)
mysql> insert into test values(null, 2);
Query OK, 1 row affected (0.01 sec)
mysql> insert into test values(null, 3);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test values(null, 4);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test values(null, 5);
Query OK, 1 row affected (0.01 sec)
mysql> select * from test where j=2;
+----+---+
| id | j |
+----+---+
| 10 | 2 |
+----+---+
1 row in set (0.10 sec)
mysql> explain select * from test where j=2;
+----+-------------+-------+------+---------------+------+---------+-------+----
--+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | row
s | Extra |
+----+-------------+-------+------+---------------+------+---------+-------+----
--+-------+
| 1 | SIMPLE | test | ref | j | j | 4 | const |
1 | |
+----+-------------+-------+------+---------------+------+---------+-------+----
--+-------+
1 row in set (0.00 sec)
We have correct estimation of rows for the query. Now, let's create FEDERATED table pointing to this one:
mysql> CREATE TABLE `testf` ( `id` int(11) NOT NULL auto_increment, `j` int(11)
NOT NULL, PRIMARY KEY (`id`), KEY `j` (`j`) ) ENGINE=FEDERATED DEFAULT CHARSET
=latin1 CONNECTION='mysql://root:root@192.168.0.1:3307/test/test';
Query OK, 0 rows affected (0.02 sec)
mysql> select * from testf;
+----+---+
| id | j |
+----+---+
| 9 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 4 |
| 13 | 5 |
+----+---+
5 rows in set (0.06 sec)
mysql> select * from testf where j=2;
+----+---+
| id | j |
+----+---+
| 10 | 2 |
+----+---+
1 row in set (0.02 sec)
So, data are the same, as expected, but:
mysql> explain select * from testf where j=2;
+----+-------------+-------+------+---------------+------+---------+-------+----
--+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | row
s | Extra |
+----+-------------+-------+------+---------------+------+---------+-------+----
--+-------+
| 1 | SIMPLE | testf | ref | j | j | 4 | const |
2 | |
+----+-------------+-------+------+---------------+------+---------+-------+----
--+-------+
1 row in set (0.01 sec)
EXPLAIN now estimates number of rows as 2. For "range" estimation is correct:
mysql> explain select * from testf where j<>2;
+----+-------------+-------+-------+---------------+------+---------+------+----
--+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | row
s | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+----
--+-------------+
| 1 | SIMPLE | testf | range | j | j | 4 | NULL |
4 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+----
--+-------------+
1 row in set (0.02 sec)
mysql> select * from testf where j<>2;
+----+---+
| id | j |
+----+---+
| 9 | 1 |
| 11 | 3 |
| 12 | 4 |
| 13 | 5 |
+----+---+
4 rows in set (0.02 sec)
And, because of the following:
mysql> analyze table testf;
+------------+---------+----------+---------------------------------------------
-------------+
| Table | Op | Msg_type | Msg_text
|
+------------+---------+----------+---------------------------------------------
-------------+
| test.testf | analyze | note | The storage engine for the table doesn't sup
port analyze |
+------------+---------+----------+---------------------------------------------
-------------+
1 row in set (0.01 sec)
there is not way to influence this incorrect estimation in case of "ref".
How to repeat:
On one server:
drop table test
CREATE TABLE `test` (
`id` int(11) NOT NULL auto_increment,
`j` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `j` (`j`)
) ENGINE=MyISAM AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;
insert into test values(null, 1);
insert into test values(null, 2);
insert into test values(null, 3);
insert into test values(null, 4);
insert into test values(null, 5);
select * from test where j=2;
explain select * from test where j=2;
On the other:
CREATE TABLE `testf` (
`id` int(11) NOT NULL auto_increment,
`j` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `j` (`j`) )
ENGINE=FEDERATED DEFAULT CHARSET = latin1
CONNECTION='mysql://root:root@192.168.0.1:3307/test/test';
select * from testf where j=2;
explain select * from testf where j=2;
Suggested fix:
Get EXPLAIN estimations from remote server? Or, better, give a way to get and store proper statistics in local data dictionary :)