Bug #43 Join does not use indexes in case there is no cardinarity data
Submitted: 27 Jan 2003 18:54 Modified: 3 Oct 2008 9:47
Reporter: Peter Zaitsev (Basic Quality Contributor) Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:3.23, 4.0 OS:Any (all)
Assigned to: CPU Architecture:Any
Tags: qc

[27 Jan 2003 18:54] Peter Zaitsev
Description:
MySQL does not seems to use index for resolving the join in case it does not have 
cardinarity data, which may lead to bad performance:

In this example MySQL scans 100 rows instead of get ones by index.

mysql> explain select * from a,b,c where t=1 and a=b and b=c;
+-------+------+---------------+------+---------+-------+------+-------------+
| table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+-------+------+---------------+------+---------+-------+------+-------------+
| a     | ref  | a,t           | t    |       5 | const |    1 | Using where |
| b     | ALL  | b             | NULL |    NULL | NULL  |   10 | Using where |
| c     | ALL  | c             | NULL |    NULL | NULL  |   10 | Using where |
+-------+------+---------------+------+---------+-------+------+-------------+
3 rows in set (0.02 sec)

This problem was reported by one of the users who noticed such type of query is faster with table sized 1000 rows than one with 100 rows.   
In this case index is used by full table scan is selected instead.

Even better example is this:

mysql> explain select * from a,b,c where t=1 and a=b and b=c;
+-------+-------+---------------+------+---------+-------+------+--------------------------+
| table | type  | possible_keys | key  | key_len | ref   | rows | Extra                    |
+-------+-------+---------------+------+---------+-------+------+--------------------------+
| a     | ref   | a,t           | t    |       5 | const |    1 | Using where              |
| b     | index | b             | b    |       5 | NULL  |   10 | Using where; Using index |
| c     | index | c             | c    |       5 | NULL  |   10 | Using where; Using index |
+-------+-------+---------------+------+---------+-------+------+--------------------------+
3 rows in set (0.01 sec)

Basically it looks like MySQL decides to do complete table scan instead of using the index if it does not knows the key statistics. I'm not sure if it is good guess for Join.

How to repeat:

CREATE TABLE a (
  a int(11) default NULL,
  t int(11) default NULL,
  KEY a (a),
  KEY t (t)
) TYPE=MyISAM;

--
-- Dumping data for table 'a'
--

INSERT INTO a VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8),(9,9),(10,10);

--
-- Table structure for table 'b'
--

CREATE TABLE b (
  b int(11) default NULL,
  x int(11) default NULL,
  y int(11) default NULL,
  KEY b (b)
) TYPE=MyISAM;

--
-- Dumping data for table 'b'
--

INSERT INTO b VALUES (1,NULL,NULL),(2,NULL,NULL),(3,NULL,NULL),(4,NULL,NULL),(5,NULL,NULL),(6,NULL,NULL),(7,NULL,NULL),(8,NULL,NULL),(9,NULL,NULL),(10,NULL,NULL);

--
-- Table structure for table 'c'
--

CREATE TABLE c (
  c int(11) default NULL,
  z int(11) default NULL,
  KEY c (c)
) TYPE=MyISAM;

--
-- Dumping data for table 'c'
--

INSERT INTO c VALUES (1,NULL),(2,NULL),(3,NULL),(4,NULL),(5,NULL),(6,NULL),(7,NULL),(8,NULL),(9,NULL),(10,NULL);

select * from a,b,c where t=1 and a=b and b=c;

Suggested fix:
run analyze on these tables
[28 Jan 2003 17:00] MySQL Developer
Not a bug, but a not critical ptimizer issue, as this is only relevant for small tables (which are fast anyway)

Nothing to be done at this stage
[27 Sep 2008 8:42] Konstantin Osipov
mysql> explain select * from a,b,c where t=1 and a=b and b=c\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: ref
possible_keys: a,t
          key: t
      key_len: 5
          ref: const
         rows: 1
        Extra: 
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: b
          key: b
      key_len: 5
          ref: test.a.a
         rows: 2
        Extra: 
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: ref
possible_keys: c
          key: c
      key_len: 5
          ref: test.a.a
         rows: 2
        Extra: 
3 rows in set (0.00 sec)

I believe I can't repeat it any more.
My version is 6.0.

Re-opening to be re-verified, since I'm not sure I'm reading explain output correctly.
[3 Oct 2008 9:47] Valeriy Kravchuk
Not repeatable with recent 6.0.x and 5.1.x.