Bug #61936 STRAIGHT_JOIN has no effect on query with subquery in FROM clause
Submitted: 21 Jul 2011 6:27 Modified: 22 Jul 2011 11:23
Reporter: luo longjiu Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1.48, 5.5.13, 5.5.16 OS:Linux (x86_64)
Assigned to: CPU Architecture:Any

[21 Jul 2011 6:27] luo longjiu
Description:
force index hint has no effect on query 1,but not for query 2:

index ind_group_type_time:g_id,type,reply_time,deleted,digest,board_id category,verify

query 1:
explain select t.*                                        
     from (select id                                                
             from bbs_0009 g force index (ind_group_type_time) 
            where g.g_id = 154505                                                   
               and g.deleted = 0                                     
               and (g.verify != 0 OR g.verify IS NULL)               
               and board_id = 50555                                  
               and (g.type = 0 or g.type = 2)                        
             order by g.reply_time desc limit 840, 40) g,       
           bbs_0009 t                                         
     where t.id = g.id; 
query2:
explain select id    from bbs_0009 g force index (ind_group_type_time) 
            where g.g_id = 154505                                                   
               and g.deleted = 0                                     
               and (g.verify != 0 OR g.verify IS NULL)               
               and board_id = 50555                                  
               and (g.type = 0 or g.type = 2)                        
             order by g.reply_time desc limit 840, 40

How to repeat:

query 1: full table scan bbs_thread_0009 ----ALL
root@test 02:06:16>explain select t.*                                        
    ->      from (select id                                                
    ->              from bbs_0009 g force index (ind_group_type_time) 
    ->             where g.g_id = 154505                                                   
    ->                and g.deleted = 0                                     
    ->                and (g.verify != 0 OR g.verify IS NULL)               
    ->                and board_id = 50555                                  
    ->                and (g.type = 0 or g.type = 2)                        
    ->              order by g.reply_time desc limit 840, 40) g,       
    ->            bbs_0009 t                                         
    ->      where t.id = g.id;
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
| id | select_type | table      | type   | possible_keys       | key     | key_len | ref  | rows   | Extra          |
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL                | NULL    | NULL    | NULL |      9 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY             | PRIMARY | 8       | g.id |      1 |                |
|  2 | DERIVED     | g          | ALL    | ind_group_type_time | NULL    | NULL    | NULL | 652469 | Using filesort |
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
3 rows in set (0.54 sec)

query 2 explain : index ind_group_type_time is used for--->using index 
root@test 02:06:18>explain select id                                                
    ->              from bbs_0009 g force index (ind_group_type_time) 
    ->             where g.g_id = 154505                                                   
    ->                and g.deleted = 0                                     
    ->                and (g.verify != 0 OR g.verify IS NULL)               
    ->                and board_id = 50555                                  
    ->                and (g.type = 0 or g.type = 2)                        
    ->              order by g.reply_time desc limit 840, 40;
+----+-------------+-------+-------+---------------------+---------------------+---------+------+--------+------------------------------------------+
| id | select_type | table | type  | possible_keys       | key                 | key_len | ref  | rows   | Extra                                    |
+----+-------------+-------+-------+---------------------+---------------------+---------+------+--------+------------------------------------------+
|  1 | SIMPLE      | g     | range | ind_group_type_time | ind_group_type_time | 9       | NULL | 326235 | Using where; Using index; Using filesort |
+----+-------------+-------+-------+---------------------+---------------------+---------+------+--------+------------------------------------------+
1 row in set (0.00 sec)
[21 Jul 2011 8:08] Valeriy Kravchuk
It would be nice to get complete table definitions first for the tables used, but in second case you use force index hint in a subquery that is used last when joining with other table, t, on the following condition:

WHERE t.id = g.id

Optimizer decided to read table t first (and that makes sense, as you select t.*, for example) and then has to find rows in table g having specific id. Your index is not useful for this kind of access, so it is not used.
[21 Jul 2011 8:35] luo longjiu
i change the order:
 
 g.id=t.id

     root@bbs_00 04:32:24>explain select t.*                                        
    ->      from (select id                                                
    ->              from bbs_0009 g force index (ind_group_type_time) 
    ->             where g.g_id = 154505                                                   
    ->                and g.deleted = 0                                     
    ->                and (g.verify != 0 OR g.verify IS NULL)               
    ->                and board_id = 50555                                  
    ->                and (g.type = 0 or g.type = 2)                        
    ->              order by g.last_reply_time desc limit 840, 40) g,       
    ->            bbs_0009 t                                         
    ->      where g.id=t.id ; 
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
| id | select_type | table      | type   | possible_keys       | key     | key_len | ref  | rows   | Extra          |
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL                | NULL    | NULL    | NULL |      9 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY             | PRIMARY | 8       | g.id |      1 |                |
|  2 | DERIVED     | g          | ALL    | ind_group_type_time | NULL    | NULL    | NULL | 708373 | Using filesort |
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
[21 Jul 2011 9:08] Valeriy Kravchuk
Order of table references in WHERE does not matter. Optimizer decided to read table t first, and maybe for a good reason. If you want to force different join order, try this:

explain select t.*                                        
     from (select id                                                
             from bbs_0009 g force index (ind_group_type_time) 
            where g.g_id = 154505                                                   
               and g.deleted = 0                                     
               and (g.verify != 0 OR g.verify IS NULL)               
               and board_id = 50555                                  
               and (g.type = 0 or g.type = 2)                        
             order by g.reply_time desc limit 840, 40) g STRAIGHT_JOIN       
           bbs_0009 t                                         
     where t.id = g.id;
[21 Jul 2011 9:21] luo longjiu
root@bbs_00 05:12:03>explain select t.*                                        
    ->      from (select id                                                
    ->              from bbs_0009 g force index (ind_group_type_time) 
    ->             where g.g_id = 154505                                                   
    ->                and g.deleted = 0                                     
    ->                and (g.verify != 0 OR g.verify IS NULL)               
    ->                and board_id = 50555                                  
    ->                and (g.type = 0 or g.type = 2)                        
    ->              order by g.last_reply_time desc limit 840, 40) g straight_join       
    ->            bbs_0009 t                                         
    ->      where g.id=t.id ; 
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
| id | select_type | table      | type   | possible_keys       | key     | key_len | ref  | rows   | Extra          |
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL                | NULL    | NULL    | NULL |      9 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY             | PRIMARY | 8       | g.id |      1 |                |
|  2 | DERIVED     | g          | ALL    | ind_group_type_time | NULL    | NULL    | NULL | 708445 | Using filesort |
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+

i want know why the query do not use the index ind_group_type_time,although the force index hint has been added.
[21 Jul 2011 9:22] luo longjiu
root@bbs_00 05:12:03>explain select t.*                                        
    ->      from (select id                                                
    ->              from bbs_0009 g force index (ind_group_type_time) 
    ->             where g.g_id = 154505                                                   
    ->                and g.deleted = 0                                     
    ->                and (g.verify != 0 OR g.verify IS NULL)               
    ->                and board_id = 50555                                  
    ->                and (g.type = 0 or g.type = 2)                        
    ->              order by g.last_reply_time desc limit 840, 40) g straight_join       
    ->            bbs_0009 t                                         
    ->      where g.id=t.id ; 
+----+-------------+------------+--------+---------------------+---------+---------+------+--------+----------------+
| id | select_type | table      | type   | possible_keys       | key     | key_len | ref  | r
[21 Jul 2011 12:09] Valeriy Kravchuk
Theoretically optimizer should access table before STRAIGHT_JOIN first. If you have a complete test case (with CREATE TABLE statements and INSERTs)  that shows it does NOT do this on recent versions, 5.1.58 or 5.5.14, please, upload it.
[21 Jul 2011 13:30] luo longjiu
root@test 09:28:19>desc test_bbs;
+-------------+------------+------+-----+---------+----------------+
| Field       | Type       | Null | Key | Default | Extra          |
+-------------+------------+------+-----+---------+----------------+
| id          | bigint(20) | NO   | PRI | NULL    | auto_increment | 
| g_id        | int(11)    | YES  | MUL | NULL    |                | 
| type        | tinyint(4) | YES  |     | NULL    |                | 
| replay_time | datetime   | YES  |     | NULL    |                | 
| deleted     | tinyint(4) | YES  |     | NULL    |                | 
+-------------+------------+------+-----+---------+----------------+
5 rows in set (0.01 sec)

root@test 09:28:23>show index from test_bbs;
+----------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table    | Non_unique | Key_name       | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| test_bbs |          0 | PRIMARY        |            1 | id          | A         |           2 |     NULL | NULL   |      | BTREE      |         | 
| test_bbs |          1 | ind_test_bbs_g |            1 | g_id        | A         |           2 |     NULL | NULL   | YES  | BTREE      |         | 
| test_bbs |          1 | ind_test_bbs_g |            2 | type        | A         |           2 |     NULL | NULL   | YES  | BTREE      |         | 
| test_bbs |          1 | ind_test_bbs_g |            3 | replay_time | A         |           2 |     NULL | NULL   | YES  | BTREE      |         | 
+----------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
4 rows in set (0.00 sec)

root@test 09:28:28>explain select t.*
    ->   from test_bbs t,
    ->        (select id
    ->           from test_bbs force index(ind_test_bbs_g)
    ->          where g_id = 2
    ->            and type in (0, 1)
    ->          order by replay_time limit 1, 10) g
    ->  where t.id = g.id;
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
| id | select_type | table      | type   | possible_keys  | key     | key_len | ref  | rows | Extra          |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL           | NULL    | NULL    | NULL |    3 |                | 
|  1 | PRIMARY     | t          | eq_ref | PRIMARY        | PRIMARY | 8       | g.id |    1 |                | 
|  2 | DERIVED     | test_bbs   | ALL    | ind_test_bbs_g | NULL    | NULL    | NULL |   14 | Using filesort | 
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
3 rows in set (0.03 sec)
[22 Jul 2011 8:12] Valeriy Kravchuk
I can only repeat my previous request:

I need complete test case (repeatable on 5.1.58 or 5.5.14) that shows how optimizer ignores order of table access assumed by STRAIGHT_JOIN.
[22 Jul 2011 8:32] luo longjiu
i upload the sql file ,dumped by mysqldump
you can test the query:
explain select t.*
       from test_bbs t,
           (select id
              from test_bbs force index(ind_test_bbs_g)
             where g_id = 2
                and type in (0, 1)
             order by replay_time limit 1, 10) g
      where t.id = g.id;
[22 Jul 2011 10:09] Valeriy Kravchuk
And where is that sql file? I do not see it attached to this bug report.
[22 Jul 2011 10:47] luo longjiu
sql file

Attachment: test_bbs.sql (, text), 2.46 KiB.

[22 Jul 2011 11:07] Valeriy Kravchuk
Thank you for additional information and persistence.

Verified that STRAIGHT_JOIN is ignored (and, thus, we can not force table access order and usage of index you wanted to force):

macbook-pro:5.5 openxs$ bin/mysql -uroot test < ~/Downloads/test_bbs.sql 
macbook-pro:5.5 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.5.16-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> explain select t.*
    ->        from test_bbs t,
    ->            (select id
    ->               from test_bbs force index(ind_test_bbs_g)
    ->              where g_id = 2
    ->                 and type in (0, 1)
    ->              order by replay_time limit 1, 10) g
    ->       where t.id = g.id;
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
| id | select_type | table      | type   | possible_keys  | key     | key_len | ref  | rows | Extra          |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL           | NULL    | NULL    | NULL |    3 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY        | PRIMARY | 8       | g.id |    1 |                |
|  2 | DERIVED     | test_bbs   | ALL    | ind_test_bbs_g | NULL    | NULL    | NULL |   14 | Using filesort |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
3 rows in set (0.05 sec)

Let's try to change order of tables in FROM:

mysql> explain select t.*        from (select id               from test_bbs force index(ind_test_bbs_g)              where g_id = 2                 and type in (0, 1)              order by replay_time limit 1, 10) g, test_bbs t       where t.id = g.id;
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
| id | select_type | table      | type   | possible_keys  | key     | key_len | ref  | rows | Extra          |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL           | NULL    | NULL    | NULL |    3 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY        | PRIMARY | 8       | g.id |    1 |                |
|  2 | DERIVED     | test_bbs   | ALL    | ind_test_bbs_g | NULL    | NULL    | NULL |   14 | Using filesort |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
3 rows in set (0.00 sec)

Does not help. Let's try to force join order with STRAIGHT_JOIN:

mysql> explain select t.*        from (select id               from test_bbs force index(ind_test_bbs_g)              where g_id = 2                 and type in (0, 1)              order by replay_time limit 1, 10) g STRAIGHT_JOIN test_bbs t       where t.id = g.id;
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
| id | select_type | table      | type   | possible_keys  | key     | key_len | ref  | rows | Extra          |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL           | NULL    | NULL    | NULL |    3 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY        | PRIMARY | 8       | g.id |    1 |                |
|  2 | DERIVED     | test_bbs   | ALL    | ind_test_bbs_g | NULL    | NULL    | NULL |   14 | Using filesort |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
3 rows in set (0.00 sec)

mysql> explain select g.*        from (select id               from test_bbs force index(ind_test_bbs_g)              where g_id = 2                 and type in (0, 1)              order by replay_time limit 1, 10) g STRAIGHT_JOIN test_bbs t       where t.id = g.id;
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
| id | select_type | table      | type   | possible_keys  | key     | key_len | ref  | rows | Extra          |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL           | NULL    | NULL    | NULL |    3 |                |
|  1 | PRIMARY     | t          | eq_ref | PRIMARY        | PRIMARY | 8       | g.id |    1 | Using index    |
|  2 | DERIVED     | test_bbs   | ALL    | ind_test_bbs_g | NULL    | NULL    | NULL |   14 | Using filesort |
+----+-------------+------------+--------+----------------+---------+---------+------+------+----------------+
3 rows in set (0.00 sec)

No way. I'd say this is a bug, as our manual, http://dev.mysql.com/doc/refman/5.5/en/join.html, clearly says:

"STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order."

I had changed synopsis accordingly.
[22 Jul 2011 11:23] luo longjiu
To Valeriy Kravchuk:

thanks a lot.
谢谢你 ^_^
                     
                  DBA@TAOBAO