Bug #43368 STRAIGHT_JOIN doesn't work for nested joins
Submitted: 4 Mar 2009 11:56 Modified: 10 Feb 2012 3:11
Reporter: Jaka Jančar Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.30, 5.1.34-bzr OS:Any
Assigned to: CPU Architecture:Any

[4 Mar 2009 11:56] Jaka Jančar
Description:
STRAIGHT_JOIN in JOIN clause (as opposed to after SELECT) doesn't work for nested joins.

Query:

SELECT * FROM
    involvedParties
STRAIGHT_JOIN (
    calendarEntries
    LEFT JOIN events ON events.calendarEntryId = calendarEntries.id
    LEFT JOIN todos ON todos.calendarEntryId = calendarEntries.id
) ON calendarEntries.id = involvedParties.calendarEntryId;

Should join tables in order: involvedParties, calendarEntries, events, todos. Instead it joins calendarEntries first:
+----+-------------+-----------------+--------+----------------------------------------------+----------------------------------------------+---------+--------------------------------------+------+-------------+
| id | select_type | table           | type   | possible_keys                                | key                                          | key_len | ref                                  | rows | Extra       |
+----+-------------+-----------------+--------+----------------------------------------------+----------------------------------------------+---------+--------------------------------------+------+-------------+
|  1 | SIMPLE      | calendarEntries | index  | PRIMARY                                      | PRIMARY                                      | 4       | NULL                                 |   47 | Using index | 
|  1 | SIMPLE      | events          | eq_ref | events_calendarEntryId_uk                    | events_calendarEntryId_uk                    | 4       | iadmin_instance_1.calendarEntries.id |    1 |             | 
|  1 | SIMPLE      | todos           | eq_ref | todos_calendarEntryId_uk                     | todos_calendarEntryId_uk                     | 4       | iadmin_instance_1.calendarEntries.id |    1 |             | 
|  1 | SIMPLE      | involvedParties | ref    | involvedParties_calendarEntryId_listOrder_uk | involvedParties_calendarEntryId_listOrder_uk | 4       | iadmin_instance_1.calendarEntries.id |    1 |             | 
+----+-------------+-----------------+--------+----------------------------------------------+----------------------------------------------+---------+--------------------------------------+------+-------------+

Using STRAIGHT_JOIN after the SELECT works, but has the negative side effect that ALL joins are then performed in the specified order, instead of just those explicitly forced to:

SELECT STRAIGHT_JOIN * FROM
    involvedParties
JOIN (
    calendarEntries
    LEFT JOIN events ON events.calendarEntryId = calendarEntries.id
    LEFT JOIN todos ON todos.calendarEntryId = calendarEntries.id
) ON calendarEntries.id = involvedParties.calendarEntryId;

+----+-------------+-----------------+--------+----------------------------------------------+---------------------------+---------+---------------------------------------------------+------+-------------+
| id | select_type | table           | type   | possible_keys                                | key                       | key_len | ref                                               | rows | Extra       |
+----+-------------+-----------------+--------+----------------------------------------------+---------------------------+---------+---------------------------------------------------+------+-------------+
|  1 | SIMPLE      | involvedParties | ALL    | involvedParties_calendarEntryId_listOrder_uk | NULL                      | NULL    | NULL                                              |  116 |             | 
|  1 | SIMPLE      | calendarEntries | eq_ref | PRIMARY                                      | PRIMARY                   | 4       | iadmin_instance_1.involvedParties.calendarEntryId |    1 | Using index | 
|  1 | SIMPLE      | events          | eq_ref | events_calendarEntryId_uk                    | events_calendarEntryId_uk | 4       | iadmin_instance_1.calendarEntries.id              |    1 |             | 
|  1 | SIMPLE      | todos           | eq_ref | todos_calendarEntryId_uk                     | todos_calendarEntryId_uk  | 4       | iadmin_instance_1.calendarEntries.id              |    1 |             | 
+----+-------------+-----------------+--------+----------------------------------------------+---------------------------+---------+---------------------------------------------------+------+-------------+

How to repeat:
I can't provide full sample tables that would reproduce the case, because I don't know the optimizer well enough to trick it into reordering joins so that not working of STRAIGHT_JOIN would be apparent.
[4 Mar 2009 14:52] Valeriy Kravchuk
Thank you for the problem report. Please, send the results of SHOW CREATE TABLE and SHOW TABLE STATUS for the tables used in your report.
[4 Mar 2009 16:14] Jaka Jančar
CREATE TABLE `involvedParties` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `calendarEntryId` int(11) NOT NULL,
  `listOrder` int(11) NOT NULL,
  `personId` int(11) DEFAULT NULL,
  `companyId` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `involvedParties_calendarEntryId_listOrder_uk` (`calendarEntryId`,`listOrder`),
  UNIQUE KEY `involvedParties_personId_calendarEntryId_uk` (`personId`,`calendarEntryId`),
  UNIQUE KEY `involvedParties_companyId_calendarEntryId_uk` (`companyId`,`calendarEntryId`),
  CONSTRAINT `involvedParties_calendarEntryId_fk` FOREIGN KEY (`calendarEntryId`) REFERENCES `calendarEntries` (`id`) ON DELETE CASCADE,
  CONSTRAINT `involvedParties_companyId_fk` FOREIGN KEY (`companyId`) REFERENCES `companies` (`id`) ON DELETE CASCADE,
  CONSTRAINT `involvedParties_personId_fk` FOREIGN KEY (`personId`) REFERENCES `persons` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=2467 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `calendarEntries` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1078 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `events` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `calendarEntryId` int(11) NOT NULL,
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `creationTimestamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `lastModificationTimestamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `status` enum('OPEN','SUCCESSFUL','UNSUCCESSFUL','CANCELED') COLLATE utf8_unicode_ci DEFAULT NULL,
  `startTimestamp` timestamp NULL DEFAULT NULL,
  `startDate` date DEFAULT NULL,
  `endTimestamp` timestamp NULL DEFAULT NULL,
  `endDate` date DEFAULT NULL,
  `location` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `notes` blob,
  PRIMARY KEY (`id`),
  UNIQUE KEY `events_calendarEntryId_uk` (`calendarEntryId`),
  KEY `events_status_endTimestamp_endDate_k` (`status`,`endTimestamp`,`endDate`),
  CONSTRAINT `events_calendarEntryId_fk` FOREIGN KEY (`calendarEntryId`) REFERENCES `calendarEntries` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=117 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `todos` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `calendarEntryId` int(11) NOT NULL,
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `creationTimestamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `lastModificationTimestamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `closingTimestamp` timestamp NULL DEFAULT NULL,
  `status` enum('NEW','ACTIVE','ON HOLD','SUCCESSFUL','UNSUCCESSFUL','CANCELED') COLLATE utf8_unicode_ci DEFAULT NULL,
  `assigneePersonId` int(11) DEFAULT NULL,
  `priority` int(11) DEFAULT NULL,
  `startTimestamp` timestamp NULL DEFAULT NULL,
  `startDate` date DEFAULT NULL,
  `dueTimestamp` timestamp NULL DEFAULT NULL,
  `dueDate` date DEFAULT NULL,
  `location` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `notes` blob,
  PRIMARY KEY (`id`),
  UNIQUE KEY `todos_calendarEntryId_uk` (`calendarEntryId`),
  KEY `todos_assigneePersonId_k` (`assigneePersonId`),
  KEY `todos_status_lastModificationTimestamp_k` (`status`,`lastModificationTimestamp`),
  CONSTRAINT `todos_assigneePersonId_fk` FOREIGN KEY (`assigneePersonId`) REFERENCES `persons` (`id`) ON DELETE SET NULL,
  CONSTRAINT `todos_calendarEntryId_fk` FOREIGN KEY (`calendarEntryId`) REFERENCES `calendarEntries` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=959 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

show table status;
+----------------------------------------------+--------+---------+------------+-------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| Name                                         | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length  | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation       | Checksum | Create_options | Comment |
+----------------------------------------------+--------+---------+------------+-------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| calendarEntries                              | InnoDB |      10 | Compact    |  1015 |             64 |       65536 |                0 |            0 |   8388608 |           1078 | 2008-12-11 23:27:16 | NULL                | NULL       | utf8_unicode_ci |     NULL |                |         | 
| events                                       | InnoDB |      10 | Compact    |    85 |            578 |       49152 |                0 |        32768 |   8388608 |            117 | 2008-12-11 23:27:16 | NULL                | NULL       | utf8_unicode_ci |     NULL |                |         | 
| involvedParties                              | InnoDB |      10 | Compact    |  2557 |             44 |      114688 |                0 |       229376 |   8388608 |           2467 | 2008-12-11 23:27:16 | NULL                | NULL       | utf8_unicode_ci |     NULL |                |         | 
| todos                                        | InnoDB |      10 | Compact    |   865 |            189 |      163840 |                0 |        49152 |   8388608 |            959 | 2008-12-11 23:27:16 | NULL                | NULL       | utf8_unicode_ci |     NULL |                |         | 
+----------------------------------------------+--------+---------+------------+-------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
[4 Mar 2009 18:42] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.1.31 (or 5.1.32 that should appear soon). On recent 5.1.33 I've got the expected join order:

mysql> explain SELECT * FROM     involvedParties STRAIGHT_JOIN (     calendarEntries     LEFT JOIN events ON events.calendarEntryId = calendarEntries.id     LEFT JOIN todos ON todos.calendarEntryId = calendarEntries.id ) ON calendarEntries.id = involvedParties.calendarEntryId\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: involvedParties
         type: ALL
possible_keys: involvedParties_calendarEntryId_listOrder_uk
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
        Extra: 
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: calendarEntries
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.involvedParties.calendarEntryId
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: events
         type: eq_ref
possible_keys: events_calendarEntryId_uk
          key: events_calendarEntryId_uk
      key_len: 4
          ref: test.calendarEntries.id
         rows: 1
        Extra: 
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: todos
         type: eq_ref
possible_keys: todos_calendarEntryId_uk
          key: todos_calendarEntryId_uk
      key_len: 4
          ref: test.calendarEntries.id
         rows: 1
        Extra: 
4 rows in set (0.00 sec)

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.1.33-debug | 
+--------------+
1 row in set (0.00 sec)
[4 Mar 2009 19:03] Jaka Jančar
Valeriy,

do you also get the same join order when using just JOIN (instead of STRAIGHT_JOIN)?

Because if you do, that doesn't mean STRAIGHT_JOIN works, it just mean that the optimizer chose the optimal execution plan anyways. And it often does so for me too!

If you check my related thread in the forums[1], you'll see that I've had problems with the optimizer sometimes incorrectly optimizing the query. I thought I fixed it with STRAIGHT_JOIN, but it turns out I really didn't.

(I have also posted SHOW TABLE STATUS from when the optimizer correctly optimizes the query and from when it doesn't, but that's a separate issue. I just want to force a certain order now explicitly using STRAIGHT_JOIN so it always works the same.)

[1] http://forums.mysql.com/read.php?24,246290,247046#msg-247046
[31 Mar 2009 15:45] Valeriy Kravchuk
Actually, it is easy to create an example where STRAIGHT_JOIN does not work as documented in the manual, http://dev.mysql.com/doc/refman/5.1/en/join.html:

"STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table."

Look:

mysql> create table t1(c1 int primary key, c2 char(10));
Query OK, 0 rows affected (0.40 sec)

mysql> create table t2(c1 int primary key, c2 char(10), ref_t1 int);
Query OK, 0 rows affected (0.40 sec)

mysql> create table t3(c1 int primary key, c2 char(10), ref_t1 int);
Query OK, 0 rows affected (0.41 sec)

mysql> create table t4(c1 int primary key, c2 char(10), ref_t1 int);
Query OK, 0 rows affected (0.82 sec)

mysql> insert into t1 values(1,'a');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t2 values(1,'a', 1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t3 values(1,'a', 1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t4 values(1,'a', 1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t4 values(2,'a', 2);
Query OK, 1 row affected (0.01 sec)

mysql> insert into t4 values(3,'a', 3);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t4 values(4,'a', 4);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values(2,'b');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values(3,'c');
Query OK, 1 row affected (0.00 sec)

Number of rows in the tables:

t1: 3
t2: 1
t3: 1
t4: 4

Now, SELECT with joins:

mysql> explain select * from t4 join (t1 join t3 on t3.ref_t1 = t1.c1 join t2 on t2.ref_t1 = t1.c1) on t4.ref_t1 = t1.c1;
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | t3    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t2    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t1    | const  | PRIMARY       | PRIMARY | 4       | const |    1 |             | 
|  1 | SIMPLE      | t4    | ALL    | NULL          | NULL    | NULL    | NULL  |    4 | Using where | 
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
4 rows in set (0.00 sec)

And STRAIGHT_JOIN is ignored:

mysql> explain select * from t4 straight_join (t1 join t3 on t3.ref_t1 = t1.c1 join t2 on t2.ref_t1 = t1.c1) on t4.ref_t1 = t1.c1;
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | t3    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t2    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t1    | const  | PRIMARY       | PRIMARY | 4       | const |    1 |             | 
|  1 | SIMPLE      | t4    | ALL    | NULL          | NULL    | NULL    | NULL  |    4 | Using where | 
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
4 rows in set (0.00 sec)

mysql> explain select straight_join * from t4 join (t1 join t3 on t3.ref_t1 = t1.c1 join t2 on t2.ref_t1 = t1.c1) on t4.ref_t1 = t1.c1;
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | t3    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t2    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t1    | const  | PRIMARY       | PRIMARY | 4       | const |    1 |             | 
|  1 | SIMPLE      | t4    | ALL    | NULL          | NULL    | NULL    | NULL  |    4 | Using where | 
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
4 rows in set (0.00 sec)

Without nested joins it is NOT ignored:

mysql> explain select * from t4 straight_join t1 on t4.ref_t1 = t1.c1 join t3 on t3.ref_t1 = t1.c1 join t2 on t2.ref_t1 = t1.c1;
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | t3    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t2    | system | NULL          | NULL    | NULL    | NULL  |    1 |             | 
|  1 | SIMPLE      | t4    | ALL    | NULL          | NULL    | NULL    | NULL  |    4 | Using where | 
|  1 | SIMPLE      | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | const |    1 | Using where | 
+----+-------------+-------+--------+---------------+---------+---------+-------+------+-------------+
4 rows in set (0.00 sec)

As you can see, t4 is read BEFORE t1 (as we requested) in this case.

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.1.34-debug | 
+--------------+
1 row in set (0.00 sec)

This should be at least clearly documented.
[10 Feb 2012 3:11] Paul DuBois
Noted in 5.6.5 changelog.

The optimizer mishandled STRAIGHT_JOIN used with nested joins; for
example, by not evaluating tables in the specified order.