Bug #52081 Index hint/forcing for GROUP BY / ORDER BY is broken
Submitted: 16 Mar 2010 8:38 Modified: 13 Aug 2010 2:47
Reporter: Ole John Aske Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.41, 5.1.44, 5.6.99 OS:Any
Assigned to: Jørgen Løland CPU Architecture:Any

[16 Mar 2010 8:38] Ole John Aske
Description:
Bug found during code inspection:

There is a mixup between GROUP BY and ORDER BY in which indexes that should be considdered / allowed during optimization of a query. The bug is a simple mixup of argument 'is_order_by' to create_sort_index() in JOIN::exec

---------------
      /*
	Here we sort rows for ORDER BY/GROUP BY clause, if the optimiser
	chose FILESORT to be faster than INDEX SCAN or there is no 
	suitable index present.
	Note, that create_sort_index calls test_if_skip_sort_order and may
	finally replace sorting with index scan if there is a LIMIT clause in
	the query. XXX: it's never shown in EXPLAIN!
	OPTION_FOUND_ROWS supersedes LIMIT and is taken into account.
      */
      if (create_sort_index(thd, curr_join,
			    curr_join->group_list ? 
			    curr_join->group_list : curr_join->order,
			    curr_join->select_limit,
			    (select_options & OPTION_FOUND_ROWS ?
			     HA_POS_ERROR : unit->select_limit_cnt),
                            curr_join->group_list ? TRUE : FALSE))  // FIXME: Looks like swapped TRUE / FALSE logic?

------------------------------

Above: 'is_order_by' is TRUE if curr_join->group_list is supplied as 'order' argumenet! This does not make sense !!!!

NOTE:
In this case EXPLAIN is useless to investigate index usage as JOIN::exec has its own codepath for SELECT_DESCRIBE. DESCRIBE will call test_if_skip_sort_order() itself to recalculate index usage. The outcome of this calculation *may not* be the same as the index which is actually used to execute the query - Which could be regarded as a bug in itself. Also see bug#43342 which partly concerns incorrect EXPLAIN'ed usage of indexes for order by / group by.

How to repeat:
Fill a test database with 1.000.000 rows:

create table t(pk int primary key) engine = myisam;;
insert into t values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

CREATE TABLE t1 (
  a int NOT NULL,
  b int NOT NULL,
  c int NOT NULL,
  d int NOT NULL,
  PRIMARY KEY (`a`,`b`),
  KEY (c), KEY(d)
) ENGINE=myisam;

insert into t1
select
  t5.pk*10000,
  t1.pk + (t2.pk*10) + (t3.pk*100) + (t4.pk*1000) + (t6.pk*100000),
  t2.pk, t3.pk
 from t t1, t t2, t t3, t t4, t t5, t t6;

--> 1000000 rows inserted.

When specifying no index hint/force the query executes really fast (as expected) as KEY(d)is used by the optimizer.

mysql> select t1.c, t1.d from t1 
    -> order by t1.d limit 10;
+---+---+
| c | d |
+---+---+
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
+---+---+
10 rows in set (0.00 sec)

-- We then tell the optimizer to ignore index 'd' for order by and would
expect the 'order by' to take considderable longer time.

mysql> select t1.c, t1.d from t1 IGNORE INDEX FOR ORDER BY(d)
    -> order by t1.d limit 10;
+---+---+
| c | d |
+---+---+
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
+---+---+
10 rows in set (0.00 sec)

-- 0.00sec is not as expected wo/ index.
-- However, if we change to ignore 'FOR GROUP BY'

mysql> select t1.c, t1.d from t1 IGNORE INDEX FOR GROUP BY(d)
    -> order by t1.d limit 10;
+---+---+
| c | d |
+---+---+
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
+---+---+
10 rows in set (0.32 sec)

-- Which should prove that group by / order by logic has been inverted as described.

Suggested fix:
One liner fix: Swap TRUE / FALSE where the argument is incorrect.

Has tested this in my sandbox and the 'How to repeat' examples now behaves as expected.
[16 Mar 2010 9:42] Sveta Smirnova
Please do not submit the same bug more than once. An existing bug report already describes this very problem. Even if you feel that your issue is somewhat different, the resolution is likely
to be the same. Because of this, we hope you add your comments to the original bug instead.

Thank you for your interest in MySQL.

Duplicate of bug #45373
[16 Mar 2010 10:07] Ole John Aske
I do not agree that this is a duplicate of bug45373. 

- Bug#45373 basically covers that EXPLAIN *indicate* incorrect index usage.
- This bug (#52081) shows by increased execution time that index hints do not work as expected (Sidenote: If I EXPLAIN my examples they actually claims that the correct index *is* used - Which they aren't)

When JOIN:exec() is executed with the DESCRIBE option it actually recalculate its own index usage *outside* the normal execution path for execute. It may therefore end up explaining another index usage than actually *is* executed· 

As the code path is not the same for EXPLAIN & execute, I would also claim that these bugs are not a duplicate of each other. My proposed fix is also inside the codepath which is only taken by a full execute, and does not affect/fix the EXPLAINed output in bug#45373. Neither would a fix for the incorrect EXPLAIN in bug#45373 fix this bug - or the actual index usage.

A much better design had of course been to always calculate index usage in a common place prior to both EXPLAIN and execute....

Setting status back to 'Open' for a new evaluation based in this comment.
[16 Mar 2010 10:35] Sveta Smirnova
Thank you for the feedback.

You are correct: explain shows same results for all queries. Setting to "Verified'.
[16 Mar 2010 11:42] Jørgen Løland
This was fixed in 6.0 as part of BUG#49867, but not ported back since that bug was only repeatable i 6.0
[16 Mar 2010 11:46] Ole John Aske
It should be simple to correct the TRUE / FALSE logic though in the 5.1 branch.
[17 Mar 2010 9:36] Jørgen Løland
Ole John, yes this should be very easy to fix with minimal risk. Awaiting decision from triage.
[19 Mar 2010 13:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103822

3412 Jorgen Loland	2010-03-19
      Bug#52081: Index hint/forcing for GROUP BY / ORDER BY is broken
      
      Found during code inspection: There is a mixup between GROUP BY 
      and ORDER BY in which indexes that should be considered/allowed 
      during optimization of a query. The bug is a simple mixup of 
      argument 'is_order_by' to create_sort_index() in JOIN::exec().
      
      No test case committed as only problem is increased execution
      time due to wrong plan, and this does not show in EXPLAIN.
     @ sql/sql_select.cc
        Fixed mixup between GROUP BY/ORDER BY for JOIN::exec() call to 
        create_sort_index().
[19 Mar 2010 14:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103848

3131 Jorgen Loland	2010-03-19
      Bug#52081: Index hint/forcing for GROUP BY / ORDER BY is broken
            
      Found during code inspection: There is a mixup between GROUP BY 
      and ORDER BY in which indexes that should be considered/allowed 
      during optimization of a query. The bug is a simple mixup of 
      argument 'is_order_by' to create_sort_index() in JOIN::exec().
            
      No test case included as only problem is increased execution
      time due to wrong plan, and this does not show in EXPLAIN.
     @ sql/sql_select.cc
        Fixed mixup between GROUP BY/ORDER BY for JOIN::exec() call to 
        create_sort_index().
[19 Mar 2010 14:41] Jørgen Løland
Pushed to next-mr-bugfixing.
[19 Mar 2010 14:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103856

3984 Jorgen Loland	2010-03-19 [merge]
      Null-merge BUG#52081 from 5.1-bugteam to mysql-pe. Fix already
      present in 6.0 codebase.
[19 Mar 2010 14:51] Jørgen Løland
Null-merged cs to 6.0
[24 Mar 2010 8:15] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100324081249-yfwol7qtcek6dh7w) (version source revid:alik@sun.com-20100324081113-kc7x1iytnplww91u) (merge vers: 6.0.14-alpha) (pib:16)
[24 Mar 2010 8:18] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100324081159-5b8juv8ldiqwce8v) (version source revid:alik@sun.com-20100324081105-y72rautcea375zxm) (pib:16)
[26 Mar 2010 8:31] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100326081944-qja07qklw1p2w7jb) (version source revid:alik@sun.com-20100325073410-4t4i9gu2u1pge7xb) (merge vers: 6.0.14-alpha) (pib:16)
[4 Aug 2010 8:06] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804080001-bny5271e65xo34ig) (version source revid:alik@sun.com-20100324081105-y72rautcea375zxm) (merge vers: 5.6.99-m4) (pib:18)
[4 Aug 2010 8:22] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@ibmvm-20100804081533-c1d3rbipo9e8rt1s) (version source revid:alik@sun.com-20100324081105-y72rautcea375zxm) (merge vers: 5.6.99-m4) (pib:18)
[13 Aug 2010 2:47] Paul DuBois
Noted in 5.6.0, 6.0.14 changelogs.

There was a mixup between GROUP BY and ORDER BY concerining which
indexes should be considered or permitted during query optimization.