Bug #42111 JOIN performance degraded
Submitted: 14 Jan 2009 14:12 Modified: 26 May 2009 9:53
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:6.0.8 OS:Windows (Vista 32 bit)
Assigned to: CPU Architecture:Any
Tags: qc

[14 Jan 2009 14:12] Peter Laursen
Description:
I remember I saw some articles etc. about the rewritten optimizer in 6.0.  It should improve the performance of SUBQUERIES.  But it seems (at least in some cases) to have seriously degraded the performance of JOINs

How to repeat:
CREATE TABLE `student` (
  `name` varchar(20) NOT NULL,
  `marks` int(11) default NULL,
  PRIMARY KEY  (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert  into `student`(`name`,`marks`) values ('joe',9);
insert  into `student`(`name`,`marks`) values ('jim',2);
insert  into `student`(`name`,`marks`) values ('jesse',9);
insert  into `student`(`name`,`marks`) values ('judy',10);
insert  into `student`(`name`,`marks`) values ('jill',0);

Now I try two differrenct queries that will both retrieve a list of students wiht 2nd highest 'marks'

-- a subquery
select student.name, student.marks
from student
where marks = (SELECT marks
               FROM student
               GROUP BY marks
               ORDER BY marks desc
               limit 1,1);

-- a join
SELECT *
FROM student.name , student.marks
  JOIN (SELECT marks
              FROM student
              GROUP BY marks
              ORDER BY marks desc
              limit 1,1) as score2
  ON (student.marks = score2.marks);
    
I execute each multiple times (to ensure that index buffers etc. are cached - but NOTE query cache is not enabled!)
.. and I SHOW PROFILE for those queries in servers 5.0.67 and 6.0.8 (server is doing nothing else).  I check for the aggregated (sum of) times with SHOW PROFILE. Both servers are local and 'sending data' do not differ much!

On server 6.0.8 w. join I get results +0.03 seconds for the aggregated sum in SHOW PROFILE in my environment - what is about *20 times as much* as with 5.0.67!! It is 'executing' and 'converting heap to maria' that account for most!
 
(but truly subquery is around 40% faster in 6.0 than 5.0!). 

Suggested fix:
not sure.. maybe this test is not significant with such small table?
[14 Jan 2009 14:15] Peter Laursen
SHOW PROFILE as reported by SQLyog 7.5 profiling feature

Attachment: profile.jpg (image/jpeg, text), 95.54 KiB.

[14 Jan 2009 14:18] Peter Laursen
corrected typo in synopsis
[14 Jan 2009 14:21] Peter Laursen
changing severity from S3 to S2
[14 Jan 2009 15:13] Peter Laursen
Silly mistake .. JOIN query should be (in the first post) like this:

SELECT student.name, student.marks
FROM student
  JOIN (SELECT marks
              FROM student
              GROUP BY marks
              ORDER BY marks desc
              limit 1,1) as score2
  ON (student.marks = score2.marks);
[14 Jan 2009 17:54] Peter Laursen
however I I do this

create temporary table score2 as 
SELECT marks
              FROM student
              GROUP BY marks
              ORDER BY marks desc
              limit 1,1;

SELECT student.name, student.marks
FROM student
  JOIN score2
    ON (student.marks = score2.marks);
[14 Jan 2009 17:55] Peter Laursen
however I I do this

create temporary table score2 as 
SELECT marks
              FROM student
              GROUP BY marks
              ORDER BY marks desc
              limit 1,1;

SELECT student.name, student.marks
FROM student
  JOIN score2
    ON (student.marks = score2.marks);
[14 Jan 2009 18:02] Peter Laursen
however I observe that if I do this

create [temporary] table score2 as 
   SELECT marks
   FROM student
   GROUP BY marks
   ORDER BY marks desc
   limit 1,1;

SELECT student.name, student.marks
FROM student
  JOIN score2
    ON (student.marks = score2.marks);

.. things look normal! So probably an issue with 'subquery in the join' (or whatever the correct term is!)

The use of SELECT .. JOIN (SELECT ...) is essential.  I can retrieve students with the 3 uppermost scores using  

SELECT student.name, student.marks
FROM student
  JOIN (SELECT marks
              FROM student
              GROUP BY marks
              ORDER BY marks desc
              limit 0,3) as score2
  ON (student.marks = score2.marks);

but rewriting SUBQUERY to retrieve same fails

select student.name, student.marks
from student
where marks IN (SELECT marks
               FROM student
               GROUP BY marks
               ORDER BY marks desc
               limit 0,3);
.. returns
Error Code : 1235
This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
[17 Jan 2009 12:12] Peter Laursen
Seems to have been fixed/improved in 6.0.9 (just downloaded from mirror).  Also PROFILING implementation has changed so much in this new version that SQLyog 7.5 (beta) will not work out of the box with this server version! :) Just wait for beta3 :-)
   
But then let us use 'clumb man line':

Enter password: ********
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 34
Server version: 6.0.9-alpha-community MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SELECT VERSION();
+-----------------------+
| VERSION()             |
+-----------------------+
| 6.0.9-alpha-community |
+-----------------------+
1 row in set (0.00 sec)

mysql> set profiling = ON;
Query OK, 0 rows affected (0.00 sec)

mysql> use test;
Database changed
mysql> SELECT student.name , student.marks
    -> FROM student
    ->   JOIN (SELECT marks
    ->               FROM student
    ->               GROUP BY marks
    ->               ORDER BY marks desc
    ->               limit 1,1) as score2
    ->   ON (student.marks = score2.marks);
+-------+-------+
| name  | marks |
+-------+-------+
| jesse |     9 |
| joe   |     9 |
+-------+-------+
2 rows in set (0.00 sec)

mysql> show profiles;
+----------+------------+-------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----------------+
| Query_ID | Duration   | Query

                 |
+----------+------------+-------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----------------+
|        1 | 0.00017800 | SELECT DATABASE()

                 |
|        2 | 0.00168250 | SELECT student.name , student.marks
FROM student
  JOIN (SELECT marks
              FROM student
              GROUP BY marks
              ORDER BY marks desc
              limit 1,1) as score2
  ON (student.marks = score2.marks) |
+----------+------------+-------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----------------+
2 rows in set (0.00 sec)

mysql> select state, duration from Information_Schema.profiling where query_id =
2;
+----------------------+----------+
| state                | duration |
+----------------------+----------+
| starting             | 0.000239 |
| Opening tables       | 0.000127 |
| System lock          | 0.000011 |
| Table lock           | 0.000398 |
| optimizing           | 0.000010 |
| statistics           | 0.000027 |
| preparing            | 0.000025 |
| Creating tmp table   | 0.000164 |
| executing            | 0.000005 |
| Copying to tmp table | 0.000144 |
| Sorting result       | 0.000050 |
| Sending data         | 0.000019 |
| removing tmp table   | 0.000017 |
| Sending data         | 0.000010 |
| init                 | 0.000032 |
| optimizing           | 0.000019 |
| statistics           | 0.000026 |
| preparing            | 0.000021 |
| executing            | 0.000004 |
| Sending data         | 0.000099 |
| end                  | 0.000009 |
| query end            | 0.000005 |
| freeing items        | 0.000108 |
| removing tmp table   | 0.000012 |
| closing tables       | 0.000084 |
| logging slow query   | 0.000009 |
| cleaning up          | 0.000009 |
+----------------------+----------+
27 rows in set (0.00 sec)

mysql>
[17 Jan 2009 12:43] Peter Laursen
ooops .. whne you have 7 mysql servers and 5 SQLyog versions running mistakes happens!  It works fine in SQLyog 7.5 too.  I paste the (aggregated) result from from 6.0.9 as displayed in SQLyog for comparison wiht my first image!
[17 Jan 2009 12:44] Peter Laursen
see post above!

Attachment: hereitisagain.jpg (image/jpeg, text), 95.78 KiB.

[20 Jan 2009 20:17] Valeriy Kravchuk
So, do you think that the problem is solved in 6.0.9?
[20 Jan 2009 20:27] Peter Laursen
It looks so to me.  But I think the people doing the optimizer code in 6.0.x should have the final say!
[23 Mar 2009 18:35] Sergey Petrunya
Hi!

This can't be related to subquery optimizations as they don't handle FROM-clause subqueries at this point (we're working on them also but that code is not in the main branch). 

It could be that is related to some problem in Maria (since it is handling temporary tables). 

Do EXPLAIN outputs from 5.1 and 6.0 differ?
[23 Mar 2009 18:43] Peter Laursen
I do not have the actual version installed any more, so I am afraid I cannot answer (if it was a question for me).

Obviously the server created a disk (Maria) table for this small temporary table with the version where I first reported this (6.0.8). It is not reproducable for me in 6.0.9.

So if the 6.0 optimizer team think they are in control, they can close now I think. And if such unnessecary memory table >disk table conversions still take place in situations they may set as 'verified (and post what further details they have)
[26 May 2009 9:53] Sveta Smirnova
Thank you for the feedback.

As problem solved for you and you can not answer questions, because changed setup I close the report as "Can't repeat" Feel free to reopen it if problem occurs again.