Bug #40006 Performance depends on functions used?
Submitted: 13 Oct 2008 15:04 Modified: 22 Oct 2008 10:56
Reporter: Andre Timmer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:5.0.18 OS:Any
Assigned to: CPU Architecture:Any

[13 Oct 2008 15:04] Andre Timmer
Description:
Performance depends on functions used?

How to repeat:
create table profile2 (
  context varchar(200) NOT NULL
, value   longtext
) engine myisam;

-- inserting 80.000 rows
delimiter //
create procedure sp_test()
begin
  declare x integer default 0;
  --
  REPEAT 
      SET x = x + 1; 
      insert into profile2 values ('cn=jane,ou=org,c=nl' , '200809011200');
      insert into profile2 values ('cn=chris,ou=org,c=nl', '200809011215');
      insert into profile2 values ('cn=blue,ou=org,c=nl' , '200809011400');
      insert into profile2 values ('cn=lisa,ou=org,c=nl' , '200809011405');
      insert into profile2 values ('cn=rene,ou=org,c=nl' , '200809011600');
      insert into profile2 values ('cn=frans,ou=org,c=nl', '200809011710');
      insert into profile2 values ('cn=bob,ou=org,c=nl'  , '200809011905');
      insert into profile2 values ('cn=marie,ou=org,c=nl', '200809023000');      
   UNTIL x >= 10000 END REPEAT;
   
   commit;
end;
//
delimiter ;

call sp_test();
select concat('Rows inserted: ', count(*)) info  from profile2;

-- slow select
select context
,      substr(value, 1, 6) yyyymm
,      count(*) aantal
from   profile2
group by 1,2;

+----------------------+--------+--------+
| context              | yyyymm | aantal |
+----------------------+--------+--------+
| cn=blue,ou=org,c=nl  |        |  10000 |
| cn=bob,ou=org,c=nl   |        |  10000 |
| cn=chris,ou=org,c=nl |        |  10000 |
| cn=frans,ou=org,c=nl |        |  10000 |
| cn=jane,ou=org,c=nl  |        |  10000 |
| cn=lisa,ou=org,c=nl  |        |  10000 |
| cn=marie,ou=org,c=nl |        |  10000 |
| cn=rene,ou=org,c=nl  |        |  10000 |
+----------------------+--------+--------+
8 rows in set (13.09 sec)

-- fast select
select context
,      substr(concat(value, ' '), 1, 6) yyyymm
,      count(*) aantal
from   profile2
group by 1,2;

+----------------------+--------+--------+
| context              | yyyymm | aantal |
+----------------------+--------+--------+
| cn=blue,ou=org,c=nl  | 200809 |  10000 |
| cn=bob,ou=org,c=nl   | 200809 |  10000 |
| cn=chris,ou=org,c=nl | 200809 |  10000 |
| cn=frans,ou=org,c=nl | 200809 |  10000 |
| cn=jane,ou=org,c=nl  | 200809 |  10000 |
| cn=lisa,ou=org,c=nl  | 200809 |  10000 |
| cn=marie,ou=org,c=nl | 200809 |  10000 |
| cn=rene,ou=org,c=nl  | 200809 |  10000 |
+----------------------+--------+--------+
8 rows in set (8.79 sec)

The second select is 30% faster.

Suggested fix:
Both selects should be of the same performance.

Column value is of type longtext.

Maybe for sorting the engine thinks it needs much space.
It should detect that 6 characters is enough.
- substr(value, 1, 6)              --> sql engine extracts less information    
- substr(concat(value, ' '), 1, 6) --> sql engine extracts more information
[13 Oct 2008 15:23] MySQL Verification Team
Thank you for the bug report. Could you please test with latest released version 5.0.18 is quite older. Thanks.
[22 Oct 2008 10:56] Andre Timmer
PC: Windows 2000

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.0.67-community MySQL Community Edition (GPL)

.. creating table and procedure

mysql> call sp_test();
Query OK, 0 rows affected (4.19 sec)

mysql> select concat('Rows inserted: ', count(*)) info  from profile2;
+----------------------+
| info                 |
+----------------------+
| Rows inserted: 80000 |
+----------------------+
1 row in set (0.03 sec)

mysql> -- slow select
mysql> select context
    -> ,      substr(value, 1, 6) yyyymm
    -> ,      count(*) aantal
    -> from   profile2
    -> group by 1,2;
+----------------------+--------+--------+
| context              | yyyymm | aantal |
+----------------------+--------+--------+
| cn=blue,ou=org,c=nl  | 200809 |  10000 |
| cn=bob,ou=org,c=nl   | 200809 |  10000 |
| cn=chris,ou=org,c=nl | 200809 |  10000 |
| cn=frans,ou=org,c=nl | 200809 |  10000 |
| cn=jane,ou=org,c=nl  | 200809 |  10000 |
| cn=lisa,ou=org,c=nl  | 200809 |  10000 |
| cn=marie,ou=org,c=nl | 200809 |  10000 |
| cn=rene,ou=org,c=nl  | 200809 |  10000 |
+----------------------+--------+--------+
8 rows in set (0.38 sec)

mysql> select context
    -> ,      substr(concat(value, ' '), 1, 6) yyyymm
    -> ,      count(*) aantal
    -> from   profile2
    -> group by 1,2;
+----------------------+--------+--------+
| context              | yyyymm | aantal |
+----------------------+--------+--------+
| cn=blue,ou=org,c=nl  | 200809 |  10000 |
| cn=bob,ou=org,c=nl   | 200809 |  10000 |
| cn=chris,ou=org,c=nl | 200809 |  10000 |
| cn=frans,ou=org,c=nl | 200809 |  10000 |
| cn=jane,ou=org,c=nl  | 200809 |  10000 |
| cn=lisa,ou=org,c=nl  | 200809 |  10000 |
| cn=marie,ou=org,c=nl | 200809 |  10000 |
| cn=rene,ou=org,c=nl  | 200809 |  10000 |
+----------------------+--------+--------+
8 rows in set (0.45 sec)

Difference is negligable on Windows 2000 using MySQL 5.0.67 so i'm closing this call.