MySQL Bugs: #18454: SQL_CALC_FOUND_ROWS should calculate rows during tmp table builds/filesorts

Bug #18454	SQL_CALC_FOUND_ROWS should calculate rows during tmp table builds/filesorts
Submitted:	23 Mar 2006 12:26	Modified:	25 Jun 2011 0:48
Reporter:	Domas Mituzas	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S4 (Feature request)
Version:	4.1, 5.0, 5.1	OS:	Linux (Linux)
Assigned to:	Evgeny Potemkin	CPU Architecture:	Any

Description:
SQL_CALC_FOUND_ROWS calculates rows after they have been filesorted or grouped in temporary table (and therefore accounted). 

As SQL_CALC_FOUND_ROWS is usually used with LIMIT, ORDER BY .. LIMIT filesort optimizations are not made in such cases.

Moreover, SELECT expressions are evaluated for all records, therefore adding more performance burden for SQL_CALC_FOUND_ROWS.

How to repeat:
Any dataset:

Destroyed ORDER BY .. LIMIT optimization:

mysql> select sql_calc_found_rows last_name from contacts group by c_id order by last_name limit 1;
+----------------------------------------------------------------+
| last_name                                                      |
+----------------------------------------------------------------+
| 0.000100455246959305598275502546812276705168187618255615234375 |
+----------------------------------------------------------------+
1 row in set (19.38 sec)

Where optimization applies:
mysql> select last_name from contacts group by c_id order by last_name limit 1;
+----------------------------------------------------------------+
| last_name                                                      |
+----------------------------------------------------------------+
| 0.000100455246959305598275502546812276705168187618255615234375 |
+----------------------------------------------------------------+
1 row in set (3.13 sec)

Useless execution of SELECT expressions for SQL_CALC_FOUND_ROWS:

No count, fast:
mysql> select sleep(0.0001),last_name from contacts group by c_id limit 1;
+---------------+------------------+
| sleep(0.0001) | last_name        |
+---------------+------------------+
| 0             | 0.72634625688786 |
+---------------+------------------+
1 row in set (0.01 sec)

Count, slow:
mysql> select sql_calc_found_rows sleep(0.0001),last_name from contacts group by c_id limit 1;
+---------------+------------------+
| sleep(0.0001) | last_name        |
+---------------+------------------+
| 0             | 0.72634625688786 |
+---------------+------------------+
1 row in set (11.66 sec)

Calc without expression:
mysql> select sql_calc_found_rows last_name from contacts group by c_id limit 1;
+------------------+
| last_name        |
+------------------+
| 0.72634625688786 |
+------------------+
1 row in set (0.51 sec)

Suggested fix:
1. Use filesort / temporary table data for SQL_CALC_FOUND_ROWS if no HAVING rules exist.
2. Do not evaluate expressions for SQL_CALC_FOUND_ROWS, where those are not sent

A related case just bit me:

Suppose you do:
select sql_calc_found_rows field_in_index, not_in_index from tab where field_in_index = 42 and type_in_index=23 order by date_in_index desc limit 0,10;

For potentially big result sets (which are clipped by limit) this results in a read of all rows in order to calculate the found_rows instead of simply counting the matching index values.
In these cases it's better to actually do a separate select count(*) query afterwards, even though that might imply having to lock. It's several magnitudes faster.

On a table with 50,000 records, I found that doing:

SELECT count(0) FROM mytable WHERE ...   (to get the total count)
SELECT * FROM mytable WHERE ... LIMIT 50

was about 3 times faster than doing:

SELECT SQL_CALC_FOUND_ROWS * FROM mytable WHERE ... LIMIT 50
SELECT FOUND_ROWS()

so it is probably not worth using on large tables until this bug is fixed.
Of course, you should benchmark it on your own application to be sure.

Work fine with Limit, but when I add the order by, it will be slow.

Fixed in 5.6.3

Correction: Fixed in 5.6.2.

not really... I just reproduced it on 5.6.10

see the perf I'm having on MariaDB... maybe u wanna backport some code ;-)
 - https://mariadb.atlassian.net/browse/MDEV-4592

This is still a problem for me.

I am using MYSQL ver 5.7.15 and still seeing this bug.

Any chance it is fixed somewhere?