Bug #18454 SQL_CALC_FOUND_ROWS should calculate rows during tmp table builds/filesorts
Submitted: 23 Mar 2006 13:26 Modified: 24 Apr 2006 8:26
Reporter: Domas Mituzas
Status: Verified
Category:Server: Optimizer Severity:S4 (Feature request)
Version:4.1, 5.0, 5.1 OS:Linux (Linux)
Assigned to: Evgeny Potemkin Target Version:
Triage: Triaged: D5 (Feature request)

[23 Mar 2006 13:26] Domas Mituzas
Description:
SQL_CALC_FOUND_ROWS calculates rows after they have been filesorted or grouped in
temporary table (and therefore accounted). 

As SQL_CALC_FOUND_ROWS is usually used with LIMIT, ORDER BY .. LIMIT filesort
optimizations are not made in such cases.

Moreover, SELECT expressions are evaluated for all records, therefore adding more
performance burden for SQL_CALC_FOUND_ROWS.

How to repeat:
Any dataset:

Destroyed ORDER BY .. LIMIT optimization:

mysql> select sql_calc_found_rows last_name from contacts group by c_id order by
last_name limit 1;
+----------------------------------------------------------------+
| last_name                                                      |
+----------------------------------------------------------------+
| 0.000100455246959305598275502546812276705168187618255615234375 |
+----------------------------------------------------------------+
1 row in set (19.38 sec)

Where optimization applies:
mysql> select last_name from contacts group by c_id order by last_name limit 1;
+----------------------------------------------------------------+
| last_name                                                      |
+----------------------------------------------------------------+
| 0.000100455246959305598275502546812276705168187618255615234375 |
+----------------------------------------------------------------+
1 row in set (3.13 sec)

Useless execution of SELECT expressions for SQL_CALC_FOUND_ROWS:

No count, fast:
mysql> select sleep(0.0001),last_name from contacts group by c_id limit 1;
+---------------+------------------+
| sleep(0.0001) | last_name        |
+---------------+------------------+
| 0             | 0.72634625688786 |
+---------------+------------------+
1 row in set (0.01 sec)

Count, slow:
mysql> select sql_calc_found_rows sleep(0.0001),last_name from contacts group by c_id
limit 1;
+---------------+------------------+
| sleep(0.0001) | last_name        |
+---------------+------------------+
| 0             | 0.72634625688786 |
+---------------+------------------+
1 row in set (11.66 sec)

Calc without expression:
mysql> select sql_calc_found_rows last_name from contacts group by c_id limit 1;
+------------------+
| last_name        |
+------------------+
| 0.72634625688786 |
+------------------+
1 row in set (0.51 sec)

Suggested fix:
1. Use filesort / temporary table data for SQL_CALC_FOUND_ROWS if no HAVING rules exist.
2. Do not evaluate expressions for SQL_CALC_FOUND_ROWS, where those are not sent
[17 Aug 2006 13:56] Kay Röpke
A related case just bit me:

Suppose you do:
select sql_calc_found_rows field_in_index, not_in_index from tab where field_in_index =
42 and type_in_index=23 order by date_in_index desc limit 0,10;

For potentially big result sets (which are clipped by limit) this results in a read of
all rows in order to calculate the found_rows instead of simply counting the matching
index values.
In these cases it's better to actually do a separate select count(*) query afterwards,
even though that might imply having to lock. It's several magnitudes faster.
[13 Jan 2007 6:22] Guillaume Boissiere
On a table with 50,000 records, I found that doing:

SELECT count(0) FROM mytable WHERE ...   (to get the total count)
SELECT * FROM mytable WHERE ... LIMIT 50

was about 3 times faster than doing:

SELECT SQL_CALC_FOUND_ROWS * FROM mytable WHERE ... LIMIT 50
SELECT FOUND_ROWS()

so it is probably not worth using on large tables until this bug is fixed.
Of course, you should benchmark it on your own application to be sure.
[23 Mar 2009 21:57] bill wu
Work fine with Limit, but when I add the order by, it will be slow.