Bug #9184 slow log listing incorrect value for Rows_Examined
Submitted: 15 Mar 2005 4:58 Modified: 19 Jul 2005 13:09
Reporter: Matthew Lord Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:all OS:all ?
Assigned to: Mark Matthews CPU Architecture:Any

[15 Mar 2005 4:58] Matthew Lord
Description:
Spurious values for Rows_Examined are reported in slow log:
# Time: 050314 23:13:10
# User@Host: root[root] @ localhost []
# Query_time: 2  Lock_time: 0  Rows_sent: 50  Rows_examined: 49446
SELECT SQL_CALC_FOUND_ROWS m.USER_CREATED AS USER_CREATED, m.ID AS ID, m.DATE_CREATED AS DATE_CREATED, m.DATE_MODIFIED AS DATE_MODIFIED, m.ZIP AS ZIP, m.VISIBILITY AS VISIBILITY, m.LISTER_ID AS LISTER_ID, m.SUBJECT as SUBJECT , m.SUBJECT AS SUBJECT, jl.SALARY AS SALARY, jl.SALARY_TYPE AS SALARY_TYPE, jl.EMPLOYMENT_TYPE AS EMPLOYMENT_TYPE, jl.ORIGINATOR_TYPE AS ORIGINATOR_TYPE, jl.COMPANY AS COMPANY, jl.STREET_ADDRESS AS STREET_ADDRESS, jl.CITY AS CITY, jl.STATE AS STATE, jl.PHONE_NUMBER AS PHONE_NUMBER, jl.ALLOW_RESUME AS ALLOW_RESUME, jl.CONTACT_INFORMATION AS CONTACT_INFORMATION, m.DATE_MODIFIED AS DATE_MODIFIED, 50 AS DIST, m.LISTER_TYPE AS DISPLAY_ORDER FROM LISTING m ignore index ( listing_is_deleted_idx ), JOB_LISTING jl WHERE m.IS_DELETED='N' and m.STATUS='Active' AND jl.CONTENT_ID = m.ID and (m.TYPE != 'E') ORDER BY DATE_MODIFIED DESC, DISPLAY_ORDER ASC, DIST ASC limit 0,50;

The explain output looks like this:
+----+-------------+-------+--------+---------------+---------+---------+--------------+-------+-----------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref          | rows  | Extra                       |
+----+-------------+-------+--------+---------------+---------+---------+--------------+-------+-----------------------------+
|  1 | SIMPLE      | m     | ALL    | PRIMARY       | NULL    |    NULL | NULL         | 16482 | Using where; Using filesort |
|  1 | SIMPLE      | jl    | eq_ref | PRIMARY       | PRIMARY |      40 | csc4713.m.ID |     1 |                             |
+----+-------------+-------+--------+---------------+---------+---------+--------------+-------+-----------------------------+

Both tables each have  16482 rows.

This has been tested on linux/x86 and linux/x86_64.

How to repeat:
Install 4.1.10 or 4.0.24 on windows preferably for verification that this is not *nix specific.

Enable the slow log and set long_query_time down to 1 second.

run this query:
SELECT SQL_CALC_FOUND_ROWS m.USER_CREATED AS USER_CREATED, m.ID AS ID, m.DATE_CREATED AS DATE_CREATED, m.DATE_MODIFIED AS DATE_MODIFIED, m.ZIP AS ZIP, m.VISIBILITY AS VISIBILITY, m.LISTER_ID AS LISTER_ID, m.SUBJECT as SUBJECT , m.SUBJECT AS SUBJECT, jl.SALARY AS SALARY, jl.SALARY_TYPE AS SALARY_TYPE, jl.EMPLOYMENT_TYPE AS EMPLOYMENT_TYPE, jl.ORIGINATOR_TYPE AS ORIGINATOR_TYPE, jl.COMPANY AS COMPANY, jl.STREET_ADDRESS AS STREET_ADDRESS, jl.CITY AS CITY, jl.STATE AS STATE, jl.PHONE_NUMBER AS PHONE_NUMBER, jl.ALLOW_RESUME AS ALLOW_RESUME, jl.CONTACT_INFORMATION AS CONTACT_INFORMATION, m.DATE_MODIFIED AS DATE_MODIFIED, 50 AS DIST, m.LISTER_TYPE AS DISPLAY_ORDER FROM LISTING m ignore index ( listing_is_deleted_idx ), JOB_LISTING jl WHERE m.IS_DELETED='N' and m.STATUS='Active' AND jl.CONTENT_ID = m.ID and (m.TYPE != 'E') ORDER BY DATE_MODIFIED DESC, DISPLAY_ORDER ASC, DIST ASC limit 0,50;

Examine the end of the slow log.
[15 Mar 2005 23:36] Matthew Lord
I tested this with 4.0.24 on x86 linux with the same results.
[28 Mar 2005 17:48] MySQL Verification Team
I was able to repeat using the debug server and the machine under
high load for to get a query ~2 seconds.

Time                 Id Command    Argument
mysqld-debug, Version: 4.1.11-debug-log. started with:
Tcp port: 3306  Unix socket: MySQL
Time                 Id Command    Argument
# Time: 050327 14:44:11
# User@Host: root[root] @ localhost [127.0.0.1]
# Query_time: 2  Lock_time: 0  Rows_sent: 50  Rows_examined: 49446
use t;
SELECT SQL_CALC_FOUND_ROWS m.USER_CREATED AS USER_CREATED, m.ID AS ID,
[19 Apr 2005 19:43] Ingo Strüwing
The factor of three times row count results from the ORDER BY clause. First, all rows from LISTING are read to collect the sort keys. Then every row from LISTING is read and the corresponding row from JOB_LISTING is fetched by key.

Now my question is: what is the bug to be fixed?
[19 Apr 2005 19:56] Matthew Lord
Hi Ingo,

The Rows_Examined value should indicate how many rows from table A need to be read, not how many times those rows are examined.  It makes no logical sense to have Rows_Examined for a table higher than the number of rows.

I'll also note that I have not noticed this behavior before on any queries of a similar type.  You are saying that every time a SELECT is done that joins tables together and orders the results, the Rows_Examined for the first table will be multiplied by the number of tables joined?  I have not noticed this before.
[19 Apr 2005 20:00] Matthew Lord
Another note:

I haven't noticed this before because I'm typically using EXPLAIN to see how many
Rows_Examined would be for a query.  The EXPLAIN output is correct IMO for the 
query mentioned in the bug report.  Why would a different value be used in the slow
log output?  Why wouldn't they at least be consistent one way or the other?

Best Regards
[20 Apr 2005 20:26] Ingo Strüwing
In my opinion, EXPLAIN shows an estimate of rows to be looked at by each key and condition. 'Rows_examined', however, counts every access of a row exactly. This cannot be known in advance. There might be many cases where these values match. But in general they don't.

If you want to have these values the same, then please ask Brian to assign the bug to someone who can make such not-so-minor design decisions.

Best regards.
[19 May 2005 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".