Bug #111570 Looking for inexistent data reports 0 rows examined
Submitted: 26 Jun 2023 20:03 Modified: 28 Jun 2023 0:31
Reporter: Jean-François Gagné Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.33, 5.7.42 OS:Any
Assigned to: CPU Architecture:Any

[26 Jun 2023 20:03] Jean-François Gagné
Description:
Hi,

when selecting inexistent rows, rows examined is reported as 0.  I would expect this to be as many rows as were selected, matching the status Handler_read_key.  See how to repeat for details.

This affects the slow log, the Performance Schema table events_statements_summary_by_digest, the InnoDB Metric dml_reads and the global status Innodb_rows_read.  This does not affect the Performance Schema table table_io_waits_summary_by_table.  All the details are in How to repeat.

How to repeat is for 8.0.33, but the same problem affects 5.7.42.

Another related bug is Bug#104659, in which rows examined is less than what is expected because of Index Condition Pushdown.

Many thanks for looking into this,

Jean-François Gagné

How to repeat:
# Create a sandbox with the slow query log enabled in a table.
dbdc="-c slow_query_log=ON -c log_output=TABLE -c long_query_time=0"
dbdc="$dbdc -c innodb_buffer_pool_size=$((1024*1024*1024))"
dbdeployer deploy single mysql_8.0.33 $dbdc

# Create the schema.
./use <<< "
   CREATE DATABASE test_jfg;
   CREATE TABLE test_jfg.t (
      id INTEGER NOT NULL PRIMARY KEY,
      a INTEGER NOT NULL,
      b INTEGER NOT NULL,
      KEY (a, b))"

# Insert 1000 rows in the table with skipping every other values for id.
# This is done in such a way that the a column is set to id/2
#   and the b column is set to id.
seq 1 1000 |
  awk '{print "(" $1*2, $1, $1*2  ")"}' |
  tr " " "," | paste -s -d "$(printf ',%.0s' {1..100})\n" |
  sed -e 's/.*/INSERT INTO t values &;/' |
  ./use test_jfg
  
# Recompute statistics on the table.
# If not done, things could get weird, but I do not have time to explain why here.
./use test_jfg <<< "ANALYZE TABLE t" > /dev/null

# Let's first select 5 existing rows by Primary Key, which leads to what we expect.
./use test_jfg --table <<< "
  TRUNCATE mysql.slow_log;
  FLUSH STATUS;
  SELECT * FROM t WHERE id IN (2,4,6,8,10);
  SHOW STATUS LIKE 'Handler_read_%';
  SELECT rows_sent, rows_examined, sql_text
    FROM mysql.slow_log WHERE sql_text LIKE '%SELECT%'"
+----+---+----+
| id | a | b  |
+----+---+----+
|  2 | 1 |  2 |
|  4 | 2 |  4 |
|  6 | 3 |  6 |
|  8 | 4 |  8 |
| 10 | 5 | 10 |
+----+---+----+
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 5     |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
+-----------+---------------+------------------------------------------+
| rows_sent | rows_examined | sql_text                                 |
+-----------+---------------+------------------------------------------+
|         5 |             5 | SELECT * FROM t WHERE id IN (2,4,6,8,10) |
+-----------+---------------+------------------------------------------+

# Now let's select 5 inexistent rows, and event if we called the storage engine 5 times, we have 0 rows examined, which is not what we expect.
./use test_jfg --table <<< "
  TRUNCATE mysql.slow_log;
  FLUSH STATUS;
  SELECT * FROM t WHERE id IN (1,3,5,7,9);
  SHOW STATUS LIKE 'Handler_read_%';
  SELECT rows_sent, rows_examined, sql_text
    FROM mysql.slow_log WHERE sql_text LIKE '%SELECT%'"
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 5     |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
+-----------+---------------+-----------------------------------------+
| rows_sent | rows_examined | sql_text                                |
+-----------+---------------+-----------------------------------------+
|         0 |             0 | SELECT * FROM t WHERE id IN (1,3,5,7,9) |
+-----------+---------------+-----------------------------------------+

# This also affects the Performance Schema table events_statements_summary_by_digest.
./use performance_schema <<< "TRUNCATE events_statements_summary_by_digest"; \
  ./use test_jfg <<< "SELECT * FROM t WHERE a IN (1,2,3,4,5) AND b IN (1,3,5,7,9)"; \
  ./use performance_schema <<< "
    SELECT DIGEST_TEXT, SUM_TIMER_WAIT, COUNT_STAR, SUM_ROWS_SENT, SUM_ROWS_EXAMINED
      FROM events_statements_summary_by_digest
      WHERE DIGEST_TEXT like '%FROM%t%WHERE%a%'\G"
*************************** 1. row ***************************
      DIGEST_TEXT: SELECT * FROM `t` WHERE `a` IN (...) AND `b` IN (...)
   SUM_TIMER_WAIT: 284550000
       COUNT_STAR: 1
    SUM_ROWS_SENT: 0
SUM_ROWS_EXAMINED: 0

# And the InnoDB Metric dml_reads.
./use <<< "SET GLOBAL innodb_monitor_enable = dml_reads"; \
  sql="SELECT COUNT FROM INNODB_METRICS WHERE NAME = 'dml_reads'"; \
  c1="$(./use -N information_schema <<< "${sql}")"; \
  ./use test_jfg <<< "SELECT * FROM t WHERE a IN (1,2,3,4,5) AND b IN (1,3,5,7,9)"; \
  c2="$(./use -N information_schema <<< "${sql}")"; \
  echo $c2 - $c1 = $(($c2 - $c1))
0 - 0 = 0

# And the global status Innodb_rows_read.
sql="SHOW GLOBAL STATUS LIKE 'Innodb_rows_read'"; \
  c1="$(./use -N <<< "${sql}" | awk '{print $2}')"; \
  ./use test_jfg <<< "SELECT * FROM t WHERE a IN (1,2,3,4,5) AND b IN (1,3,5,7,9)"; \
  c2="$(./use -N <<< "${sql}" | awk '{print $2}')"; \
  echo $c2 - $c1 = $(($c2 - $c1))
5 - 5 = 0

# But this does not affect the Performance Schema table table_io_waits_summary_by_table.
./use performance_schema <<< "TRUNCATE table_io_waits_summary_by_table"; \
  ./use test_jfg <<< "SELECT * FROM t WHERE a IN (1,2,3,4,5) AND b IN (1,3,5,7,9)"; \
  ./use performance_schema <<< "
    SELECT COUNT_STAR
      FROM table_io_waits_summary_by_table
      WHERE OBJECT_SCHEMA = 'test_jfg' AND OBJECT_NAME = 't'\G"
*************************** 1. row ***************************
COUNT_STAR: 25

# And this can easily grow to large proportions when the where-clause contains a AND of many IN.
./use test_jfg --table <<< "
  TRUNCATE mysql.slow_log;
  FLUSH STATUS;
  SELECT * FROM t WHERE a IN (1,2,3,4) AND b IN (1,3,5,7);
  SHOW STATUS LIKE 'Handler_read_%';
  SELECT rows_sent, rows_examined, sql_text
    FROM mysql.slow_log WHERE sql_text LIKE '%SELECT%'"
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 16    |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
+-----------+---------------+---------------------------------------------------------+
| rows_sent | rows_examined | sql_text                                                |
+-----------+---------------+---------------------------------------------------------+
|         0 |             0 | SELECT * FROM t WHERE a IN (1,2,3,4) AND b IN (1,3,5,7) |
+-----------+---------------+---------------------------------------------------------+

# Another example with larger IN.
a="$(seq 1 16   | paste -s -d ,)"; \
b="$(seq 1 2 32 | paste -s -d ,)"; \
./use test_jfg --table <<< "
  TRUNCATE mysql.slow_log;
  FLUSH STATUS;
   SELECT * FROM t WHERE a IN ($a) and b IN ($b);
  SHOW STATUS LIKE 'Handler_read_%';
  SELECT rows_sent, rows_examined, sql_text
    FROM mysql.slow_log WHERE sql_text LIKE '%SELECT%'"
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 256   |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------+
| rows_sent | rows_examined | sql_text                                                                                                                  |
+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------+
|         0 |             0 | SELECT * FROM t WHERE a IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16) and b IN (1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31) |
+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------+
[27 Jun 2023 6:31] MySQL Verification Team
Hello Jean-François,

Thank you for the report and feedback.
Verified as described.

regards,
Umesh
[28 Jun 2023 0:31] Jean-François Gagné
Related blog post on the subject:
https://jfg-mysql.blogspot.com/2023/06/rows-examined-blindspot-when-looking-for-non-existi...