Bug #104421 Bad plan for WHERE customer_id = @cid ORDER BY id LIMIT 10 with compound PK.
Submitted: 26 Jul 2021 19:58 Modified: 29 Jul 2021 7:03
Reporter: Jean-François Gagné
Status: Verified
Category: MySQL Server: Optimizer
Severity: S2 (Serious)
Version: 8.0.26, 5.7.35
OS: Any
CPU Architecture: Any

[26 Jul 2021 19:58] Jean-François Gagné
Description:
Hi,

I am getting a bad query plan for a query SELECT ... WHERE customer_id = @cid ORDER BY id LIMIT 10.  I would expect the index on (customer_id, id) to be chosen for serving this query, but the optimizer chooses the index on (id).  Some more information: id is an auto-increment and the primary key is compound on (customer_id, another_column, id).  See How to repeat for details.

My guess for explaining the bad plan is a problem with the estimate on rows examined.  I would expect the estimate to be 10 (because of the matching index and LIMIT), but I am getting a much larger value in EXPLAIN.  See How to repeat for details.

Also about the bad estimate on rows examined, I get the expected estimate for a query ORDER BY id LIMIT 10 without a WHERE clause, but I am getting a bad estimate when adding WHERE id > @id.  I am mentioning this because I think this might be related to this bug.  See How to repeat for details.

When the PRIMARY KEY is on id (with an index on customer_id), I am not getting a bad plan, but still getting a bad estimate for rows examined.  See How to repeat for details.

How to repeat is for MySQL 8.0.26, but I am getting similar results for 5.7.35, see first comment for details.

Many thanks for looking into this,

Jean-François Gagné

How to repeat:
# Initialise a sandbox.
dbdeployer deploy single mysql_8.0.26

# Load a dump similar to what is in prod (without indexes for fast load).

# Create indexes and add auto-increment.
alter table p add index(id), modify column id bigint unsigned NOT NULL AUTO_INCREMENT, add index(customer_id, id);

# Table size.
-rw-r----- 1 xxx xxx 15G Jul 26 17:33 msb_mysql_8_0_26/data/test_jfg/p.ibd

# Show the table structure.
mysql [localhost:8026] {msandbox} (test_jfg) > show create table p\G
*************************** 1. row ***************************
       Table: p
Create Table: CREATE TABLE `p` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `customer_id` int unsigned NOT NULL,
  `object_id` bigint unsigned NOT NULL,
  `type_id` tinyint unsigned NOT NULL,
  `details` text,
  PRIMARY KEY (`customer_id`,`object_id`,`id`),
  KEY `id` (`id`),
  KEY `customer_id` (`customer_id`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=344801716 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

# The plan for the query is using the wrong index.
mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id, object_id, details FROM p WHERE customer_id = @cid ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: index
possible_keys: PRIMARY,customer_id
          key: id
      key_len: 8
          ref: NULL
         rows: 782
     filtered: 1.28
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
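
# Note (my interpretation, not confirmed): in the plan above, rows x filtered
# is about the LIMIT (782 * 1.28 / 100 = 10.0096), which suggests the
# optimizer assumes the matching rows are spread uniformly along the id index
# and expects to satisfy the LIMIT after scanning about LIMIT / filtered
# index entries.  The arithmetic can be checked with:
SELECT 782 * 1.28 / 100;  # = 10.0096, about the LIMIT of 10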

# And the query is horribly slow (interrupted after 10 seconds, having already scanned a lot of rows).
mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; set max_execution_time = 10000; flush local status; SELECT id, object_id, details FROM p WHERE customer_id = @cid ORDER BY id LIMIT 10; pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

ERROR 3024 (HY000): Query execution was interrupted, maximum statement execution time exceeded
Default pager wasn't set, using stdout.
+-----------------------+---------+
| Variable_name         | Value   |
+-----------------------+---------+
| Handler_read_first    | 1       |
| Handler_read_key      | 1       |
| Handler_read_last     | 0       |
| Handler_read_next     | 1195087 |
| Handler_read_prev     | 0       |
| Handler_read_rnd      | 0       |
| Handler_read_rnd_next | 0       |
+-----------------------+---------+
7 rows in set (0.00 sec)

# The good plan, obtained by hinting an index, has a wrong rows estimate (I would expect 10).
mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id, object_id, details FROM p use index (customer_id) WHERE customer_id = @cid ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: ref
possible_keys: customer_id
          key: customer_id
      key_len: 4
          ref: const
         rows: 1281340
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

# And executing the good plan, with the index hint, is fast and only examines 10 rows (as expected).
mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; set max_execution_time = 10000; flush local status; SELECT id, object_id, details FROM p use index (customer_id) WHERE customer_id = @cid ORDER BY id LIMIT 10\G pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

10 rows in set (0.00 sec)

Default pager wasn't set, using stdout.
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 9     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

# I would expect the good plan to have a rows estimate like this one.
mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id FROM p ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: index
possible_keys: NULL
          key: id
      key_len: 8
          ref: NULL
         rows: 10
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.00 sec)

# When I add WHERE id > @id to the query with a good estimate, the estimate becomes wrong even though the query is fast.
mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id FROM p WHERE id > @id ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: range
possible_keys: PRIMARY,id,customer_id
          key: id
      key_len: 8
          ref: NULL
         rows: 54202687
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; flush local status; SELECT id FROM p WHERE id > @id ORDER BY id LIMIT 10; pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

10 rows in set (0.00 sec)

Default pager wasn't set, using stdout.
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 9     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

# Showing this has something to do with the compound PK: with a simple PK on id, the plan is good, but the rows estimate is still bad.
mysql [localhost:8026] {msandbox} (test_jfg) > alter table p drop primary key, add primary key (id), drop index id, drop index customer_id, add index(customer_id);
Query OK, 0 rows affected (25 min 40.30 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql [localhost:8026] {msandbox} (test_jfg) > show create table p\G
*************************** 1. row ***************************
       Table: p
Create Table: CREATE TABLE `p` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `customer_id` int unsigned NOT NULL,
  `object_id` bigint unsigned NOT NULL,
  `type_id` tinyint unsigned NOT NULL,
  `details` text,
  PRIMARY KEY (`id`),
  KEY `customer_id` (`customer_id`)
) ENGINE=InnoDB AUTO_INCREMENT=344801716 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id, object_id, details FROM p WHERE customer_id = @cid ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: ref
possible_keys: customer_id
          key: customer_id
      key_len: 4
          ref: const
         rows: 1366822
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; set max_execution_time = 10000; flush local status; SELECT id, object_id, details FROM p WHERE customer_id = @cid ORDER BY id LIMIT 10; pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

10 rows in set (0.00 sec)

Default pager wasn't set, using stdout.
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 9     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

Suggested fix:
As presented in Description and shown in How to repeat, I think the problem is the bad estimate on rows examined.  I would expect this estimate to be only 10 because of the matching index and LIMIT, but I am getting a much larger value.  If this was fixed, I think we would be getting a much saner plan.

Also note that as presented in Description and shown in How to repeat, I am getting the right estimate when I do ORDER BY id LIMIT 10 without a WHERE clause, but I am getting a bad estimate when adding WHERE id > @id, so this also needs attention (maybe related to this bug, I do not know).

Finally, as adding WHERE id > @id breaks the estimate in the paragraph above, I would also expect adding AND id > @id to my initial query (the one with the bad plan) to break the estimate as well, so please make sure to also take this case into account in the fix.
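
For clarity, the combined case from the paragraph above would be a query of this shape (hypothetical, I did not run it in How to repeat):

explain SELECT id, object_id, details FROM p WHERE customer_id = @cid AND id > @id ORDER BY id LIMIT 10\G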
[26 Jul 2021 20:00] Jean-François Gagné
Reproduction for 5.7.35.

Attachment: 5.7.35.txt (text/plain), 7.94 KiB.

[28 Jul 2021 10:27] MySQL Verification Team
Hello Jean-François,

Thank you for the report and feedback.
I tried to reproduce the issue at my end but am not seeing it.
I'll be attaching the activity log shortly for your reference (the only diff being that I modified the *alter* to include the PK, to match your table definition).
Please let me know if I'm missing anything. Thank you.

Sincerely,
Umesh
[28 Jul 2021 10:27] MySQL Verification Team
MySQL Server 8.0.26 test results

Attachment: 104421_8.0.26.log (application/octet-stream, text), 7.32 KiB.

[28 Jul 2021 20:44] Jean-François Gagné
Thanks for looking into that Umesh.

If I am modifying your perl script to this:

for(my $i=1; $i <= 25*1000; $i++) {
        my $txt = 'x' x 10;
        my $customer_id =  1 + int(rand(4294967295-1));
        my $object_id =  1 + int(rand(4294967295-1));
        my $type_id = 1 + int(rand(255 - 1));
        my $j=$i*2-1;
        print "$j,$customer_id,$object_id,$type_id,$txt\n";
        $j++;
        print "$j,123456789,$object_id,$type_id,$txt\n";
}

And if I load it this way:

{ echo "BEGIN;";
  perl bug104421_jfg.pl |
    while read l; do echo "INSERT INTO p VALUES(${l/xxxxxxxxxx/\'xxxxxxxxxx\'});"; done |
    sed -e '1000~1000s/^/COMMIT;BEGIN;/'; } |
  ./use test_jfg

After the alter, I am getting the bad plan:

explain SELECT id, object_id, details FROM p WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: index
possible_keys: PRIMARY,customer_id
          key: id
      key_len: 8
          ref: NULL
         rows: 20
     filtered: 50.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
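
Note that, as in the Description, rows x filtered again matches the LIMIT (my interpretation):

SELECT 20 * 50.00 / 100;  # = 10, the LIMIT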

In this case, the bad plan is fast because the rows with customer_id = 123456789 are evenly distributed in the table:

mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; flush local status; SELECT id, object_id, details FROM p WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

10 rows in set (0.00 sec)

Default pager wasn't set, using stdout.
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 1     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 19    |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.01 sec)

Things change if I mess up the row distribution, using this script to generate data:

for(my $i=1; $i <= 25*1000; $i++) {
        my $txt = 'x' x 10;
        my $customer_id =  1 + int(rand(4294967295-1));
        my $object_id =  1 + int(rand(4294967295-1));
        my $type_id = 1 + int(rand(255 - 1));
        print "$i,$customer_id,$object_id,$type_id,$txt\n";
}

for(my $i=1; $i <= 25*1000; $i++) {
        my $txt = 'x' x 10;
        my $customer_id =  1 + int(rand(4294967295-1));
        my $object_id =  1 + int(rand(4294967295-1));
        my $type_id = 1 + int(rand(255 - 1));
        my $j=25*1000+$i*2-1;
        print "$j,$customer_id,$object_id,$type_id,$txt\n";
        $j++;
        print "$j,123456789,$object_id,$type_id,$txt\n";
}

We still get the same bad plan:

mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id, object_id, details FROM p WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: index
possible_keys: PRIMARY,customer_id
          key: id
      key_len: 8
          ref: NULL
         rows: 20
     filtered: 50.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

But the query execution is bad (the runtime is OK, but rows examined is bad, and the runtime would be much worse if the table were bigger and did not fit in RAM):

mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; flush local status; SELECT id, object_id, details FROM p WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

10 rows in set (0.06 sec)

Default pager wasn't set, using stdout.
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 1     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 25019 |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

And this is why I think the plan below is better, but I do not understand why we need to hint for it (as written in the Description, my guess is that the rows estimate below is bad):

mysql [localhost:8026] {msandbox} (test_jfg) > explain SELECT id, object_id, details FROM p use index (customer_id) WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: ref
possible_keys: customer_id
          key: customer_id
      key_len: 4
          ref: const
         rows: 37439
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

And the good plan above leads to a good execution time and rows examined:

mysql [localhost:8026] {msandbox} (test_jfg) > pager cat > /dev/null; flush local status; SELECT id, object_id, details FROM p use index (customer_id) WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G pager; show status like "Handler_read%";
PAGER set to 'cat > /dev/null'
Query OK, 0 rows affected (0.00 sec)

10 rows in set (0.00 sec)

Default pager wasn't set, using stdout.
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 9     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)

I hope this helps.  Thanks again Umesh for the time spent on this, and best regards,

Jean-François Gagné
[28 Jul 2021 21:46] Jean-François Gagné
One more thing: maybe this bug is similar to Bug#97001, also described in [1] and [2].  I do not think it is a duplicate though, as Bug#97001 is about the optimizer having 2 bad choices and choosing the worse one (avoiding a sort), while this bug is about the optimizer clearly choosing a bad plan when a good one exists.

[1]: https://www.percona.com/community-blog/2019/07/29/mysql-optimizer-naughty-aberrations-on-q...

[2]: https://blog.jcole.us/2019/09/30/reconsidering-access-paths-for-index-ordering-a-dangerous...
[29 Jul 2021 4:01] MySQL Verification Team
Thank you, Jean-François.
I'll give it a try today to confirm at my end.

Sincerely,
Umesh
[29 Jul 2021 7:03] MySQL Verification Team
Thank you, Jean-François.

-- 8.0.26
cat bug104421_jfg.pl

for(my $i=1; $i <= 25*1000; $i++) {
        my $txt = 'x' x 10;
        my $customer_id =  1 + int(rand(4294967295-1));
        my $object_id =  1 + int(rand(4294967295-1));
        my $type_id = 1 + int(rand(255 - 1));
        my $j=$i*2-1;
        print "$j,$customer_id,$object_id,$type_id,$txt\n";
        $j++;
        print "$j,123456789,$object_id,$type_id,$txt\n";
}

{ echo "BEGIN;";
  perl bug104421_jfg.pl |
    while read l; do echo "INSERT INTO p VALUES(${l/xxxxxxxxxx/\'xxxxxxxxxx\'});"; done |
    sed -e '1000~1000s/^/COMMIT;BEGIN;/'; } |
  bin/mysql -uroot -S /tmp/mysql_ushastry.sock --local-infile test_jfg
  

mysql> explain SELECT id, object_id, details FROM p WHERE customer_id = 123456789 ORDER BY id LIMIT 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
   partitions: NULL
         type: index
possible_keys: PRIMARY,customer_id
          key: id
      key_len: 8
          ref: NULL
         rows: 20
     filtered: 50.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

mysql> show status like "Handler_read%";

+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 1     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 19    |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+
7 rows in set (0.00 sec)


regards,
Umesh