Bug #94068 Index performance degradation, ~50 times slower after 30 hours, reproducible
Submitted: 25 Jan 2019 19:19    Modified: 6 Mar 2019 14:33
Reporter: John K
Status: Not a Bug    Impact on me: None
Category: MySQL Server    Severity: S2 (Serious)
Version: 8.0.13    OS: Linux
Assigned to:    CPU Architecture: Any
Tags: index, virtual

[25 Jan 2019 19:19] John K
Description:
TABLE:
 CREATE TABLE `users`
(
  `id`                    int(11)  NOT NULL AUTO_INCREMENT,
  `first_name`            varchar(64) COLLATE utf8_bin DEFAULT NULL,
  `last_name`             varchar(64) COLLATE utf8_bin DEFAULT NULL,
  `d_table_id`            int(11)                      DEFAULT NULL,
  `last_code`             smallint(6)                  DEFAULT NULL,
  `date_created`          datetime NOT NULL,
  `date_assigned`         datetime                     DEFAULT NULL,
  `date_finished`         datetime                     DEFAULT NULL,
  `virt_ready_for_assign` tinyint(1) GENERATED ALWAYS AS ((isnull(`date_assigned`) and isnull(`date_finished`) and (`d_table_id` > 0))) VIRTUAL,
  PRIMARY KEY (`id`),
  KEY `date_assigned` (`date_assigned`),
  KEY `d_table_id` (`d_table_id`),
  KEY `d_table_id_date_assigned` (`d_table_id`, `date_assigned`),
  KEY `date_assigned_date_finished` (`date_assigned`, `date_finished`),
  KEY `virt_ready_for_assign` (`virt_ready_for_assign` DESC)
) ENGINE = InnoDB
  AUTO_INCREMENT = 76124573
  DEFAULT CHARSET = utf8
  COLLATE = utf8_bin

The problem here is "virt_ready_for_assign": it is an indexed virtual column that is supposed to make a "SELECT id ... FOR UPDATE" on this table very fast.

Only a few rows (0-10,000) have virt_ready_for_assign = 1 at any time; all others are 0 (a few are NULL).
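
The query pattern this index is meant to serve looks roughly like the sketch below; the LIMIT and the exact shape of the locking statement are assumptions, not the production query:

-- assumed shape of the "pick rows ready for assignment" query
SELECT id
FROM users
WHERE virt_ready_for_assign = 1
LIMIT 10
FOR UPDATE;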

This works perfectly right after the index is created, but within the first half million INSERTs/UPDATEs the server starts to choke on the query and it takes 10-20 times as long.

Here is the explain when the problem happens:
"explain select count(*) from users where virt_ready_for_assign=1"

           id: 1
  select_type: SIMPLE
        table: users
   partitions: NULL
         type: ref
possible_keys: virt_ready_for_assign
          key: virt_ready_for_assign
      key_len: 2
          ref: const
         rows: 696608
     filtered: 100.00
        Extra: Using index

The estimate is being updated (the number 696608 changes all the time), but it is off by orders of magnitude.
ANALYZE TABLE does not fix that number (or the performance loss).

Here is another EXPLAIN (FORMAT=JSON):
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "77786.30"
    },
    "table": {
      "table_name": "users",
      "access_type": "ref",
      "possible_keys": [
        "virt_ready_for_assign"
      ],
      "key": "virt_ready_for_assign",
      "used_key_parts": [
        "virt_ready_for_assign"
      ],
      "key_length": "2",
      "ref": [
        "const"
      ],
      "rows_examined_per_scan": 772200,
      "rows_produced_per_join": 772200,
      "filtered": "100.00",
      "using_index": true,
      "cost_info": {
        "read_cost": "566.30",
        "eval_cost": "77220.00",
        "prefix_cost": "77786.30",
        "data_read_per_join": "1G"
      },
      "used_columns": [
        "d_table_id",
        "date_assigned",
        "date_finished",
        "virt_ready_for_assign"
      ]
    }
  }
}

The problem appears every few hours, after a few hundred thousand inserts, and gets so severe that the database stalls on it.
That is understandable if it really scans half a million rows every time.

To make sure it is an index issue, I created a new index that duplicates the faulty one and compared the same SELECT with FORCE INDEX() on each; the latency differs by roughly 10x.

I did the same yesterday (switching the sort order from ascending to DESC), with the same outcome: the database suddenly works fine again once the index is rebuilt.
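
For reference, the duplicate index was created along these lines (the exact statement is an assumption; the name virt_ready_for_assign_2 is the one used in the measurements below):

-- assumed recreation of the duplicate index compared below via FORCE INDEX
ALTER TABLE users ADD INDEX virt_ready_for_assign_2 (virt_ready_for_assign DESC);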

This is a severe problem: it will cause the server to stall completely if the index is not dropped and recreated regularly.

mysql> select count(*) from users FORCE INDEX  (virt_ready_for_assign_2) where virt_ready_for_assign=1;
+----------+
| count(*) |
+----------+
|      241 |
+----------+
1 row in set (0.02 sec)

mysql> select count(*) from users FORCE INDEX  (virt_ready_for_assign) where virt_ready_for_assign=1;
+----------+
| count(*) |
+----------+
|      300 |
+----------+
1 row in set (0.14 sec)

mysql> select count(*) from users FORCE INDEX  (virt_ready_for_assign_2) where virt_ready_for_assign=1;
+----------+
| count(*) |
+----------+
|      751 |
+----------+
1 row in set (0.01 sec)

mysql> select count(*) from users FORCE INDEX  (virt_ready_for_assign) where virt_ready_for_assign=1;
+----------+
| count(*) |
+----------+
|      745 |
+----------+
1 row in set (0.08 sec)

mysql> select count(*) from users FORCE INDEX  (virt_ready_for_assign_2) where virt_ready_for_assign=1;
+----------+
| count(*) |
+----------+
|       42 |
+----------+
1 row in set (0.01 sec)

How to repeat:
The table is populated with about 100,000 rows per hour, and each row cycles through these states (a sketch of one cycle follows the list):

1. INSERT with date_assigned = NULL, date_finished = NULL, d_table_id = 0
   -- virt_ready_for_assign => 0
2. UPDATE: set d_table_id to a value > 0
   -- virt_ready_for_assign => 1
3. UPDATE: set date_assigned = NOW()
   -- virt_ready_for_assign => 0
4. UPDATE: set date_finished = NOW()
   --> back to 1.
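
A minimal sketch of one such cycle, assuming a single session and a hand-picked d_table_id value (the real workload runs many of these in parallel):

-- 1. new row: d_table_id = 0, so virt_ready_for_assign computes to 0
INSERT INTO users (first_name, last_name, d_table_id, date_created)
VALUES ('a', 'b', 0, NOW());

-- 2. give it a d_table_id > 0: virt_ready_for_assign flips to 1
UPDATE users SET d_table_id = 42 WHERE id = LAST_INSERT_ID();

-- 3. assign it: virt_ready_for_assign drops back to 0
UPDATE users SET date_assigned = NOW() WHERE id = LAST_INSERT_ID();

-- 4. finish it; the next inserted row starts the cycle again
UPDATE users SET date_finished = NOW() WHERE id = LAST_INSERT_ID();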

After a few hundred thousand inserts, performance degrades more and more.
select count(*) from users where virt_ready_for_assign=1
select id from users where virt_ready_for_assign=1
select id from users where virt_ready_for_assign=1 FOR UPDATE

All of these queries perform significantly worse over time.

Suggested fix:
As indexes usually work fine, I think indexes on VIRTUAL columns must have a significant defect somewhere that prevents them from being updated properly.
[25 Jan 2019 20:32] John K
I would like to add:
The situation is highly concurrent.
The UPDATEs come in in a highly parallel fashion, which might play a role in the bug.

You do not need to wait for hours to verify the problem; the EXPLAIN row estimate rises with almost every insert. Only the performance degradation is not directly noticeable, as the counts are served from memory anyway.
[26 Jan 2019 0:36] John K
I have copied the table and modified the column to be STORED instead of VIRTUAL, thinking this might solve the issue.
The problem persists.
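
For reference, MySQL cannot change a generated column from VIRTUAL to STORED in place, so the change has to be a drop-and-re-add. A minimal sketch on the copied table (the table name users_copy is hypothetical):

-- dropping the column also drops the single-column index on it
ALTER TABLE users_copy DROP COLUMN virt_ready_for_assign;

-- re-add the same expression as a STORED column and index it again
ALTER TABLE users_copy
  ADD COLUMN virt_ready_for_assign tinyint(1)
    GENERATED ALWAYS AS ((isnull(date_assigned) and isnull(date_finished) and (d_table_id > 0))) STORED,
  ADD INDEX virt_ready_for_assign (virt_ready_for_assign DESC);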

I am not sure if it is a universal problem, but from my point of view an INDEX on GENERATED columns is bugged and should not be used in production environments until patched.

I am working around this for now by creating a new index and dropping the old one every 60 minutes; this keeps the server alive.
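
A minimal sketch of that hourly rotation, assuming it is driven from a cron job (the temporary index name is illustrative):

-- build a fresh index first, then swap out the degraded one
ALTER TABLE users ADD INDEX virt_ready_for_assign_new (virt_ready_for_assign DESC);
ALTER TABLE users DROP INDEX virt_ready_for_assign;
ALTER TABLE users RENAME INDEX virt_ready_for_assign_new TO virt_ready_for_assign;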
[26 Jan 2019 4:43] John K
Oh well, the situation has changed.
It is a general bug/bad behaviour and has nothing to do with the virtual column.
I rebuilt the whole table again, this time using a trigger instead of the virtual column.
The performance degradation is the same: the "query_cost" increases by roughly 10,000 every hour until the server grinds to a halt, needing 5+ seconds for a simple SELECT on an index containing 20 matching rows.
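
A minimal sketch of that trigger-based variant, assuming a plain TINYINT column named ready_for_assign kept in sync by BEFORE INSERT/UPDATE triggers (the column and trigger names are assumptions):

CREATE TRIGGER users_ready_bi BEFORE INSERT ON users FOR EACH ROW
  SET NEW.ready_for_assign = (NEW.date_assigned IS NULL
                              AND NEW.date_finished IS NULL
                              AND NEW.d_table_id > 0);

CREATE TRIGGER users_ready_bu BEFORE UPDATE ON users FOR EACH ROW
  SET NEW.ready_for_assign = (NEW.date_assigned IS NULL
                              AND NEW.date_finished IS NULL
                              AND NEW.d_table_id > 0);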

I did not have this sort of issue on 5.7; it is new, and I have realized that most of my tables have the problem.
I currently optimize small write-heavy tables from a cron job, which solves the problem for them, but the large ones are a major pain.

My guess is that it is a general issue with InnoDB indexes in write-heavy environments: they become fragmented and performance can drop by orders of magnitude within days of usage.

I'm changing the bug report title to match the problem.
[26 Jan 2019 17:33] John K
30 hours have passed and performance has degraded as before.
Original query cost was 1.1
Now it is  "query_cost": "262124.07"

Within 5-6 more hours the server would be dead from queries on this index if it were not re-created.
[5 Feb 2019 13:47] MySQL Verification Team
Hi,

Thank you for your bug report.

Sorry, but I do not think that this is a bug. You have a column which can have only two values.

Hence, your EXPLAIN shows the only possible way that query can be resolved: by scanning the entire index.

When you add several hundred thousand rows, of course that query will slow down.

You can mitigate it by running the ANALYZE TABLE command periodically, depending on the frequency of row changes.

In short, to me this seems like the expected behaviour.
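
For reference, the periodic statistics refresh suggested above would simply be (how often to run it depends on the workload):

ANALYZE TABLE users;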
[6 Mar 2019 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".