Bug #100966 select count(*) works very slow for uncommitted data
Submitted: 27 Sep 2020 3:46 Modified: 16 Nov 2020 15:35
Reporter: Brian Yue (OCA) Email Updates:
Status: Closed Impact on me: None
Category: MySQL Server: InnoDB storage engine Severity: S2 (Serious)
Version: MySQL 8.0.18, 8.0.21 OS: Any (rhel-7.4)
Assigned to: CPU Architecture: Any (intel x86)
Tags: contributions

[27 Sep 2020 3:46] Brian Yue
Description:
Hello, dear Verification Team:
  I found that `select count(*)` wastes several times more I/O when the scanned data is uncommitted.
  For committed data (table t1 with 10,000,000 records), `select count(*) from t1` takes 6 seconds, but for uncommitted data it takes 3 minutes.
  Please refer to `How to repeat` for details.

How to repeat:
(1) Create a table, begin a transaction, and insert many records into the table; do not commit the `insert` statement.
 
[session 1]
Make a big table with a huge number of records, and keep the insert statement uncommitted
(any method works; here I use `insert into ... select`).

mysql> create table user_card_info_hash_3 like user_card_info_hash_2;
Query OK, 0 rows affected (0.23 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into user_card_info_hash_3 select * from user_card_info_hash_2 limit 10000000;
Query OK, 10000000 rows affected (9 min 33.99 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

# here, do not commit the `insert` statement

(2) In another session, execute `select count(*)` on the table.

[session 2]
mysql> use test_sj
Database changed

mysql> select count(*) from user_card_info_hash_3;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (3 min 31.31 sec)

(3) Commit the `insert` statement, and execute `select count(*)` on the table again.

[session 1]
mysql> commit;

[session 2]
mysql> select count(*) from user_card_info_hash_3;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (6.98 sec)

We can see that `select count(*)` finishes quickly once the data is committed; `select count(*)` shows a huge performance difference between uncommitted and committed data.

Note: the more uncommitted records there are, the more obvious the slowdown.

Suggested fix:
  In the method `Parallel_reader::Ctx::traverse`, we first fetch the visible version of each record, and only then check whether the endpoint of the range has been reached.
  If many records are not visible, the endpoint is passed over, and the scan continues until the next visible record. If no record in the table is visible, the scan does not stop until the end of the table's index. A great deal of I/O resource is therefore wasted.

The fix looks like this:

@@ -529,6 +535,7 @@ dberr_t Parallel_reader::Ctx::traverse() {
     }

     const rec_t *rec = page_cur_get_rec(cur);
+    const rec_t *latest_rec = rec;

     offsets = rec_get_offsets(rec, index, offsets, ULINT_UNDEFINED, &heap);

@@ -542,8 +549,10 @@ dberr_t Parallel_reader::Ctx::traverse() {

     m_block = page_cur_get_block(cur);

-    if (rec != nullptr && end_tuple != nullptr) {
-      auto ret = end_tuple->compare(rec, index, offsets);
+    /* Bug fix: comparing against the visible record may waste a great deal
+    of I/O, because we may miss the endpoint due to invisibility. */
+    if (latest_rec != nullptr && end_tuple != nullptr) {
+      auto ret = end_tuple->compare(latest_rec, index, offsets);
[27 Sep 2020 5:56] Brian Yue
By the way, considering that the buffer pool may affect the result, I set it to a very small value.
[28 Sep 2020 11:30] Brian Yue
There was a problem with the previous implementation; the latest version is as follows:

@@ -535,10 +535,22 @@ dberr_t Parallel_reader::Ctx::traverse() {
     }

     const rec_t *rec = page_cur_get_rec(cur);

     offsets = rec_get_offsets(rec, index, offsets, ULINT_UNDEFINED, &heap);

+    /* Bug fix: comparing against the visible record may waste a great deal
+    of I/O, because we may miss the endpoint due to invisibility.
+    So we compare the endpoint against the latest version of the record instead. */
+    if (rec != nullptr && end_tuple != nullptr) {
+      auto ret = end_tuple->compare(rec, index, offsets);
+
+      /* Note: The range creation doesn't use MVCC. Therefore it's possible
+      that the range boundary entry could have been deleted. */
+      if (ret <= 0) {
+        break;
+      }
+    }
+
     auto skip = !m_scan_ctx->check_visibility(rec, offsets, heap, &mtr);
[10 Oct 2020 6:26] MySQL Verification Team
Hello Brian Yue,

Thank you for the report and supplying patch.
Please note that in order to submit contributions you must first sign the Oracle Contribution Agreement (OCA). More details are described in "Contributions" tab, please ensure to re-send the patch via that tab. Otherwise we would not be able to accept it.

regards,
Umesh
[17 Oct 2020 14:19] Brian Yue
Based on MySQL version 8.0.18.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug100966_patch.txt (text/plain), 865 bytes.

[18 Oct 2020 5:44] MySQL Verification Team
Thank you for the contribution.

regards,
Umesh
[16 Nov 2020 15:35] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.23 release, and here's the proposed changelog entry from the documentation team:

A SELECT COUNT(*) operation on a table containing uncommitted data
performed poorly due to unnecessary I/O. 

Thanks to Brian Yue for the contribution.