Bug #116195 | Undo log scan at startup can cause high downtime if large transaction was open | ||
---|---|---|---|
Submitted: | 22 Sep 2024 9:37 | Modified: | 24 Sep 2024 15:30 |
Reporter: | Shankar Iyer | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server | Severity: | S5 (Performance) |
Version: | 8.0,9.0,5.7 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[22 Sep 2024 9:37]
Shankar Iyer
[23 Sep 2024 10:21]
MySQL Verification Team
HI Mr Iyer, Thank you for your bug report. However, we have to try to repeat the behaviour with and without a patch. Hence, send us a full SQL test case ...... including CREATE TABLE and INSERT's ...... you can use sysbench for that purpose, if necessary. Also, let us know when we should kill the daemon exactly ........ Hence, we need a fully repeatable test case.
[24 Sep 2024 0:58]
Marc Reilly
To repro: env: - r6i.4xlarge - 800G GP3 storage - CNF, not sure it really matters apart from logging for NOTE messages. ``` innodb_redo_log_capacity=10G innodb_buffer_pool_size=10G innodb-buffer-pool-load-at-startup=OFF log_error_verbosity=3 ``` generate sysbench table with 100m rows ./sysbench /usr/local/share/sysbench/oltp_write_only.lua --table-size=2000000 --mysql-db=test --tables=5 --threads=5 prepare Use 3 terminals/sessions: session 1; sysbench /usr/local/share/sysbench/oltp_update_index.lua --db-driver=mysql --mysql-user=msandbox --mysql-password=msandbox --mysql-db=test --mysql-host=localhost --mysql-socket=/home/ec2-user/mysql-server/bld/mysql-test/var/tmp/mysqld.1.sock --threads=100 --rand-type=uniform --db-ps-mode=disable --tables=2 --forced-shutdown --table-size=100000000 --report-interval=1 --time=60 run Session 2: begin; select * from test.sbtest1 limit 1; update sbtest2 set pad='jdnbckjanbckjabsckjasbcjasbcjkabadvc'; # dont commit select trx_state,trx_lock_structs,trx_lock_memory_bytes,trx_rows_locked,trx_rows_modified from information_schema.innodb_trx\G *************************** 1. row *************************** trx_state: RUNNING trx_lock_structs: 1369865 trx_lock_memory_bytes: 154787960 trx_rows_locked: 101369864 trx_rows_modified: 100000000 $ ls -lahrt 90-datadir-dirty/undo_00* -rw-r-----. 1 ec2-user ec2-user 8.9G Sep 24 00:05 90-datadir-dirty/undo_002 -rw-r-----. 1 ec2-user ec2-user 32M Sep 24 00:05 90-datadir-dirty/undo_001 Terminal 3 once session 2 completes update: sudo kill -9 `pidof mysqld` sudo sync; sudo sh -c "echo 3 > /proc/sys/vm/drop_caches" Now start mysql ====== 9.0.1 - NO PATCH 321 seconds ======= 2024-09-24T00:06:14.090823Z 1 [Note] [MY-012535] [InnoDB] Apply batch completed! 2024-09-24T00:06:40.337898Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'. 2024-09-24T00:06:40.338603Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'. 2024-09-24T00:06:40.339695Z 1 [Note] [MY-012910] [InnoDB] Opened 2 existing undo tablespaces. 2024-09-24T00:06:40.339739Z 1 [Note] [MY-011980] [InnoDB] GTID recovery trx_no: 229464 2024-09-24T00:06:40.839240Z 1 [Note] [MY-013776] [InnoDB] Parallel initialization of rseg complete 2024-09-24T00:06:40.839280Z 1 [Note] [MY-013777] [InnoDB] Time taken to initialize rseg using 4 thread: 499 ms. 2024-09-24T00:06:40.839308Z 1 [Note] [MY-014017] [InnoDB] Transaction ID: 196797 found for resurrecting updates 2024-09-24T00:06:40.840479Z 1 [Note] [MY-014018] [InnoDB] Identified table ID: 1070 to acquire lock 2024-09-24T00:07:10.839626Z 1 [Note] [MY-014020] [InnoDB] Records read: 13473673 - Pages read: 77882 2024-09-24T00:07:40.839740Z 1 [Note] [MY-014020] [InnoDB] Records read: 22791972 - Pages read: 131745 2024-09-24T00:08:10.839788Z 1 [Note] [MY-014020] [InnoDB] Records read: 31994707 - Pages read: 184940 2024-09-24T00:08:40.840256Z 1 [Note] [MY-014020] [InnoDB] Records read: 41131875 - Pages read: 237756 2024-09-24T00:09:10.840405Z 1 [Note] [MY-014020] [InnoDB] Records read: 50016463 - Pages read: 289112 2024-09-24T00:09:40.840682Z 1 [Note] [MY-014020] [InnoDB] Records read: 59531982 - Pages read: 344115 2024-09-24T00:10:10.840947Z 1 [Note] [MY-014020] [InnoDB] Records read: 68732295 - Pages read: 397296 2024-09-24T00:10:40.841725Z 1 [Note] [MY-014020] [InnoDB] Records read: 77966170 - Pages read: 450671 2024-09-24T00:11:10.842185Z 1 [Note] [MY-014020] [InnoDB] Records read: 86953520 - Pages read: 502621 2024-09-24T00:11:40.842579Z 1 [Note] [MY-014020] [InnoDB] Records read: 94717760 - Pages read: 547501 2024-09-24T00:12:02.574074Z 1 [Note] [MY-014021] [InnoDB] Total records resurrected: 100000000 - Total pages read: 577895 - Total tables acquired: 1 2024-09-24T00:12:02.575541Z 1 [Note] [MY-014023] [InnoDB] Resurrected 1 transactions doing updates. 2024-09-24T00:12:02.575613Z 1 [Note] [MY-013023] [InnoDB] 1 transaction(s) which must be rolled back or cleaned up in total 100000000 row operations to undo 2024-09-24T00:12:02.575623Z 1 [Note] [MY-013024] [InnoDB] Trx id counter is 229889 ====== 9.0.1 - PATCH 185 seconds ======= 2024-09-24T00:14:22.956041Z 1 [Note] [MY-012535] [InnoDB] Apply batch completed! 2024-09-24T00:14:49.324140Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'. 2024-09-24T00:14:49.325380Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'. 2024-09-24T00:14:49.326940Z 1 [Note] [MY-012910] [InnoDB] Opened 2 existing undo tablespaces. 2024-09-24T00:14:49.327014Z 1 [Note] [MY-011980] [InnoDB] GTID recovery trx_no: 229464 2024-09-24T00:14:49.795611Z 1 [Note] [MY-013776] [InnoDB] Parallel initialization of rseg complete 2024-09-24T00:14:49.795647Z 1 [Note] [MY-013777] [InnoDB] Time taken to initialize rseg using 4 thread: 468 ms. 2024-09-24T00:14:49.796118Z 1 [Note] [MY-014017] [InnoDB] Transaction ID: 196797 found for resurrecting updates 2024-09-24T00:14:49.797440Z 1 [Note] [MY-014018] [InnoDB] Identified table ID: 1070 to acquire lock 2024-09-24T00:15:19.796305Z 1 [Note] [MY-014020] [InnoDB] Records read: 19529538 - Pages read: 112887 2024-09-24T00:15:49.796634Z 1 [Note] [MY-014020] [InnoDB] Records read: 35073242 - Pages read: 202735 2024-09-24T00:16:19.796816Z 1 [Note] [MY-014020] [InnoDB] Records read: 50611929 - Pages read: 292554 2024-09-24T00:16:49.797064Z 1 [Note] [MY-014020] [InnoDB] Records read: 66172933 - Pages read: 382502 2024-09-24T00:17:19.797256Z 1 [Note] [MY-014020] [InnoDB] Records read: 81735148 - Pages read: 472457 2024-09-24T00:17:49.797524Z 1 [Note] [MY-014020] [InnoDB] Records read: 97294768 - Pages read: 562397 2024-09-24T00:17:54.970497Z 1 [Note] [MY-014021] [InnoDB] Total records resurrected: 100000000 - Total pages read: 577895 - Total tables acquired: 1 2024-09-24T00:17:54.971637Z 1 [Note] [MY-014023] [InnoDB] Resurrected 1 transactions doing updates. 2024-09-24T00:17:54.971691Z 1 [Note] [MY-013023] [InnoDB] 1 transaction(s) which must be rolled back or cleaned up in total 100000000 row operations to undo 2024-09-24T00:17:54.971700Z 1 [Note] [MY-013024] [InnoDB] Trx id counter is 229889
[24 Sep 2024 0:58]
Marc Reilly
====== 8.4.2 - no PATCH 346 seconds ======= 2024-09-24T00:41:29.987262Z 1 [Note] [MY-012535] [InnoDB] Apply batch completed! 2024-09-24T00:41:30.088280Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'. 2024-09-24T00:41:30.089526Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'. 2024-09-24T00:41:30.090534Z 1 [Note] [MY-012910] [InnoDB] Opened 2 existing undo tablespaces. 2024-09-24T00:41:30.090575Z 1 [Note] [MY-011980] [InnoDB] GTID recovery trx_no: 141309 2024-09-24T00:41:30.283803Z 1 [Note] [MY-013776] [InnoDB] Parallel initialization of rseg complete 2024-09-24T00:41:30.283837Z 1 [Note] [MY-013777] [InnoDB] Time taken to initialize rseg using 4 thread: 193 ms. 2024-09-24T00:41:30.283863Z 1 [Note] [MY-014017] [InnoDB] Transaction ID: 103761 found for resurrecting updates 2024-09-24T00:41:30.283877Z 1 [Note] [MY-014018] [InnoDB] Identified table ID: 1070 to acquire lock 2024-09-24T00:42:00.284008Z 1 [Note] [MY-014020] [InnoDB] Records read: 8129300 - Pages read: 46990 2024-09-24T00:42:30.284406Z 1 [Note] [MY-014020] [InnoDB] Records read: 17251590 - Pages read: 99720 2024-09-24T00:43:00.284578Z 1 [Note] [MY-014020] [InnoDB] Records read: 26085143 - Pages read: 150781 2024-09-24T00:43:30.284725Z 1 [Note] [MY-014020] [InnoDB] Records read: 34698467 - Pages read: 200569 2024-09-24T00:44:00.284999Z 1 [Note] [MY-014020] [InnoDB] Records read: 43218371 - Pages read: 249817 2024-09-24T00:44:30.285106Z 1 [Note] [MY-014020] [InnoDB] Records read: 51949335 - Pages read: 300285 2024-09-24T00:45:00.285527Z 1 [Note] [MY-014020] [InnoDB] Records read: 60759360 - Pages read: 351210 2024-09-24T00:45:30.285652Z 1 [Note] [MY-014020] [InnoDB] Records read: 69414377 - Pages read: 401239 2024-09-24T00:46:00.286081Z 1 [Note] [MY-014020] [InnoDB] Records read: 77817333 - Pages read: 449811 2024-09-24T00:46:30.286579Z 1 [Note] [MY-014020] [InnoDB] Records read: 86644658 - Pages read: 500836 2024-09-24T00:47:00.286772Z 1 [Note] [MY-014020] [InnoDB] Records read: 95372162 - Pages read: 551284 2024-09-24T00:47:16.761724Z 1 [Note] [MY-014021] [InnoDB] Total records resurrected: 100000000 - Total pages read: 577895 - Total tables acquired: 1 2024-09-24T00:47:16.762840Z 1 [Note] [MY-014023] [InnoDB] Resurrected 1 transactions doing updates. 2024-09-24T00:47:16.762892Z 1 [Note] [MY-013023] [InnoDB] 1 transaction(s) which must be rolled back or cleaned up in total 100000000 row operations to undo ====== 8.4.2 - PATCH 192 seconds ======= 2024-09-24T00:48:52.335807Z 1 [Note] [MY-012535] [InnoDB] Apply batch completed! 2024-09-24T00:48:52.436728Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'. 2024-09-24T00:48:52.437852Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'. 2024-09-24T00:48:52.438925Z 1 [Note] [MY-012910] [InnoDB] Opened 2 existing undo tablespaces. 2024-09-24T00:48:52.438963Z 1 [Note] [MY-011980] [InnoDB] GTID recovery trx_no: 141309 2024-09-24T00:48:52.617756Z 1 [Note] [MY-013776] [InnoDB] Parallel initialization of rseg complete 2024-09-24T00:48:52.617781Z 1 [Note] [MY-013777] [InnoDB] Time taken to initialize rseg using 4 thread: 178 ms. 2024-09-24T00:48:52.617804Z 1 [Note] [MY-014017] [InnoDB] Transaction ID: 103761 found for resurrecting updates 2024-09-24T00:48:52.617816Z 1 [Note] [MY-014018] [InnoDB] Identified table ID: 1070 to acquire lock 2024-09-24T00:49:22.618117Z 1 [Note] [MY-014020] [InnoDB] Records read: 15951495 - Pages read: 92205 2024-09-24T00:49:52.618139Z 1 [Note] [MY-014020] [InnoDB] Records read: 31514056 - Pages read: 182162 2024-09-24T00:50:22.618265Z 1 [Note] [MY-014020] [InnoDB] Records read: 47056203 - Pages read: 272001 2024-09-24T00:50:52.618943Z 1 [Note] [MY-014020] [InnoDB] Records read: 62614612 - Pages read: 361934 2024-09-24T00:51:22.620199Z 1 [Note] [MY-014020] [InnoDB] Records read: 78180460 - Pages read: 451910 2024-09-24T00:51:52.620498Z 1 [Note] [MY-014020] [InnoDB] Records read: 93744578 - Pages read: 541876 2024-09-24T00:52:04.641167Z 1 [Note] [MY-014021] [InnoDB] Total records resurrected: 100000000 - Total pages read: 577895 - Total tables acquired: 1 2024-09-24T00:52:04.642310Z 1 [Note] [MY-014023] [InnoDB] Resurrected 1 transactions doing updates. 2024-09-24T00:52:04.642360Z 1 [Note] [MY-013023] [InnoDB] 1 transaction(s) which must be rolled back or cleaned up in total 100000000 row operations to undo 2024-09-24T00:52:04.642370Z 1 [Note] [MY-013024] [InnoDB] Trx id counter is 141569
[24 Sep 2024 1:00]
Marc Reilly
Correction to sysbench gen command: generate sysbench table with 100m rows should be ./sysbench /usr/local/share/sysbench/oltp_write_only.lua --table-size=100000000 --mysql-db=test --tables=2 --threads=2 prepare
[24 Sep 2024 6:21]
Shankar Iyer
Thank you Mark! Just summarizing the storage read throughput across all logs :- My original unpatch'ed 9.0.1 --------------------------- 321734 pages x 16384 = 5.27GB in 140 seconds = 37 MB/sec My 9.01 with optimization patch ------------------------------- 321734 pages x 16384 = 5.27GB in 25 seconds = 210 MB/sec Mark - ======9.0.1 - NO PATCH 321 seconds======= ------------------------------ 577895 x 16384 = 9.4GB in 321 seconds = 29 MB/sec Mark - ====== 9.0.1 - PATCH 185 seconds ======= ------------------------------ 577895 x 16384 = 9.4GB in 185 seconds = 51 MB/sec Mark - ======8.4.2 - no PATCH 346 seconds ======= ------------------------------- 577895 x 16384 = 9.4GB in 346 seconds = 27 MB/sec Mark - ======8.4.2 - PATCH 192 seconds ======= ------------------------------- 577895 x 16384 = 9.4GB in 192 seconds = 50 MB/sec The improvement in undo log scan time and corresponding reduction in downtime will depend on :- 1. VM type & storage volume bandwidth/IOPS guaranteed by Cloud provider. 2. Is MySQL is built with native AIO? If MySQL is not built with AIO, the number of concurrent page reads will be limited by the number of background read threads.
[24 Sep 2024 12:03]
MySQL Verification Team
HI Mr. Iyer and Reilly, Sorry, we could not reproduce the problem. We followed your instructions to the last letter. We only used our official binaries, without any patches added. On our machine , we changed the configuration as you indicated, while all other settings were used as they are. On 8.0.39 we had recovery in 35 seconds, on 8.4.2 we had recovery of 33 seconds and on 9.0.1 we had a recovery of 29 seconds. We can't repeat your problem. Can't repeat.
[24 Sep 2024 13:29]
Shankar Iyer
>On 8.0.39 we had recovery in 35 seconds, on 8.4.2 we had recovery of 33 seconds and on 9.0.1 we had a recovery of 29 seconds. If this was tried with the Sysbench schema test case from the earlier comment : 9.4 GB read in ~30 seconds indicates 300 MB/s storage throughput. Can you let know if you were trying on instance with local SSD? Single threaded/single page synchronous read on a ultra-low latency, local NVME SSD on-prem could show good throughput. Again, if a outage of ~30 seconds can be reduced to 15 seconds by the patch, that will be still a win. The problem reported in this ticket is more acute in Cloud and with network attached storage. Typically latencies are as high as 500us to 2ms. Thus the single threaded/single page read through reaches on 30MB-50MB/sec.
[24 Sep 2024 13:31]
Shankar Iyer
Correction : Thus the single threaded/single page read throughput reaches only 30MB-50MB/sec and thus it will take time to read 5GB or 10GB or more.
[24 Sep 2024 14:19]
MySQL Verification Team
Hi, No, that is not the case here. What we have is InnoDB configured with 100 % ACID operation, hence there is very little work to be done after it is killed.
[24 Sep 2024 15:30]
Shankar Iyer
> What we have is InnoDB configured with 100 % ACID operation, hence there is very little work to be done after it is killed. Interesting. If there were no active, uncommitted transactions AND there were no dirty pages after the last checkpoint, MySQL will certainly have very little work to do for undo recovery and redo recovery. Can you quickly check if you saw messages like '1 transaction(s) which must be rolled back' in your log? The below commit added progress reporting for undo log scan, 1 message per 30 seconds. I am guessing that was done because customers reported MySQL spending time in recovery without indicating what is going on. The patch in this current ticket is precisely for the same situation and reduces downtime. https://github.com/mysql/mysql-server/commit/6337405f71b59ffbd7ef234ebb04ce4a641c8fbb
[24 Sep 2024 16:02]
MySQL Verification Team
Hi, Yes, we saw a couple of those messages. But, all data and logs were fully synced on the disk and undo logs were applied very fast, due to the way we configure numbers and sizes of redo and undo logs. Some of us also use separate filesystems especially for those logs. Some even use different type of the filesystem. There are many details on how to optimise the application of the undo logs, which we can not share.