Bug #113812 | alter table tablename engine=innodb cause data loss | ||
---|---|---|---|
Submitted: | 31 Jan 2:06 | Modified: | 5 Feb 9:12 |
Reporter: | chengkang Liao | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: DDL | Severity: | S2 (Serious) |
Version: | 8.0.33, 8.0.36 | OS: | Linux |
Assigned to: | CPU Architecture: | x86 | |
Tags: | 8.0.33, ALTER TABLE tablename engine=innodb |
[31 Jan 2:06]
chengkang Liao
[31 Jan 2:09]
chengkang Liao
my.cnf and Numbering tool
Attachment: mysql_random_data_load_0.1.12_Linux_x86_64.tar.gz (application/x-gzip, text), 3.18 MiB.
[31 Jan 5:17]
LeYuan Zhong
I have repeatted the bug . my test log as follow: count before delete data +------+ | 2192 | +------+ delete data... count after delete table 1209 alter table... count after alter table 1208 --wrong
[31 Jan 9:43]
chengkang Liao
Test 8.0.26 found no data loss. Test 8.0.27 is missing. Check the update of change log about innodb_ddl_buffer_size. Adjusting the time when innodb_ddl_buffer_size appears to be 5M will delay the occurrence of lost data.
[5 Feb 9:12]
MySQL Verification Team
Hello chengkang Liao, Thank you for the report and test case. Verified as described. regards, Umesh
[25 Apr 17:59]
LeYuan Zhong
When user call "alter table tablename engine=innodb", MySQL will rebuild the table online. MySQL scan the original table, add record to a key_buffer. when key_buffer is full, page latch will be release and use PCursor to store cursor. PCursor::savepoint() call m_pcur->move_to_prev_on_page() before store position to aviod storing "supremum". PCursor::resume() call m_pcur->move_to_next_on_page() after restore position. PCursor::resume() use restore_position() to restore from persistent cursor. When record store in persistent cursor has been purge, restore_position() will set cursor to a previous record. On this situation, PCursor::resume() will call page_cur_move_to_next() to move cursor to its right position. So, PCursor::resume() may move cursor to next record twice and it cause cursor skip a record. Here is a example: (.) a (X) b (.) c <--cursor (.) d (.) e 1. Cursor point to record 'c' and copy 'c' to builder and builder try to add it to key_buffer, but key_buffer is full. Start bulk insert. 2. To release page latch, PCursor::savepoint() will be called. Cursor move to record 'b' and store 'b' to persistent cursor. 3. During bulk insert, purge thread remove record 'b'. 4. Bulk insert is done, go on record scan. PCursor::resume() will be called. 5. First, PCursor::resume() calls restore_position(). restore_position() makes cursor restore to record 'a', then move to record 'c'. 6. Second, PCursor::resume() calls m_pcur->move_to_next_on_page(). Cursor move to recode 'd'. 7. Builder contiunes to add record 'c' to key_buffer. after that, cursor move to next record, which is record 'e'. 8. Record 'd' is skipped
[25 Apr 18:14]
LeYuan Zhong
PCursor::savepoint() use m_pcur->store_position() to store cursor. m_pcur->store_position() will store previous record when cursor pointing to "supremum". so, it is no need to call m_pcur->move_to_prev_on_page() in m_pcur->move_to_prev_on_page(). Thus, calling m_pcur->move_to_next_on_page() in PCursor::resume() is unnecessary too.
[25 Apr 18:24]
LeYuan Zhong
When record store in persistent cursor has been purge, restore_position() will set cursor to a previous record. On this situation, PCursor::resume() will call page_cur_move_to_next() to move cursor to its right position. "PCursor::resume() will call" should be "restore_position() will call"
[25 Apr 18:41]
LeYuan Zhong
Fix bug #113812 alter table tablename engine=innodb cause data loss (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: fix_bug_#113812.txt (text/plain), 796 bytes.
[26 Apr 19:27]
OCA Admin
Contribution submitted via Github - Fix bug #113812 alter table tablename engine=innodb cause data loss (*) Contribution by LeYuan Zhong (Github dgzhongly, mysql-server/pull/540#issuecomment-2078405060): I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: git_patch_1841482866.txt (text/plain), 3.14 KiB.
[9 May 12:40]
LeYuan Zhong
How to reproduce: *In order to increase the peace in which this issue happens. Modify a/storage/innobase/ddl/ddl0builder.cc diff --git a/storage/innobase/ddl/ddl0builder.cc b/storage/innobase/ddl/ddl0builder.cc index 59edd0df340..b4d35703c63 100644 --- a/storage/innobase/ddl/ddl0builder.cc +++ b/storage/innobase/ddl/ddl0builder.cc @@ -1488,6 +1488,8 @@ dberr_t Builder::bulk_add_row(Cursor &cursor, Row &row, size_t thread_id, err = insert_direct(cursor, thread_id); + std::this_thread::sleep_for(std::chrono::seconds(3)); + key_buffer->clear(); if (err != DB_SUCCESS) { diff --git a/storage/innobase/srv/srv0srv.cc b/storage/innobase/srv/srv0srv.cc index ad616805904..6c12c8a15ef 100644 --- a/storage/innobase/srv/srv0srv.cc +++ b/storage/innobase/srv/srv0srv.cc @@ -3117,6 +3117,8 @@ void srv_purge_coordinator_thread() { srv_purge_coordinator_suspend(slot, rseg_history_len); } + std::this_thread::sleep_for(std::chrono::seconds(1)); + if (srv_purge_should_exit(n_total_purged)) { ut_a(!slot->suspended); break; # create a table CREATE TABLE `test` ( `id` int NOT NULL AUTO_INCREMENT, `c1` varchar(1000) DEFAULT NULL, PRIMARY KEY (`id`) ); # create a procedure delimiter $$ CREATE PROCEDURE load_data(IN n int) BEGIN DECLARE i INT DEFAULT 1; WHILE (i <= n) DO insert into test (c1) values(repeat('a',1000)); SET i = i + 1; END WHILE; END $$ delimiter ; # insert data call load_data(1000); Reproduce: #make sure innodb_ddl_buffer_size is defautl value set session innodb_ddl_buffer_size=1048576; #excute quickly select count(*) from test; delete from test where id=502; select count(*) from test; alter table test engine=innodb; select count(*) from test;
[13 Jul 12:28]
LeYuan Zhong
hello. I can still reproduce the problem on MySQL 9.0.0 When will it be fixed?