Bug #69141 SELECT from InnoDB table hangs being run in parallel with replicating LOAD DATA
Submitted: 3 May 2013 22:56 Modified: 19 Sep 2013 18:23
Reporter: Elena Stepanova Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.6-debug, 5.7-debug OS:Linux (Ubuntu 12.04 64-bit)
Assigned to: Annamalai Gurusami CPU Architecture:Any

[3 May 2013 22:56] Elena Stepanova
Description:
The provided test case hangs on the next SELECT COUNT(*) after executing several of them reasonably fast (and returning 0). 
It stays in this state even when replicating of LOAD DATA ends:

+----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-------------------------+
| Id | User        | Host            | db   | Command | Time | State                                                                       | Info                    |
+----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-------------------------+
|  1 | root        | localhost:46708 | test | Sleep   |  708 |                                                                             | NULL                    |
|  2 | root        | localhost:46709 | test | Sleep   |  711 |                                                                             | NULL                    |
|  3 | system user |                 | NULL | Connect |  709 | Waiting for master to send event                                            | NULL                    |
|  4 | system user |                 | NULL | Connect |  705 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                    |
|  5 | root        | localhost:46715 | test | Query   |  408 | Sending data                                                                | SELECT COUNT(*) FROM t1 |
|  6 | root        | localhost:46716 | test | Sleep   |  708 |                                                                             | NULL                    |
|  7 | root        | localhost:46719 | test | Query   |    0 | init                                                                        | show processlist        |
+----+-------------+-----------------+------+---------+------+--------------------------------------------------------

Stack trace:

Thread 4 (Thread 0x7f831b18d700 (LWP 3635)):
#0  0x0000000000bee80c in dyn_block_get_used (block=0x7f82e0041848) at mysql-5.6/storage/innobase/include/dyn0dyn.ic:52
#1  0x0000000000beeacb in dyn_array_get_element (arr=0x7f831b189f38, pos=50528) at mysql-5.6/storage/innobase/include/dyn0dyn.ic:233
#2  0x0000000000bef3e4 in mtr_memo_contains (mtr=0x7f831b189f30, object=0x7f831db87600, type=2) at mysql-5.6/storage/innobase/include/mtr0mtr.ic:183
#3  0x0000000000bf0932 in mtr_memo_contains_page (mtr=0x7f831b189f30, ptr=0x7f831dcf7945 "", type=2) at mysql-5.6/storage/innobase/mtr/mtr0mtr.cc:415
#4  0x0000000000cb2a03 in row_vers_build_for_consistent_read (rec=0x7f831dcf7945 "", mtr=0x7f831b189f30, index=0x7f82ec040538, offsets=0x7f831b189e80, view=0x7f82e0035168, offset_heap=0x7f831b189e78, in_heap=0x7f82e00393f0, old_vers=0x7f831b189e88) at mysql-5.6/storage/innobase/row/row0vers.cc:507
#5  0x0000000000c98b1d in row_sel_build_prev_vers_for_mysql (read_view=0x7f82e0035168, clust_index=0x7f82ec040538, prebuilt=0x7f82e0038178, rec=0x7f831dcf7945 "", offsets=0x7f831b189e80, offset_heap=0x7f831b189e78, old_vers=0x7f831b189e88, mtr=0x7f831b189f30) at mysql-5.6/storage/innobase/row/row0sel.cc:3000
#6  0x0000000000c9bb76 in row_search_for_mysql (buf=0x7f82e0037d20 "\377", ' ' <repeats 128 times>, mode=1, prebuilt=0x7f82e0038178, match_mode=0, direction=0) at mysql-5.6/storage/innobase/row/row0sel.cc:4555
#7  0x0000000000b7c965 in ha_innobase::index_read (this=0x7f82e0036200, buf=0x7f82e0037d20 "\377", ' ' <repeats 128 times>, key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at mysql-5.6/storage/innobase/handler/ha_innodb.cc:7457
#8  0x0000000000b7d670 in ha_innobase::index_first (this=0x7f82e0036200, buf=0x7f82e0037d20 "\377", ' ' <repeats 128 times>) at mysql-5.6/storage/innobase/handler/ha_innodb.cc:7824
#9  0x0000000000b7d858 in ha_innobase::rnd_next (this=0x7f82e0036200, buf=0x7f82e0037d20 "\377", ' ' <repeats 128 times>) at mysql-5.6/storage/innobase/handler/ha_innodb.cc:7921
#10 0x0000000000637384 in handler::ha_rnd_next (this=0x7f82e0036200, buf=0x7f82e0037d20 "\377", ' ' <repeats 128 times>) at mysql-5.6/sql/handler.cc:2621
#11 0x000000000097733e in rr_sequential (info=0x7f82e0006308) at mysql-5.6/sql/records.cc:478
#12 0x00000000007a0541 in join_init_read_record (tab=0x7f82e0006278) at mysql-5.6/sql/sql_executor.cc:2392
#13 0x000000000079dd55 in sub_select (join=0x7f82e00055b8, join_tab=0x7f82e0006278, end_of_records=false) at mysql-5.6/sql/sql_executor.cc:1253
#14 0x000000000079d7c5 in do_select (join=0x7f82e00055b8) at mysql-5.6/sql/sql_executor.cc:930
#15 0x000000000079b7e2 in JOIN::exec (this=0x7f82e00055b8) at mysql-5.6/sql/sql_executor.cc:191
#16 0x00000000007fac7e in mysql_execute_select (thd=0x2886510, select_lex=0x2888dc8, free_join=true) at mysql-5.6/sql/sql_select.cc:1100
#17 0x00000000007faf7b in mysql_select (thd=0x2886510, tables=0x7f82e0004f78, wild_num=0, fields=..., conds=0x0, order=0x2888f90, group=0x2888ec8, having=0x0, select_options=2147748608, result=0x7f82e0005590, unit=0x2888780, select_lex=0x2888dc8) at mysql-5.6/sql/sql_select.cc:1221
#18 0x00000000007f8fd2 in handle_select (thd=0x2886510, result=0x7f82e0005590, setup_tables_done_option=0) at mysql-5.6/sql/sql_select.cc:110
#19 0x00000000007d3d51 in execute_sqlcom_select (thd=0x2886510, all_tables=0x7f82e0004f78) at mysql-5.6/sql/sql_parse.cc:5021
#20 0x00000000007cc997 in mysql_execute_command (thd=0x2886510) at mysql-5.6/sql/sql_parse.cc:2591
#21 0x00000000007d62d7 in mysql_parse (thd=0x2886510, rawbuf=0x7f82e0004c90 "SELECT COUNT(*) FROM t1", length=23, parser_state=0x7f831b18c100) at mysql-5.6/sql/sql_parse.cc:6161
#22 0x00000000007c9b64 in dispatch_command (command=COM_QUERY, thd=0x2886510, packet=0x2889c21 "", packet_length=23) at mysql-5.6/sql/sql_parse.cc:1336
#23 0x00000000007c8c6c in do_command (thd=0x2886510) at mysql-5.6/sql/sql_parse.cc:1038
#24 0x0000000000791153 in do_handle_one_connection (thd_arg=0x2886510) at mysql-5.6/sql/sql_connect.cc:977
#25 0x0000000000790b57 in handle_one_connection (arg=0x2886510) at mysql-5.6/sql/sql_connect.cc:893
#26 0x0000000000ad7306 in pfs_spawn_thread (arg=0x27b3b70) at mysql-5.6/storage/perfschema/pfs.cc:1853
#27 0x00007f8327ab6e9a in start_thread (arg=0x7f831b18d700) at pthread_create.c:308
#28 0x00007f8326fd1cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#29 0x0000000000000000 in ?? ()

How to repeat:
--source include/master-slave.inc
--source include/have_innodb.inc

--connection master

--perl
open (LOAD, ">$ENV{MYSQLTEST_VARDIR}/1.load") || die "Could not create load file\n";
foreach (1..1000000) {
	print LOAD "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n";
}
close (LOAD);
EOF

CREATE TABLE t1 (c CHAR(128)) ENGINE=InnoDB;
eval LOAD DATA LOCAL INFILE '$MYSQLTEST_VARDIR/1.load' INTO TABLE t1;

--connection slave

--let $run = 100
while ($run)
{
	SELECT COUNT(*) FROM t1;
	sleep 1;
	dec $run;
}
[4 May 2013 6:21] Shane Bester
Setting to verified.
The stack trace looks like http://bugs.mysql.com/bug.php?id=68069 !

In my test, the load data on master took 1min 20s.   Issuing a normal select count(*) takes ~16 seconds.  But when hitting this bug on slave, it still hangs, long after the load data finished as we see:

| Command | Time | State                                                                       | Info
+---------+------+-----------------------------------------------------------------------------+-------------------------
| Connect | 1050 | Waiting for master to send event                                            | NULL
| Connect |  912 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL
| Query   |  820 | Sending data                                                                | select count(*) from t1
| Query   |    0 | init                                                                        | show processlist
+---------+------+-----------------------------------------------------------------------------+-------------------------
[4 May 2013 6:21] Shane Bester
5.6.11-debug stack

Attachment: bug69141_stack.txt (text/plain), 4.55 KiB.

[4 May 2013 7:17] Shane Bester
alas, it is still running:

Connect | 4322 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL
Query   | 4230 | Sending data                                                                | select count(*) from t1
Query   |    0 | init                                                                        | show processlist
[18 Sep 2013 19:34] Bugs System
Noted in 5.6.15, 5.7.3 changelog.

A "SELECT COUNT (*)" query would run very slowly when run concurrently
with a "LOAD DATA" operation.
[19 Sep 2013 18:23] Elena Stepanova
Hi, just wondering -- does "very slowly" mean that somebody actually saw it finish, or is it just a nice alias for hanging?

(It might be in private comments, but mind you, we don't see those).
[20 Sep 2013 4:14] Annamalai Gurusami
Hi Elena,

I am the one who worked on this bug.  Yes, the query was not hanging.  It completed in 1 hour 16 min 36.10 sec!  The root cause was a O(n2) loop in mtr_memo_contains(), which is now changed to O(n) loop.

Rgds,
anna
[4 Dec 2013 11:35] Laurynas Biveinis
5.6$ bzr log -r 5457
------------------------------------------------------------
revno: 5457
committer: Annamalai Gurusami <annamalai.gurusami@oracle.com>
branch nick: mysql-5.6
timestamp: Tue 2013-09-17 17:23:58 +0530
message:
  Bug #16764240 SELECT FROM INNODB TABLE HANGS BEING RUN IN PARALLEL
  WITH REPLICATING LOAD DATA
  
  Problem:
  
  The mtr_memo_contains() function determines if the given object is part
  of the memo of the mini transaction.  In the test scenario, this search
  is very slow (takes more than 1 hour).   This is because the search in
  mtr_memo_contains() involves a nested loop.
  
  Solution:
  
  Re-write mtr_memo_contains() so that the search involves only one loop.
  This is already done by Yasufumi as part of wl#6363.  Backporting the
  functions mtr_memo_contains() and mtr_memo_note_modifications() from
  mysql-5.7 to mysql-5.6 is sufficient. 
  
  rb#3342 approved by Yasufumi
[16 Dec 2013 15:45] Daniel Price
Bug text has been revised for 5.6.15, 5.7.3 changelog entries:

"A "SELECT COUNT(*)" query would take a long time to complete when run
concurrently with a "LOAD DATA" operation. The "mtr_memo_contains"
function, which determines if an object is part of a memo in a mini
transaction, contained a nested loop that caused the query to run slowly."