Description:
When an ndbapi client is wating for more scan results (by calling
NdbScanOperation::nextResult()), the calling thread will sometimes wake up even
if no new batches for any fragment have arrived. NdbScanOperation::nextResultNdbRecord() will correctly detect this and call
PollGuard::wait_scan() to wait longer. But this is still an unneeded context
switch, which may lower performance.
This happens if a transaction has several active scans at the same time. For
example, the mysqld will execute a nested loop join by reading the first
tuple from scan 1, then submit scan 2, read the first row from scan 2, issue
scan 3 etc.
How to repeat:
Apply the patch below (to e.g. bk-internal.mysql.com/bzrroot/server/mysql-5.1-telco-7.0 revno 3467):
=== modified file 'storage/ndb/src/ndbapi/NdbScanOperation.cpp'
--- storage/ndb/src/ndbapi/NdbScanOperation.cpp 2010-01-28 15:16:46 +0000
+++ storage/ndb/src/ndbapi/NdbScanOperation.cpp 2010-03-23 08:43:11 +0000
@@ -1880,9 +1880,11 @@ NdbScanOperation::nextResultNdbRecord(co
}
else if (retVal == 2 && sent > 0)
{
+ Uint32 old_sent = m_sent_receivers_count;
/* No completed... */
int ret_code= poll_guard.wait_scan(3*timeout, nodeId, forceSend);
if (ret_code == 0 && seq == tp->getNodeSequence(nodeId)) {
+ assert(m_sent_receivers_count < old_sent);
continue;
} else if(ret_code == -1){
retVal= -1;
---------------------------------------------------------------------
This patch cause an assert failure if the client thread wakes up when there
are no new results (or end of fragments).
Compile the code and start a cluster with at least two data nodes.
Then run the attached spur.sql script. This should give a stack trace similar
to the one seen below.
#0 0x00002b4c6502e3ec in pthread_kill () from /lib/libpthread.so.0
#1 0x0000000000b9463e in my_write_core (sig=6) at stacktrace.c:332
#2 0x00000000006e469e in handle_segfault (sig=6) at mysqld.cc:2614
#3 <signal handler called>
#4 0x00002b4c65f6d4b5 in raise () from /lib/libc.so.6
#5 0x00002b4c65f70f50 in abort () from /lib/libc.so.6
#6 0x00002b4c65f66481 in __assert_fail () from /lib/libc.so.6
#7 0x0000000000b32e44 in NdbScanOperation::nextResultNdbRecord (this=0x30308c0, out_row=@0x2fec428, fetchAllowed=true, forceSend=true) at NdbScanOperation.cpp:1887
#8 0x0000000000b32902 in NdbScanOperation::nextResult (this=0x30308c0, out_row_ptr=0x2fec428, fetchAllowed=true, forceSend=true) at NdbScanOperation.cpp:1761
#9 0x000000000096831c in ha_ndbcluster::fetch_next(NdbScanOperation*) ()
#10 0x0000000000968579 in ha_ndbcluster::next_result(unsigned char*) ()
#11 0x000000000093f989 in ha_ndbcluster::ordered_index_scan (this=0x2fea9c8, start_key=0x2b4c69770a30, end_key=0x0, sorted=false, descending=false, buf=0x3007158 "\377\016", part_spec=0x0) at ha_ndbcluster.cc:3049
#12 0x000000000094589f in ha_ndbcluster::read_range_first_to_buf (this=0x2fea9c8, start_key=0x2b4c69770a30, end_key=0x0, desc=false, sorted=false, buf=0x3007158 "\377\016") at ha_ndbcluster.cc:4922
#13 0x0000000000944e44 in ha_ndbcluster::index_read (this=0x2fea9c8, buf=0x3007158 "\377\016", key=0x3009eb8 "'", key_len=4, find_flag=HA_READ_KEY_EXACT) at ha_ndbcluster.cc:4783
#14 0x000000000084aea8 in handler::index_read_map (this=0x2fea9c8, buf=0x3007158 "\377\016", key=0x3009eb8 "'", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at handler.h:1464
#15 0x0000000000785f3a in join_read_always_key (tab=0x301ae10) at sql_select.cc:11860
#16 0x0000000000784831 in sub_select (join=0x3015a38, join_tab=0x301ae10, end_of_records=false) at sql_select.cc:11262
#17 0x0000000000784bd4 in evaluate_join_record (join=0x3015a38, join_tab=0x301abb8, error=0) at sql_select.cc:11389
#18 0x0000000000784879 in sub_select (join=0x3015a38, join_tab=0x301abb8, end_of_records=false) at sql_select.cc:11269
#19 0x0000000000784383 in do_select (join=0x3015a38, fields=0x30170a8, table=0x0, procedure=0x0) at sql_select.cc:11019
#20 0x000000000076d44d in JOIN::exec (this=0x3015a38) at sql_select.cc:2273
#21 0x000000000076dbd8 in mysql_select (thd=0x300e6a8, rref_pointer_array=0x3010718, tables=0x30081f0, wild_num=0, fields=..., conds=0x30096b8, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x3009878, unit=0x3010120, select_lex=0x3010548) at sql_select.cc:2462
#22 0x0000000000765cee in handle_select (thd=0x300e6a8, lex=0x3010080, result=0x3009878, setup_tables_done_option=0) at sql_select.cc:272
#23 0x0000000000700b86 in execute_sqlcom_select (thd=0x300e6a8, all_tables=0x30081f0) at sql_parse.cc:5058
#24 0x00000000006f7b50 in mysql_execute_command (thd=0x300e6a8) at sql_parse.cc:2251
#25 0x0000000000703050 in mysql_parse (thd=0x300e6a8, inBuf=0x3007ed8 "select count(*) from T1 t1, T1 t2, T1 t3, T1 t4\nwhere t1.b = t2.a and t3.b = t4.a", length=81, found_semicolon=0x2b4c69772bb0) at sql_parse.cc:5977
#26 0x00000000006f53c5 in dispatch_command (command=COM_QUERY, thd=0x300e6a8, packet=0x3031a39 "", packet_length=81) at sql_parse.cc:1236
#27 0x00000000006f43a1 in do_command (thd=0x300e6a8) at sql_parse.cc:877
#28 0x00000000006f26a3 in handle_one_connection (arg=0x300e6a8) at sql_connect.cc:1133
#29 0x00002b4c65028a04 in start_thread () from /lib/libpthread.so.0
#30 0x00002b4c6601980d in clone () from /lib/libc.so.6
#31 0x0000000000000000 in ?? ()
Suggested fix:
The transaction should know which scan operation the client thread is waiting for, and only wake up the thread when a new batch (or end of data) arrives
for that operation.