Bug #52298 Thread waiting for scan results wakes up unecessarily
Submitted: 23 Mar 2010 9:45 Modified: 9 Feb 2011 20:40
Reporter: Jan Wedvik Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S5 (Performance)
Version:mysql-5.1-telco-7.0 OS:Any
Assigned to: Jonas Oreland CPU Architecture:Any
Tags: mysql-5.1-telco-7.0 / rev 3467

[23 Mar 2010 9:45] Jan Wedvik
Description:
When an ndbapi client is wating for more scan results (by calling 
NdbScanOperation::nextResult()), the calling thread will sometimes wake up even
if no new batches for any fragment have arrived. NdbScanOperation::nextResultNdbRecord() will correctly detect this and call 
PollGuard::wait_scan() to wait longer. But this is still an unneeded context
switch, which may lower performance.

This happens if a transaction has several active scans at the same time. For
example, the mysqld will execute a nested loop join by reading the first
tuple from scan 1, then submit scan 2, read the first row from scan 2, issue
scan 3 etc.

How to repeat:
Apply the patch below (to e.g. bk-internal.mysql.com/bzrroot/server/mysql-5.1-telco-7.0 revno 3467):

=== modified file 'storage/ndb/src/ndbapi/NdbScanOperation.cpp'
--- storage/ndb/src/ndbapi/NdbScanOperation.cpp 2010-01-28 15:16:46 +0000
+++ storage/ndb/src/ndbapi/NdbScanOperation.cpp 2010-03-23 08:43:11 +0000
@@ -1880,9 +1880,11 @@ NdbScanOperation::nextResultNdbRecord(co
       }
       else if (retVal == 2 && sent > 0)
       {
+        Uint32 old_sent = m_sent_receivers_count;
         /* No completed... */
         int ret_code= poll_guard.wait_scan(3*timeout, nodeId, forceSend);
         if (ret_code == 0 && seq == tp->getNodeSequence(nodeId)) {
+          assert(m_sent_receivers_count < old_sent);
           continue;
         } else if(ret_code == -1){
           retVal= -1;
---------------------------------------------------------------------

This patch cause an assert failure if the client thread wakes up when there 
are no new results (or end of fragments).

Compile the code and start a cluster with at least two data nodes.

Then run the attached spur.sql script. This should give a stack trace similar
to the one seen below.

#0  0x00002b4c6502e3ec in pthread_kill () from /lib/libpthread.so.0
#1  0x0000000000b9463e in my_write_core (sig=6) at stacktrace.c:332
#2  0x00000000006e469e in handle_segfault (sig=6) at mysqld.cc:2614
#3  <signal handler called>
#4  0x00002b4c65f6d4b5 in raise () from /lib/libc.so.6
#5  0x00002b4c65f70f50 in abort () from /lib/libc.so.6
#6  0x00002b4c65f66481 in __assert_fail () from /lib/libc.so.6
#7  0x0000000000b32e44 in NdbScanOperation::nextResultNdbRecord (this=0x30308c0, out_row=@0x2fec428, fetchAllowed=true, forceSend=true) at NdbScanOperation.cpp:1887
#8  0x0000000000b32902 in NdbScanOperation::nextResult (this=0x30308c0, out_row_ptr=0x2fec428, fetchAllowed=true, forceSend=true) at NdbScanOperation.cpp:1761
#9  0x000000000096831c in ha_ndbcluster::fetch_next(NdbScanOperation*) ()
#10 0x0000000000968579 in ha_ndbcluster::next_result(unsigned char*) ()
#11 0x000000000093f989 in ha_ndbcluster::ordered_index_scan (this=0x2fea9c8, start_key=0x2b4c69770a30, end_key=0x0, sorted=false, descending=false, buf=0x3007158 "\377\016", part_spec=0x0) at ha_ndbcluster.cc:3049
#12 0x000000000094589f in ha_ndbcluster::read_range_first_to_buf (this=0x2fea9c8, start_key=0x2b4c69770a30, end_key=0x0, desc=false, sorted=false, buf=0x3007158 "\377\016") at ha_ndbcluster.cc:4922
#13 0x0000000000944e44 in ha_ndbcluster::index_read (this=0x2fea9c8, buf=0x3007158 "\377\016", key=0x3009eb8 "'", key_len=4, find_flag=HA_READ_KEY_EXACT) at ha_ndbcluster.cc:4783
#14 0x000000000084aea8 in handler::index_read_map (this=0x2fea9c8, buf=0x3007158 "\377\016", key=0x3009eb8 "'", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at handler.h:1464
#15 0x0000000000785f3a in join_read_always_key (tab=0x301ae10) at sql_select.cc:11860
#16 0x0000000000784831 in sub_select (join=0x3015a38, join_tab=0x301ae10, end_of_records=false) at sql_select.cc:11262
#17 0x0000000000784bd4 in evaluate_join_record (join=0x3015a38, join_tab=0x301abb8, error=0) at sql_select.cc:11389
#18 0x0000000000784879 in sub_select (join=0x3015a38, join_tab=0x301abb8, end_of_records=false) at sql_select.cc:11269
#19 0x0000000000784383 in do_select (join=0x3015a38, fields=0x30170a8, table=0x0, procedure=0x0) at sql_select.cc:11019
#20 0x000000000076d44d in JOIN::exec (this=0x3015a38) at sql_select.cc:2273
#21 0x000000000076dbd8 in mysql_select (thd=0x300e6a8, rref_pointer_array=0x3010718, tables=0x30081f0, wild_num=0, fields=..., conds=0x30096b8, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x3009878, unit=0x3010120, select_lex=0x3010548) at sql_select.cc:2462
#22 0x0000000000765cee in handle_select (thd=0x300e6a8, lex=0x3010080, result=0x3009878, setup_tables_done_option=0) at sql_select.cc:272
#23 0x0000000000700b86 in execute_sqlcom_select (thd=0x300e6a8, all_tables=0x30081f0) at sql_parse.cc:5058
#24 0x00000000006f7b50 in mysql_execute_command (thd=0x300e6a8) at sql_parse.cc:2251
#25 0x0000000000703050 in mysql_parse (thd=0x300e6a8, inBuf=0x3007ed8 "select count(*) from T1 t1, T1 t2, T1 t3, T1 t4\nwhere t1.b = t2.a and t3.b = t4.a", length=81, found_semicolon=0x2b4c69772bb0) at sql_parse.cc:5977
#26 0x00000000006f53c5 in dispatch_command (command=COM_QUERY, thd=0x300e6a8, packet=0x3031a39 "", packet_length=81) at sql_parse.cc:1236
#27 0x00000000006f43a1 in do_command (thd=0x300e6a8) at sql_parse.cc:877
#28 0x00000000006f26a3 in handle_one_connection (arg=0x300e6a8) at sql_connect.cc:1133
#29 0x00002b4c65028a04 in start_thread () from /lib/libpthread.so.0
#30 0x00002b4c6601980d in clone () from /lib/libc.so.6
#31 0x0000000000000000 in ?? ()

Suggested fix:
The transaction should know which scan operation the client thread is waiting for, and only wake up the thread when a new batch (or end of data) arrives
for that operation.
[23 Mar 2010 9:46] Jan Wedvik
ndb_error_reporter output

Attachment: ndb_error_report_20100323100231.tar.bz2 (application/x-bzip, text), 20.42 KiB.

[23 Mar 2010 9:47] Jan Wedvik
SQL script triggering the assert

Attachment: spur.sql (text/x-sql), 3.62 KiB.

[9 Feb 2011 9:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130812

4185 Jonas Oreland	2011-02-09
      ndb - bug#52298 - remove extra unneccesary (but harmless) wakeups for scan
[9 Feb 2011 9:52] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.22 (revid:jonas@mysql.com-20110209094634-32ueik78uuqqngb2) (version source revid:jonas@mysql.com-20110209094634-32ueik78uuqqngb2) (merge vers: 5.1.51-ndb-7.0.22) (pib:24)
[9 Feb 2011 10:34] Jonas Oreland
pushed to 7.0.22 and 7.1.11
[9 Feb 2011 20:40] Jon Stephens
Documented as follows in the NDB-7.0.22 and 7.1.11 changelogs:

      When an NDBAPI client application was waiting for more scan 
      results after calling NdbScanOperation::nextResult(), the calling 
      thread sometimes woke up even if no new batches for any fragment 
      had arrived, which was unnecessary, and could have a negative 
      impact on the application's performance.

Closed.