Bug #40118 Crash when running Batched Key Access and requiring one match for each key
Submitted: 17 Oct 2008 17:45 Modified: 22 Nov 2010 1:30
Reporter: Sergey Petrunya Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:mysql-6.0-bka-preview OS:Any
Assigned to: Sergey Petrunya CPU Architecture:Any

[17 Oct 2008 17:45] Sergey Petrunya
Description:
A query that uses LEFT JOIN, "Not exists" optimization, and batched key access repeatably crashes the server.

How to repeat:
Run this:

create table t0(a int);
insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (a int, key(a));
insert into t1 select * from t0;
alter table t1 add b int not null, add filler char(200);

set join_cache_level=8;
explain select * from t0 left join t1 on t0.a=t1.a where t1.b is null; 

And see it crash like this:
  Program received signal SIGSEGV, Segmentation fault.
  0x083ac09a in bka_unique_range_seq_skip_record (rseq=0xa5a5a5a5, range_info=0x9685e30 "", rowid=0x9685e9e "") at sql_select.cc:19802
(gdb) wher
  #0  0x083ac09a in bka_unique_range_seq_skip_record (rseq=0xa5a5a5a5, range_info=0x9685e30 "", rowid=0x9685e9e "") at sql_select.cc:19802
  #1  0x0847174b in DsMrr_impl::dsmrr_next (this=0x968266c, h=0x9682518, range_info=0xaf64f868) at handler.cc:4522
  #2  0x08590b13 in ha_myisam::multi_range_read_next (this=0x9682518, range_info=0xaf64f868) at ha_myisam.cc:2055
  #3  0x083b7f8c in JOIN_CACHE_BKA_UNIQUE::join_matching_records (this=0x96856d0, skip_last=false) at sql_select.cc:19865
  #4  0x083aa752 in JOIN_CACHE::join_records (this=0x96856d0, skip_last=false) at sql_select.cc:18448
  #5  0x083b8bff in sub_select_cache (join=0x9683e58, join_tab=0x967c5f4, end_of_records=true) at sql_select.cc:13738
  #6  0x083b8915 in sub_select (join=0x9683e58, join_tab=0x967c450, end_of_records=true) at sql_select.cc:13897
  #7  0x083bfc05 in do_select (join=0x9683e58, fields=0xaeb44894, table=0x0, procedure=0x0) at sql_select.cc:13635
  #8  0x083dba82 in JOIN::exec (this=0x9683e58) at sql_select.cc:2834
  #9  0x083d6775 in mysql_select (thd=0xaeb432d0, rref_pointer_array=0xaeb44904, tables=0x967ab10, wild_num=1, fields=@0xaeb44894, conds=0x967b930, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x967ba18, unit=0xaeb44568, select_lex=0xaeb44800) at sql_select.cc:3024
  #10 0x083dbd92 in handle_select (thd=0xaeb432d0, lex=0xaeb4450c, result=0x967ba18, setup_tables_done_option=0) at sql_select.cc:298
  #11 0x083470b9 in execute_sqlcom_select (thd=0xaeb432d0, all_tables=0x967ab10) at sql_parse.cc:4611
  #12 0x08348118 in mysql_execute_command (thd=0xaeb432d0) at sql_parse.cc:2037
  #13 0x08350397 in mysql_parse (thd=0xaeb432d0, inBuf=0x967a988 "select * from t0 left join t1 on t0.a=t1.a where t1.b is null", length=61, found_semicolon=0xaf650e70) at sql_parse.cc:5587
  #14 0x083513a3 in dispatch_command (command=COM_QUERY, thd=0xaeb432d0, packet=0xaeb91da9 "", packet_length=61) at sql_parse.cc:1002
  #15 0x083526e6 in do_command (thd=0xaeb432d0) at sql_parse.cc:689
  #16 0x0833fa30 in handle_one_connection (arg=0xaeb432d0) at sql_connect.cc:1153
  #17 0xb7f9d18b in start_thread () from /lib/libpthread.so.0
[17 Oct 2008 19:39] Sergey Petrunya
The crash is in DS-MRR code and is fixed by this change:

=== modified file 'sql/handler.cc'
--- sql/handler.cc
+++ sql/handler.cc
@@ -4519,8 +4519,8 @@ int DsMrr_impl::dsmrr_next(handler *h, c
       cur_range_info= *(uchar**)(rowids_buf_cur + h->ref_length);
 
     rowids_buf_cur += h->ref_length + sizeof(void*) * test(is_mrr_assoc);
-    if (h->mrr_funcs.skip_record &&
-       h->mrr_funcs.skip_record(h->mrr_iter, (char *) cur_range_info, rowid))
+    if (h2->mrr_funcs.skip_record &&
+       h2->mrr_funcs.skip_record(h2->mrr_iter, (char *) cur_range_info, rowid))
       continue;
     res= h->rnd_pos(table->record[0], rowid);
     break;
[17 Oct 2008 19:45] Sergey Petrunya
However, once the crash is fixed, the query produces wrong results. Here's a piece of .result file:

+create table t0(a int);
+insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
+create table t1 (a int, b int not null, c char(100), key(a));
+insert into t1 select a,0,a from t0;
+select * from t0 left join t1 on t0.a=t1.a where t1.b is null and (c is null or length(c)>2);
+a      a       b       c
+0      0       0       0
+1      1       0       1
+2      2       0       2
+3      3       0       3
+4      4       0       4
+5      5       0       5
+6      6       0       6
+7      7       0       7
+8      8       0       8
+9      9       0       9
+set join_cache_level=default;
+select * from t0 left join t1 on t0.a=t1.a where t1.b is null and (c is null or length(c)>2);
+a      a       b       c
+drop table t0, t1;

The difference is that records are not filtered out. The condition that is attached to the second table is never checked.
[17 Oct 2008 19:54] Sergey Petrunya
Experiments in the debugger show that it is not checked because it is wrapped into a trigger which is turned off:

(gdb) wher
  #0  Item_func_trig_cond::val_int (this=0x9c4c530) at item_cmpfunc.h:416
  #1  0x083dc31e in SQL_SELECT::skip_record (this=0x9c4c420) at opt_range.h:735
  #2  0x083dd7cc in JOIN_CACHE::check_match (this=0x9c4c5b0, rec_ptr=0x9c4e111 "") at sql_select.cc:18774
  #3  0x083b7bcd in JOIN_CACHE::generate_full_extensions (this=0x9c4c5b0, rec_ptr=0x9c4e111 "") at sql_select.cc:18724
  #4  0x083b8159 in JOIN_CACHE_BKA::join_matching_records (this=0x9c4c5b0, skip_last=false) at sql_select.cc:19106
  #5  0x083aa752 in JOIN_CACHE::join_records (this=0x9c4c5b0, skip_last=false) at sql_select.cc:18448
  #6  0x083b8bff in sub_select_cache (join=0x9c4a150, join_tab=0x9c4bcd4, end_of_records=true) at sql_select.cc:13738
  #7  0x083b8915 in sub_select (join=0x9c4a150, join_tab=0x9c4bb30, end_of_records=true) at sql_select.cc:13897
  #8  0x083bfc05 in do_select (join=0x9c4a150, fields=0x9bed10c, table=0x0, procedure=0x0) at sql_select.cc:13635
  #9  0x083dba82 in JOIN::exec (this=0x9c4a150) at sql_select.cc:2834
  #10 0x083d6775 in mysql_select (thd=0x9bebb48, rref_pointer_array=0x9bed17c, tables=0x9c43258, wild_num=1, fields=@0x9bed10c, conds=0x9c44578, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x9c44690, unit=0x9becde0, select_lex=0x9bed078) at sql_select.cc:3024
  #11 0x083dbd92 in handle_select (thd=0x9bebb48, lex=0x9becd84, result=0x9c44690, setup_tables_done_option=0) at sql_select.cc:298
  #12 0x083470b9 in execute_sqlcom_select (thd=0x9bebb48, all_tables=0x9c43258) at sql_parse.cc:4611
  #13 0x08348118 in mysql_execute_command (thd=0x9bebb48) at sql_parse.cc:2037
  #14 0x08350397 in mysql_parse (thd=0x9bebb48, inBuf=0x9c43090 "select * from t0 left join t1 on t0.a=t1.a where t1.b is null and (c is null or length(c)>2)", length=92, found_semicolon=0xaa4eae70) at sql_parse.cc:5587

(gdb) p *trig_var
  $26 = false
(gdb) p this->args[0]
  $27 = (Item_cond_and *) 0x9c4c2c0
(gdb) p (Item*)(this->args[0]->list.first->info)
  $29 = (Item_func_isnull *) 0x9c44078
(gdb) p (Item*)(this->args[0]->list.first->next->info)
  $30 = (Item_cond_or *) 0x9c4c370

So the condition is there, it is just the trigger is in incorrect state.
[18 Oct 2008 16:41] Sergey Petrunya
The problem actually occurs whenever one tries to execute (EXPLAINs are not affected) a query that uses Batched Key Access and requires one match for each key.  At this moment one match is required by 
1) LEFT JOIN and "Not exists" optimization, and 2) Subuqery and FirstMatch optimization.

Here's a testcase for subqueries: 

insert into t1 select * from t1;
insert into t1 select * from t1;
mysql> explain select * from t0 where t0.a in (select t1.a from t1 where t1.b=0); 
+----+-------------+-------+------+---------------+------+---------+---------+------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref     | rows | Extra                                          |
+----+-------------+-------+------+---------------+------+---------+---------+------+------------------------------------------------+
|  1 | PRIMARY     | t0    | ALL  | NULL          | NULL | NULL    | NULL    |   10 |                                                | 
|  1 | PRIMARY     | t1    | ref  | a             | a    | 5       | j7.t0.a |    4 | Using where; FirstMatch(t0); Using join buffer | 
+----+-------------+-------+------+---------------+------+---------+---------+------+------------------------------------------------+
2 rows in set (0.01 sec)

mysql> select * from t0 where t0.a in (select t1.a from t1 where t1.b=0); 
ERROR 2013 (HY000): Lost connection to MySQL server during query
[18 Oct 2008 16:43] Sergey Petrunya
Changed synopsis to better reflect the bug nature.
[18 Oct 2008 16:44] Sergey Petrunya
The demonstrated problem of wrong query results with outer join is a separate problem which will be addressed outside the scope of this bug.
[18 Oct 2008 16:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56518

2656 Sergey Petrunia	2008-10-16
      BUG#40118: Crash when running Batched Key Access and requiring one match for each key
      - Make DsMrr_impl::dsmrr_next use h2->mrr_funcs, not h->mrr_funcs. This is because 
        DS-MRR uses h2 to scan the index (and hence h2->mrr_funcs is properly initialized)
        while h is used to make rnd_pos() calls (and so h->mrr_funcs is not defined)
[18 Oct 2008 17:19] Sergey Petrunya
Pushed into mysql-6.0-bka-preview tree.
[19 Oct 2008 11:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56539

2658 Sergey Petrunia	2008-10-16
      Fix testcase for BUG#40118
[14 Dec 2008 11:07] Bugs System
Pushed into 6.0.8-alpha  (revid:sergefp@mysql.com-20081016043109-5q83ynpcoe7nuuwl) (version source revid:sergefp@mysql.com-20081016043109-5q83ynpcoe7nuuwl) (pib:5)
[16 Aug 2010 6:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:08] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 1:30] Paul DuBois
Bug is not in any released 5.6.x version. No changelog entry needed.