Bug #40954 Crash in MyISAM index code with concurrency test using partitioned tables
Submitted: 23 Nov 2008 10:33 Modified: 10 Dec 2008 22:30
Reporter: Guilhem Bichot
Status: Closed
Category:MySQL Server: Partitions Severity:S3 (Non-critical)
Version:5.1.29, 5.1.30 OS:Linux
Assigned to: Mattias Jonsson CPU Architecture:Any
Triage: Triaged: D1 (Critical)

[23 Nov 2008 10:33] Guilhem Bichot
Description:
Running a concurrency test (like those in pushbuild2) against partitioned MyISAM tables crashes 5.1-main.

Context: after a merge of 5.1-main into 5.1-maria, we start seeing crashes in storage/maria/ma_rkey.c for test maria_dml_alter.yy in pushbuild2 (see URL).
But, as I could repeat, the same crash happens with MyISAM in 5.1-main (see how-to-repeat). In pushbuild2 it showed up only in Maria because pushbuild2 does not run concurrency tests on old engines like MyISAM.

How to repeat:
Take latest 5.1-main, revision:
2699 Joerg Bruehe      2008-11-19 [merge]
      revision-id:joerg@mysql.com-20081119120434-vn0hmyrf545ytcka
      Merge the 5.0.72 build tag up into 5.1, no source change.

Build with BUILD/compile-pentium-valgrind-max

Pull latest mysql-test-extra-6.0 from our internal machine, and do
cd mysql-test/gentest
there, create a file named conf/myisam_dml_alter.yy (attached to this bug report) and a file named conf/myisam.zz (attached). Those two files are what pushbuild2 runs against Maria, but here tuned for MyISAM.
Then run:
./runall.pl --basedir=/m/bzrrepos/mysql-5.1 --engine=Myisam --grammar=conf/myisam_dml_alter.yy --queries=100000 --reporters=Deadlock --gendata=conf/myisam.zz                        
(replace /m/bzrrepos/mysql-5.1 by your 5.1-main directory).
It should crash within a few minutes at most. I got this stack trace:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1222669424 (LWP 19831)]
0x0859ca87 in _mi_pack_key (info=0x8ac6ad8, keynr=0, key=0x8ac7f76 "", 
    old=0x0, keypart_map=0, last_used_keyseg=0xb71f68dc) at mi_key.c:291
291             *key++ = *--pos;
Current language:  auto; currently c
(gdb) bt
#0  0x0859ca87 in _mi_pack_key (info=0x8ac6ad8, keynr=0, key=0x8ac7f76 "", 
    old=0x0, keypart_map=0, last_used_keyseg=0xb71f68dc) at mi_key.c:291
#1  0x08594a51 in mi_rkey (info=0x8ac6ad8, buf=0x8bb2c90 "ÿÿ", inx=0, key=0x0, 
    keypart_map=1, search_flag=HA_READ_AFTER_KEY) at mi_rkey.c:60
#2  0x085874b6 in ha_myisam::index_read_map (this=0x8b77100, 
    buf=0x8bb2c90 "ÿÿ", key=0x0, keypart_map=1, find_flag=HA_READ_AFTER_KEY)
    at ha_myisam.cc:1607
#3  0x083bd965 in handler::read_range_first (this=0x8b77100, 
    start_key=0x8b857c8, end_key=0x8b856c8, eq_range_arg=false, sorted=true)
    at handler.cc:4155
#4  0x083c96a5 in ha_partition::handle_ordered_index_scan (this=0x8b85638, 
    buf=0x8bb2c90 "ÿÿ", reverse_order=false) at ha_partition.cc:4459
#5  0x083c9f57 in ha_partition::common_index_read (this=0x8b85638, 
    buf=0x8bb2c90 "ÿÿ", have_start_key=false) at ha_partition.cc:3838
#6  0x083ca103 in ha_partition::read_range_first (this=0x8b85638, 
    start_key=0x0, end_key=0x8a98888, eq_range_arg=false, sorted=true)
    at ha_partition.cc:4094
#7  0x083bdb9c in handler::read_multi_range_first (this=0x8b85638, 
    found_range_p=0xb71f6b90, ranges=0x8a98878, range_count=1, sorted=true, 
    buffer=0x0) at handler.cc:4029
#8  0x0839bdf9 in QUICK_RANGE_SELECT::get_next (this=0x8ac8e68)
    at opt_range.cc:8444
#9  0x083b51db in rr_quick (info=0x8ad24b8) at records.cc:313
#10 0x08309396 in join_init_read_record (tab=0x8ad2478) at sql_select.cc:11738
#11 0x08308c4c in sub_select (join=0x8e01e48, join_tab=0x8ad2478, 
    end_of_records=false) at sql_select.cc:11066
#12 0x083110b5 in do_select (join=0x8e01e48, fields=0x8a8c83c, table=0x0, 
    procedure=0x0) at sql_select.cc:10823
#13 0x0832af6e in JOIN::exec (this=0x8e01e48) at sql_select.cc:2182
#14 0x08325d36 in mysql_select (thd=0x8a8b400, rref_pointer_array=0x8a8c8b0, 
    tables=0xb6e4a868, wild_num=1, fields=@0x8a8c83c, conds=0xb6e8f448, 
    og_num=1, order=0xb6e8f5f8, group=0x0, having=0x0, proc_param=0x0, 
    select_options=2148289024, result=0x8b4ab80, unit=0x8a8c524, 
    select_lex=0x8a8c7a4) at sql_select.cc:2361
#15 0x0832b2a1 in handle_select (thd=0x8a8b400, lex=0x8a8c4c8, 
    result=0x8b4ab80, setup_tables_done_option=0) at sql_select.cc:269
#16 0x0829e9df in execute_sqlcom_select (thd=0x8a8b400, all_tables=0xb6e4a868)
    at sql_parse.cc:4890
#17 0x082a0380 in mysql_execute_command (thd=0x8a8b400) at sql_parse.cc:2184
#18 0x082a909c in mysql_parse (thd=0x8a8b400, 
    inBuf=0xb6e9b3e0 "SELECT * FROM `table0_myisam_dynamic_key_pk_parts_2_int_autoinc` AS X WHERE X . `pk` < '2008-04-15 00:35:27' ORDER BY X . `pk` LIMIT 8", 
    length=134, found_semicolon=0xb71f826c) at sql_parse.cc:5789
#19 0x082a9c46 in dispatch_command (command=COM_QUERY, thd=0x8a8b400, 
    packet=0x8e74859 "", packet_length=134) at sql_parse.cc:1200
#20 0x082aadc2 in do_command (thd=0x8a8b400) at sql_parse.cc:857
#21 0x08297c14 in handle_one_connection (arg=0x8a8b400) at sql_connect.cc:1115
#22 0xb7f7f112 in start_thread () from /lib/libpthread.so.0
#23 0xb7e842ee in clone () from /lib/libc.so.6
[23 Nov 2008 10:38] Guilhem Bichot
Probably a recently introduced bug: the previous merge of 5.1main into 5.1maria was revision:
 2736 Georgi Kodinov    2008-09-10 [merge]
      revision-id:kgeorge@mysql.com-20080910095538-o1vuju66760i73up
      merged 5.0-bugteam to 5.1-bugteam
[24 Nov 2008 14:28] Mattias Jonsson
This was probably introduced when back porting the read_range implementation in partitioning (bug#37721).

This seems to fix it:
=== modified file 'sql/ha_partition.cc'
--- sql/ha_partition.cc	2008-11-10 20:13:24 +0000
+++ sql/ha_partition.cc	2008-11-24 14:25:33 +0000
@@ -4490,7 +4490,8 @@
         This can only read record to table->record[0], as it was set when
         the table was being opened. We have to memcpy data ourselves.
       */
-      error= file->read_range_first(&m_start_key, end_range, eq_range, TRUE);
+      error= file->read_range_first(m_start_key.key? &m_start_key: NULL,
+                                    end_range, eq_range, TRUE);
       memcpy(rec_buf_ptr, table->record[0], m_rec_length);
       reverse_order= FALSE;
       break;

I will try to find a simpler test case.
[24 Nov 2008 16:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/59696

2708 Mattias Jonsson	2008-11-24
      Bug#40954: Crash in MyISAM index code with concurrency test using partitioned tables
      
      Problem was usage of read_range_first with an empty key.
      
      Solution was not to give a key if it was empty.
[24 Nov 2008 18:19] Mattias Jonsson
The back port was not done in bug#37721, it was done for solving bug#30573/bug33555, and was a backport of the patch for bug#33257.
[24 Nov 2008 20:59] Omer Barnir
Did some more testing on the above (all the following done with RQG and the above yy and zz files):
 - No crashes with 5.1.28, crashes with 5.1.29
 - Limiting the scenarios to selects that only have pk in the where
   clause - still crashes
 - Limiting the scenarios to selects that only have pk with non-zero values
   clause - still crashes
 - If the where clause includes a different value that is not the pk - no crash
 - If no partitions - no crash
 - If the partitions are not based on the auto_increment pk - no crash
[24 Nov 2008 21:13] Mikael Ronström
The bug was that in implementing read_range_first in
the partition handler the partition handler used
two calls to read_range_first, one for ordered
scans and one for unordered scans. The unordered
scan was correctly implemented and the ordered scan
incorrectly.

The error was that "no start key" must be passed as
a NULL start key pointer, not as a key range with
start_key->key == NULL, which is what this call did.
Thus the storage engine API was broken in that call.
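
The contract described above can be sketched in a few lines of C++. This is a simplified model, not the server code: KeyRange, Action, engine_read_range_first and the two start_key_* helpers are hypothetical names made up for illustration. The only behaviour taken from this report is that the engine decides "no start key" from the pointer itself, and that a non-NULL range whose key member is NULL ends up being dereferenced (the SIGSEGV in _mi_pack_key).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, stripped-down stand-in for the server's key range type.
struct KeyRange {
  const unsigned char *key;  // NULL when no key bytes were set
  std::size_t length;
};

// What a storage engine would do with the start key it is handed.
enum class Action { ScanFromFirstRow, SeekToKey, DereferenceNull };

// Model of the engine side of the API: "no start key" is signalled by a
// NULL pointer. A non-NULL range is assumed to carry valid key bytes.
Action engine_read_range_first(const KeyRange *start_key) {
  if (start_key == nullptr) return Action::ScanFromFirstRow;
  if (start_key->key == nullptr) return Action::DereferenceNull;  // the crash
  return Action::SeekToKey;
}

// Caller before the fix: always passes the address of its key range.
const KeyRange *start_key_before_fix(const KeyRange &k) { return &k; }

// Caller after the fix: passes NULL when the key range is empty.
const KeyRange *start_key_after_fix(const KeyRange &k) {
  return k.key ? &k : nullptr;
}
```

With an empty range (key == NULL), the first helper leads the modelled engine into the DereferenceNull case, while the second yields a plain scan from the first row. A range with real key bytes is passed through unchanged by both.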

I see no particular risk involved in fixing it.
[24 Nov 2008 21:25] Omer Barnir
If the partitions are based on a pk that is not auto-incremented - still crashes
[24 Nov 2008 22:34] Mattias Jonsson
The bug affects the following statements for partitioned tables:
SELECT * FROM part_table WHERE a < const ORDER BY a
SELECT * FROM part_table WHERE const < a ORDER BY a desc

Where 'a' is an indexed column (does not need to be in the partitioning expression).

It must be an 'open' range and ordered index scan where it starts on the 'open' side.

It does not depend on partitioning type or expression.

The code that includes the bug has been in 6.0 since 6.0.5 (see bug#33257) and was backported to 5.1.29 (for bug#30573 / bug#33555).
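
The condition in this comment ("an 'open' range and ordered index scan where it starts on the 'open' side") can be written as a small predicate. This is only a sketch of the rule as stated here. OrderedRangeScan and starts_on_open_side are hypothetical names, not server types, and the model says nothing about which plans the optimizer actually chooses.

```cpp
#include <cassert>

// Hypothetical description of an ordered index range scan on column a.
struct OrderedRangeScan {
  bool has_lower_bound;  // e.g. WHERE const < a
  bool has_upper_bound;  // e.g. WHERE a < const
  bool descending;       // ORDER BY a DESC
};

// True when the scan is a range scan that begins at its unbounded end:
// ascending scans start at the lower end, descending scans at the upper end.
bool starts_on_open_side(const OrderedRangeScan &scan) {
  const bool is_range = scan.has_lower_bound || scan.has_upper_bound;
  const bool start_is_open =
      scan.descending ? !scan.has_upper_bound : !scan.has_lower_bound;
  return is_range && start_is_open;
}
```

Under this model, "WHERE a < const ORDER BY a" and "WHERE const < a ORDER BY a desc" both match, while a range bounded on the side where the scan starts does not.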
[24 Nov 2008 23:36] Mattias Jonsson
pushed into mysql-5.1-bugteam and mysql-6.0-bugteam
[25 Nov 2008 2:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/59746

2778 Build Team	2008-11-25
      mysql-test/r/partition.result
      mysql-test/t/partition.test
      sql/ha_partition.cc
        Bug#40954: Crash in MyISAM index code with concurrency test using partitioned tables
        Problem was usage of read_range_first with an empty key.
        Solution was not to give a key if it was empty. (real author Mattias Jonsson)
      
      storage/archive/archive_reader.c
      client/mysqlslap.c
        Aligned the copyright texts output from "--version" of tools, to
        let internal tools be able to change them if needed.
      
      storage/ndb/test/tools/connect.cpp
      storage/ndb/test/run-test/atrt.hpp
        Corrected a few GPL headers not restricted to GPL version 2
      
      Makefile.am
        Added missing --report-features to the 'test-bt-fast' target
      
      support-files/mysql.spec.sh
        Reversed the removal of the "%define license GPL" in as internal
        tools depended on it
[25 Nov 2008 2:25] Trudy Pelzer
SysQA testing of this bug reveals that the use case is more limited 
than first indicated; some of the difference may be that there are 
differences between 5.1.30 and the 5.1-main tree. Thus, though this 
is a regression introduced in 5.1.29, it is also a low-impact issue. 

Due to the extreme time-criticalness of 5.1.30, I don't believe
we have the time to *safely* patch 5.1.30/retest/rebuild and still
release by our due date. My decision (with which Jeffrey agrees) 
is that this is therefore not a showstopper for 5.1.30. Changing 
tag from 'trg_rfg' to 'sr51mru' but also setting as showstopper for 
5.1.31.
[25 Nov 2008 19:29] Omer Barnir
SysQA tests on 5.1.30 show the following:

- The crash is limited to queries like these:
SELECT * FROM part_table WHERE a < const ORDER BY a
and SELECT * FROM part_table WHERE a <> const ORDER BY a

That is, crash happens only if the WHERE clause references
an indexed column that is also the partition key, includes
a value found in the partition and uses the < or <> operators,
and if that same column is used in the ORDER BY clause with
an ASC sort priority.
*** If the where clause includes a value not found in the
primary key, there is no crash.
*** If the table is not partitioned, there is no crash.
*** If there is no ORDER BY, there is no crash.
*** If the ORDER BY sorts only by columns not referred to
in the WHERE clause, there is no crash.
*** If the operator in the WHERE clause is = or >,
there is no crash.
*** If the WHERE clause uses < or <> and ORDER BY is DESC, there
is no crash

A workaround for the above crashes with < and <> is to add 
   "AND a > [some small value]" to the query, as in:

SELECT * FROM part_table WHERE a < const AND a > -1000 ORDER BY a
or SELECT * FROM part_table WHERE a <> const AND a > -1000 ORDER BY a
[26 Nov 2008 12:28] Bugs System
Pushed into 5.1.30-ndb-6.2.17  (revid:bteam@astra04-20081125020458-m0us9vykrs08b9xm) (version source revid:tomas.ulin@sun.com-20081126122600-j192inmjrcoq1s2i) (pib:5)
[26 Nov 2008 12:59] Bugs System
Pushed into 5.1.30-ndb-6.3.20  (revid:bteam@astra04-20081125020458-m0us9vykrs08b9xm) (version source revid:tomas.ulin@sun.com-20081126125617-qdxm40iw9if7u559) (pib:5)
[26 Nov 2008 14:51] Jon Stephens
Documented in the NDB 6.2.17 and NDB 6.3.20 changelogs as follows:

-----

        A query on a partitioned table caused MySQL to crash, where the query 
        had the following characteristics: 

            · The query's WHERE clause referenced an indexed column 
            that was also in the partitioning key. 
            · The query's WHERE clause included a value found in the 
            partition. 
            · The query's WHERE clause used the < or <> operators to 
            make a comparison with the indexed column's value. 
            · The same indexed column was used in the query's ORDER 
            BY clause. 
            · The ORDER BY clause used an explicit or implicit ASC 
            sort priority. 

        Two examples of such a query are given here, where -a- 
        represents an indexed column used in the table's 
        partitioning key:

            1. SELECT * FROM table WHERE a < constant ORDER BY a;

            2. SELECT * FROM table WHERE a <> constant ORDER BY a;

[For NDB 6.2.17:] 
        This bug was introduced in MySQL Cluster NDB 6.2.16.

[For NDB 6.3.20:]
        This bug was introduced in MySQL Cluster NDB 6.3.19.

-----

Set bug status as NDI pending merges to NDB 6.4/mainline 5.1 and 6.0 trees.
[27 Nov 2008 14:52] Bugs System
Pushed into 5.1.30-ndb-6.4.0  (revid:bteam@astra04-20081125020458-m0us9vykrs08b9xm) (version source revid:tomas.ulin@sun.com-20081126125958-mz5j02g2jrri8en9) (pib:5)
[27 Nov 2008 19:37] Jon Stephens
Fix also documented in ndb-6.4.0 changelog (no changelog-specific note required this time). Returned to NDI status pending merges to mainline trees.
[8 Dec 2008 10:23] Bugs System
Pushed into 5.1.31  (revid:mattias.jonsson@sun.com-20081124162403-pocmhhov1hji9gnk) (version source revid:patrick.crews@sun.com-20081126180318-v685u61mpgoc176x) (pib:5)
[8 Dec 2008 11:34] Bugs System
Pushed into 6.0.9-alpha  (revid:mattias.jonsson@sun.com-20081124162403-pocmhhov1hji9gnk) (version source revid:satya.bn@sun.com-20081126062231-h6os2axygjw27wb4) (pib:5)
[10 Dec 2008 22:30] Jon Stephens
Fix now documented in the 5.1.31 and 6.0.9 changelogs, with the following notes added to above description.

[In 5.1.31 entry:] 
        This bug was introduced in MySQL 5.1.29.

[In 6.0.9 entry:]
        This bug was introduced in MySQL 6.0.5.

Closed.
[12 Dec 2008 23:30] Bugs System
Pushed into 6.0.9-alpha  (revid:bteam@astra04-20081125020458-m0us9vykrs08b9xm) (version source revid:tomas.ulin@sun.com-20081209185954-9svcixh2p5hsfi6w) (pib:5)
[28 Dec 2008 21:46] Bugs System
Pushed into 5.1.31 (revid:joerg@mysql.com-20081228151808-aou2s7esrmastj3e) (version source revid:bteam@astra04-20081126230210-d08cnsctdfsvfag8) (merge vers: 5.1.31) (pib:6)
[19 Jan 2009 11:34] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090108105244-8opp3i85jw0uj5ib) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:10] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:16] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)