Bug #38026 crash in maria code after multiple OR conditions are put in where clause
Submitted: 10 Jul 2008 17:19 Modified: 26 Sep 2008 10:13
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Maria storage engine Severity:S1 (Critical)
Version:6.0.6, 6.0.8, 5.1.25-maria OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[10 Jul 2008 17:19] Shane Bester
Description:
running my app testsuite on 6.0.6-bzr from today's public tree got a crash after 500K queries.  It might be a maria or optimizer bug :

mysqld.exe!_ma_skip_pack_key()[ma_search.c:1172]
mysqld.exe!_ma_keynr()[ma_range.c:302]
mysqld.exe!_ma_search_pos()[ma_range.c:226]
mysqld.exe!_ma_record_pos()[ma_range.c:185]
mysqld.exe!maria_records_in_range()[ma_range.c:100]
mysqld.exe!ha_maria::records_in_range()[ha_maria.cc:2595]
mysqld.exe!handler::multi_range_read_info_const()[handler.cc:4072]
mysqld.exe!check_quick_select()[opt_range.cc:7480]
mysqld.exe!get_key_scans_params()[opt_range.cc:4752]
mysqld.exe!get_best_disjunct_quick()[opt_range.cc:3635]
mysqld.exe!SQL_SELECT::test_quick_select()[opt_range.cc:2402]
mysqld.exe!get_quick_record_count()[sql_select.cc:3759]
mysqld.exe!make_join_statistics()[sql_select.cc:4185]
mysqld.exe!JOIN::optimize()[sql_select.cc:1550]
mysqld.exe!mysql_select()[sql_select.cc:2987]
mysqld.exe!handle_select()[sql_select.cc:302]
mysqld.exe!execute_sqlcom_select()[sql_parse.cc:4847]
mysqld.exe!mysql_execute_command()[sql_parse.cc:2015]
mysqld.exe!mysql_open_cursor()[sql_cursor.cc:172]

The query in question was:
select * from `qa00` where `c2` = ? or `c3` = ? or `c4` = ? for update

I built relwithdebuginfo from this rev:

revno: 2674
committer: Matthias Leich <mleich@mysql.com>
branch nick: mysql-6.0-qa-funcs_1
timestamp: Thu 2008-07-03 12:33:10 +0200

How to repeat:
gypsy --queryfile=qa.sql --threads=5 --shuffle-queries=1 --duration=1000 --skip-syntax-warn=1

I'll reduce a testcase later.
[10 Jul 2008 17:26] Guilhem Bichot
Thanks Shane. Better retry with latest 6.0-maria tree, where we have pushed fixes for corruptions which you found (BUG#37276).
[10 Jul 2008 17:31] MySQL Verification Team
testcase. import repeatedly if a crash doesn't happen at first.

Attachment: bug38026.sql (application/unknown, text), 32.79 KiB.

[10 Jul 2008 17:53] MySQL Verification Team
Guilhem, I just built debug build of mysql-maria revision 2662 on windows 64-bit and testcase crashed after importing it twice.  Setting bug to verified.
[10 Jul 2008 20:08] Guilhem Bichot
The final SELECT shows:

==30284== Invalid read of size 1
==30284==    at 0x85164F2: _ma_skip_pack_key (ma_search.c:1172)
==30284==    by 0x85433C2: _ma_search_pos (ma_range.c:302)
==30284==    by 0x8543492: _ma_search_pos (ma_range.c:239)
==30284==    by 0x8543771: _ma_record_pos (ma_range.c:182)
==30284==    by 0x85439FD: maria_records_in_range (ma_range.c:97)
==30284==    by 0x8509DCD: ha_maria::records_in_range(unsigned, st_key_range*, s
t_key_range*) (ha_maria.cc:2764)
==30284==    by 0x8316C05: check_quick_keys(PARAM*, unsigned, SEL_ARG*, unsigned
 char*, unsigned, int, unsigned char*, unsigned, int) (opt_range.cc:7467)
==30284==    by 0x8316EB0: check_quick_select(PARAM*, unsigned, SEL_ARG*, bool)
(opt_range.cc:7242)
==30284==    by 0x83197ED: get_key_scans_params(PARAM*, SEL_TREE*, bool, bool, d
ouble) (opt_range.cc:4779)
==30284==    by 0x8319C87: get_best_disjunct_quick(PARAM*, SEL_IMERGE*, double)
(opt_range.cc:3659)
==30284==    by 0x832013D: SQL_SELECT::test_quick_select(THD*, Bitmap<64>, unsig
ned long long, unsigned long long, bool) (opt_range.cc:2352)
==30284==    by 0x82A7BB8: make_join_statistics(JOIN*, TABLE_LIST*, Item*, st_dy
namic_array*) (sql_select.cc:2428)
==30284==    by 0x82AB428: JOIN::optimize() (sql_select.cc:953)
==30284==    by 0x82B71E3: mysql_select(THD*, Item***, TABLE_LIST*, unsigned, Li
st<Item>&, Item*, unsigned, st_order*, st_order*, Item*, st_order*, unsigned lon
g long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2366
)
==30284==    by 0x82B772E: handle_select(THD*, st_lex*, select_result*, unsigned
 long) (sql_select.cc:280)
==30284==    by 0x8241EB7: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.c
c:4774)
==30284==  Address 0x57cec4d is not stack'd, malloc'd or (recently) free'd

Problem specific of transactional tables.
This line of ma_search.c:
  if ((page_flag & KEYPAGE_FLAG_HAS_TRANSID) && key_has_transid(page-1))
makes Valgrind complain about 'page'.
[15 Jul 2008 18:47] Oleksandr Byelkin
I checked it on last maria+6.0 sources (many times and under valgrind) on my 32 and 64 bit systems (linux) and can't repeat it.

Feel free to open this bug again if you sill can repeat it.
[15 Jul 2008 20:21] Guilhem Bichot
Greetings from vacation. I have the latest 6.0-maria, 32-bit Linux, with latest revision:
[INS 22:18 /m/bzrrepos/mysql-6.0-maria2 $] bzr log -r-1
------------------------------------------------------------
revno: 2681
committer: Sergei Golubchik <serg@mysql.com>
branch nick: 6.0-maria
timestamp: Tue 2008-07-15 10:17:44 +0200
message:
  6.0 -> 6.0-maria merge
    ------------------------------------------------------------
    revno: 2630.22.33
    committer: Alexander Nozdrin <alik@mysql.com>
    branch nick: 6.0
    timestamp: Mon 2008-07-14 20:34:51 +0400
    message:
      Disable tests, which fail too often.
and I have run "make -j2".
Now:
[INS 22:19 /m/bzrrepos/mysql-6.0-maria2 $] cd mysql-test/
[INS 22:19 /m/bzrrepos/mysql-6.0-maria2/mysql-test $] ./mtr --mem --start-and-exit maria --valgrind
Logging: ./mtr --mem --start-and-exit maria --valgrind
MySQL Version 6.0.6
Using dynamic switching of binlog format
Using tmpfs in /dev/shm
Turning on valgrind for all executables
Running valgrind with options " --show-reachable=yes "
Using ndbcluster when necessary, mysqld supports it
Setting mysqld to support SSL connections
Binaries are debug compiled
Using "../libtool" when running valgrind or debugger
Using MTR_BUILD_THREAD      = 67
Using MASTER_MYPORT         = 10670
Using MASTER_MYPORT1        = 10671
Using SLAVE_MYPORT          = 10672
Using SLAVE_MYPORT1         = 10673
Using SLAVE_MYPORT2         = 10674
Using NDBCLUSTER_PORT       = 10675
Using NDBCLUSTER_PORT_SLAVE = 10676
Killing Possible Leftover Processes
Removing Stale Files
Creating Directories
Symlinking 'var' to '/dev/shm/var67'
Installing Master Database
=======================================================

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

Servers started, exiting
[INS 22:19 /m/bzrrepos/mysql-6.0-maria2/mysql-test $] ../client/mysql -u root -S var/tmp/master.sock test -A < /tmp/monty/bug38026.sql
[INS 22:19 /m/bzrrepos/mysql-6.0-maria2/mysql-test $] cat var/log/master.err
CURRENT_TEST: main.maria
==19411== Memcheck, a memory error detector.
==19411== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==19411== Using LibVEX rev 1804, a library for dynamic binary translation.
==19411== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==19411== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==19411== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==19411== For more details, rerun with: -v
==19411==
080715 23:19:44 [Note] Plugin 'InnoDB' disabled by command line option
080715 23:19:49 [Note] Event Scheduler: Loaded 0 events
080715 23:19:49 [Note] /home/mysql_src/bzrrepos/mysql-6.0-maria2/sql/mysqld: ready for connections.
Version: '6.0.6-alpha-valgrind-max-debug-log'  socket: '/home/mysql_src/bzrrepos/mysql-6.0-maria2/mysql-test/var/tmp/master.sock'  port: 10670  Source distribution

So, now warnings; now I just run the SQL a second time:
[INS 22:20 /m/bzrrepos/mysql-6.0-maria2/mysql-test $] ../client/mysql -u root -S var/tmp/master.sock test -A < /tmp/monty/bug38026.sql
[INS 22:20 /m/bzrrepos/mysql-6.0-maria2/mysql-test $] cat var/log/master.err
CURRENT_TEST: main.maria
==19411== Memcheck, a memory error detector.
==19411== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==19411== Using LibVEX rev 1804, a library for dynamic binary translation.
==19411== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==19411== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==19411== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==19411== For more details, rerun with: -v
==19411==
080715 23:19:44 [Note] Plugin 'InnoDB' disabled by command line option
080715 23:19:49 [Note] Event Scheduler: Loaded 0 events
080715 23:19:49 [Note] /home/mysql_src/bzrrepos/mysql-6.0-maria2/sql/mysqld: ready for connections.
Version: '6.0.6-alpha-valgrind-max-debug-log'  socket: '/home/mysql_src/bzrrepos/mysql-6.0-maria2/mysql-test/var/tmp/master.sock'  port: 10670  Source distribution
==19411== Thread 15:
==19411== Invalid read of size 1
==19411==    at 0x8661B02: _ma_skip_pack_key (ma_search.c:1172)
==19411==    by 0x868E9A2: _ma_search_pos (ma_range.c:302)
==19411==    by 0x868EA72: _ma_search_pos (ma_range.c:239)
==19411==    by 0x868ED51: _ma_record_pos (ma_range.c:182)
==19411==    by 0x868EFDD: maria_records_in_range (ma_range.c:97)
==19411==    by 0x86557DD: ha_maria::records_in_range(unsigned, st_key_range*, st_key_range*) (ha_maria.cc:2647)
==19411==    by 0x83923B1: handler::multi_range_read_info_const(unsigned, st_range_seq_if*, void*, unsigned, unsigned*, unsigned*, COST_VECT*) (handler.cc:4071)
==19411==    by 0x83778B6: check_quick_select(PARAM*, unsigned, bool, SEL_ARG*, bool, unsigned*, unsigned*, COST_VECT*) (opt_range.cc:7480)
==19411==    by 0x837F6FE: get_key_scans_params(PARAM*, SEL_TREE*, bool, bool, double) (opt_range.cc:4751)
==19411==    by 0x837FAB2: get_best_disjunct_quick(PARAM*, SEL_IMERGE*, double) (opt_range.cc:3635)
==19411==    by 0x8386203: SQL_SELECT::test_quick_select(THD*, Bitmap<64>, unsigned long long, unsigned long long, bool, bool) (opt_range.cc:2402)
==19411==    by 0x83051A3: make_join_statistics(JOIN*, TABLE_LIST*, Item*, st_dynamic_array*) (sql_select.cc:3758)
==19411==    by 0x830D681: JOIN::optimize() (sql_select.cc:1549)
==19411==    by 0x8318E16: mysql_select(THD*, Item***, TABLE_LIST*, unsigned, List<Item>&, Item*, unsigned, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2987)
==19411==    by 0x831935E: handle_select(THD*, st_lex*, select_result*, unsigned long) (sql_select.cc:302)
==19411==    by 0x829DC47: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:4856)
==19411==  Address 0x75737f5 is 181 bytes inside a block of size 576 free'd
==19411==    at 0x40233FC: free (vg_replace_malloc.c:323)
==19411==    by 0x876C02C: my_no_flags_free (my_malloc.c:59)
==19411==    by 0x876C874: free_root (my_alloc.c:349)
==19411==    by 0x82AA324: dispatch_command(enum_server_command, THD*, char*, unsigned) (sql_parse.cc:1461)
==19411==    by 0x82AA595: do_command(THD*) (sql_parse.cc:724)
==19411==    by 0x82987F0: handle_one_connection (sql_connect.cc:1153)
==19411==    by 0x406A111: start_thread (in /lib/libpthread-2.5.so)
==19411==    by 0x42B12ED: clone (in /lib/libc-2.5.so)
[INS 22:20 /m/bzrrepos/mysql-6.0-maria2/mysql-test $]
[15 Jul 2008 20:54] Oleksandr Byelkin
So it should be maria test instead attached to the bug report?
[15 Jul 2008 21:25] Oleksandr Byelkin
OK, I got it was just to start server.
But I can't repeat it by this recipe:
bell@hplaptop ~/mysql/bzr/work-maria-test/mysql-test $ cat var/log/master.err
CURRENT_TEST: main.maria
==11235== Memcheck, a memory error detector.
==11235== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==11235== Using LibVEX rev 1732, a library for dynamic binary translation.
==11235== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==11235== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==11235== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==11235== For more details, rerun with: -v
==11235==
080716  0:17:04 [Note] Plugin 'InnoDB' disabled by command line option
080716  0:17:12 [Note] Event Scheduler: Loaded 0 events
080716  0:17:12 [Note] /home/bell/mysql/bzr/work-maria-test/sql/mysqld: ready for connections.
Version: '6.0.6-alpha-valgrind-max-debug-log'  socket: '/home/bell/mysql/bzr/work-maria-test/mysql-test/var/tmp/master.sock'  port: 9306  Source distribution
080716  0:21:26 [Note] Got signal 15 to shutdown mysqld
080716  0:21:26 [Note] /home/bell/mysql/bzr/work-maria-test/sql/mysqld: Normal shutdown

080716  0:21:26 [Note] Event Scheduler: Purging the queue. 0 events
bell@hplaptop ~/mysql/bzr/work-maria-test/mysql-test $ 

Could you please say:
1) is it clean build? (my is clean)
2) which script used to build? (I used pentium-valgrind-max and amd64-valgrind-max)
3) which compiler used? (4.1.2)
4) Maybe some other ideas why the result is so different?
[16 Jul 2008 8:19] MySQL Verification Team
happened on DELETE .. WHERE ..=.. also:

mysqld.exe!_ma_skip_pack_key()[ma_search.c:1159]
mysqld.exe!_ma_keynr()[ma_range.c:302]
mysqld.exe!_ma_search_pos()[ma_range.c:226]
mysqld.exe!_ma_record_pos()[ma_range.c:184]
mysqld.exe!maria_records_in_range()[ma_range.c:100]
mysqld.exe!ha_maria::records_in_range()[ha_maria.cc:2765]
mysqld.exe!check_quick_keys()[opt_range.cc:7467]
mysqld.exe!check_quick_select()[opt_range.cc:7242]
mysqld.exe!get_key_scans_params()[opt_range.cc:4779]
mysqld.exe!SQL_SELECT::test_quick_select()[opt_range.cc:2305]
mysqld.exe!SQL_SELECT::check_quick()[opt_range.h:718]
mysqld.exe!mysql_delete()[sql_delete.cc:175]
mysqld.exe!mysql_execute_command()[sql_parse.cc:3084]
mysqld.exe!mysql_parse()[sql_parse.cc:5656]
mysqld.exe!dispatch_command()[sql_parse.cc:1131]
mysqld.exe!do_command()[sql_parse.cc:789]
mysqld.exe!handle_one_connection()[sql_connect.cc:1115]
mysqld.exe!pthread_start()[my_winthread.c:86]
mysqld.exe!_callthreadstart()[thread.c:295]
mysqld.exe!_threadstart()[thread.c:277]
kernel32.dll!BaseThreadStart()
ariables.
 invalid and cause the dump to abort...
0004EB8120=delete from `table66` where `col32` = 'diners'
[16 Jul 2008 8:28] Guilhem Bichot
See also Shane's previous post.
1) is it clean build? (my is clean)
you can assume yes, as both Shane on Windows and me on Linux got similar stack traces
2) which script used to build? (I used pentium-valgrind-max and amd64-valgrind-max)
pentium-valgrind-max
3) which compiler used? (4.1.2)
4.1.2, probably irrelevant as Shane got it on Windows
4) Maybe some other ideas why the result is so different?
Try running the SQL several times against the same mysqld; for me it took two runs.
If it does not work, try another machine in Uppsala.
[16 Jul 2008 8:36] MySQL Verification Team
Guilhem, Sanja, if you wait another day or two I'll upload a C testcase that will cause this crash, as well as all the others (38111, 38113,38114,38116,38135 ...).
[16 Jul 2008 10:00] Oleksandr Byelkin
I'd prefer sequence of statement (simpler to work) but if nothing other helps C code is OK.

And yes I run it about 10 times (at least more then 5 for sure) maybe it was just wrong phase of the moot I'll rebuild and retry at once
[16 Jul 2008 11:12] Oleksandr Byelkin
OK, I was able to repeat it, the trick was do not restart server before second try.
[16 Jul 2008 21:43] Oleksandr Byelkin
It is right test suite script to repeat the bug for sure (without different connection it shows other valgrind problems)

Attachment: test.test (text/plain), 65.77 KiB.

[22 Aug 2008 14:19] Sergei Golubchik
Cannot repeat it with either Sanja's or Shane's test case
[3 Sep 2008 21:08] Guilhem Bichot
I can repeat it with latest mysql-maria, following instructions posted in
[15 Jul 22:21] (note well: run the SQL twice). I get:
==4757== Thread 4:
==4757== Invalid read of size 1
==4757==    at 0x85EE09B: _ma_skip_pack_key (ma_search.c:1171)
==4757==    by 0x8626844: _ma_keynr (ma_range.c:302)
==4757==    by 0x862647E: _ma_search_pos (ma_range.c:226)
==4757==    by 0x86264FE: _ma_search_pos (ma_range.c:239)
==4757==    by 0x86261DC: _ma_record_pos (ma_range.c:182)
==4757==    by 0x8625E98: maria_records_in_range (ma_range.c:97)
==4757==    by 0x85D9AF7: ha_maria::records_in_range(unsigned, st_key_range*, st_key_range*) (ha_maria.cc:2793)
==4757==    by 0x8390925: check_quick_keys(PARAM*, unsigned, SEL_ARG*, unsigned char*, unsigned, int, unsigned char*, unsigned, int) (opt_
range.cc:7467)
==4757==    by 0x8390C33: check_quick_select(PARAM*, unsigned, SEL_ARG*, bool) (opt_range.cc:7242)
==4757==    by 0x8390F7C: get_key_scans_params(PARAM*, SEL_TREE*, bool, bool, double) (opt_range.cc:4779)
==4757==    by 0x83979E9: get_best_disjunct_quick(PARAM*, SEL_IMERGE*, double) (opt_range.cc:3659)
==4757==    by 0x8398EF7: SQL_SELECT::test_quick_select(THD*, Bitmap<64>, unsigned long long, unsigned long long, bool) (opt_range.cc:2352
)
==4757==    by 0x830DFAA: get_quick_record_count(THD*, SQL_SELECT*, st_table*, Bitmap<64> const*, unsigned long long) (sql_select.cc:2427)
==4757==    by 0x831233B: make_join_statistics(JOIN*, TABLE_LIST*, Item*, st_dynamic_array*) (sql_select.cc:2836)
==4757==    by 0x8314CCE: JOIN::optimize() (sql_select.cc:952)
==4757==    by 0x8318228: mysql_select(THD*, Item***, TABLE_LIST*, unsigned, List<Item>&, Item*, unsigned, st_order*, st_order*, Item*, st
_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2365)
==4757==  Address 0x59c1295 is not stack'd, malloc'd or (recently) free'd
[26 Sep 2008 10:13] Michael Widenius
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

The reason for the crash was that skip_record didn't work with NULL CHAR keys. This was also the cause for the Bug #39243 "SELECT WHERE does not find row"