Bug #21658 Partition test fails, --ps-protocol
Submitted: 15 Aug 2006 19:15 Modified: 2 Sep 2006 5:22
Reporter: Patrick Galbraith Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S3 (Non-critical)
Version:5.1.12-beta-debug OS:Linux (Suse 10.0, Linux AMD64-bit)
Assigned to: Mikael Ronström CPU Architecture:Any

[15 Aug 2006 19:15] Patrick Galbraith
Description:
When doing a 5.1-engines to 5.1 main merge, I received a failure 
on ./mysql-test-run.pl --ps-protocol partition (using autopush)

Trace:

Program received signal SIGSEGV, Segmentation fault.
0x00000000005dee47 in THD::rollback_item_tree_changes (
    this=<value optimized out>) at sql_class.cc:942
942         *change->place= change->old_value;
(gdb) where
#0  0x00000000005dee47 in THD::rollback_item_tree_changes (
    this=<value optimized out>) at sql_class.cc:942
#1  0x000000000066d7ef in Prepared_statement::cleanup_stmt (this=0x1c5f8a8)
    at sql_prepare.cc:2700
#2  0x000000000066e829 in Prepared_statement::execute (this=0x1c5f8a8, 
    expanded_query=<value optimized out>, open_cursor=false)
    at sql_prepare.cc:2938
#3  0x000000000066ed6d in mysql_stmt_execute (thd=0x1c453a8, 
    packet_arg=<value optimized out>, packet_length=<value optimized out>)
    at sql_prepare.cc:2257
#4  0x0000000000610ad6 in dispatch_command (command=COM_STMT_EXECUTE, 
    thd=0x1c453a8, packet=0x1c49759 "", packet_length=10) at sql_parse.cc:1791
#5  0x0000000000611d48 in do_command (thd=0x1c453a8) at sql_parse.cc:1614
#6  0x0000000000612994 in handle_one_connection (arg=<value optimized out>)
    at sql_parse.cc:1228
#7  0x00002aaaaade0fa5 in start_thread () from /lib64/tls/libpthread.so.0
#8  0x00002aaaab353cb2 in clone () from /lib64/tls/libc.so.6
#9  0x0000000000000000 in ?? ()
#10 0x0000000000000000 in ?? ()

How to repeat:
./mysql-test-run.pl --ps-protocol partition

I can clone this tree to production if needed.
[16 Aug 2006 9:05] Tonci Grgin
Verified as described by reporter.

munja:/home/Tonci/bkwork/mysql-5.1 # bk changes|head
ChangeSet@1.2282, 2006-08-15 21:49:26-07:00, brian@zim.(none) +461 -0
  Cheery  picked Antony's patch to allow for Window's builds in 5.1
ChangeSet@1.2278.2.1, 2006-08-15 20:38:51+02:00, joerg@trift2. +1 -0
  Remove building, packing, and installing the "max" server from the RPM "spec" file.
ChangeSet@1.2278.1.3, 2006-08-15 15:49:31+02:00, mskold@mysql.com +1 -0
  bug #18184  SELECT ... FOR UPDATE does not work..: Updated test result file
ChangeSet@1.2278.1.2, 2006-08-15 14:39:48+02:00, mskold@linux.site +5 -0

[root@FC5X64 mysql-test]# uname -a
Linux FC5X64 2.6.16-1.2111_FC5 #1 SMP Thu May 4 21:16:04 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

[root@FC5X64 mysql-test]# gdb /home/svi/mysql-5-1/libexec/mysqld /home/svi/mysql-5-1/mysql-test/var/log/partition/core.12698
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/libthread_db.so.1".

Core was generated by `/home/svi/mysql-5-1/libexec/mysqld --no-defaults --console --basedir=/home/svi/'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib64/libz.so.1...done.
Loaded symbols for /usr/lib64/libz.so.1
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libnsl.so.1...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
#0  0x0000003674a0a2e3 in pthread_kill () from /lib64/libpthread.so.0

(gdb) where
#0  0x0000003674a0a2e3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000006f0d98 in write_core (sig=11) at stacktrace.c:220
#2  0x00000000005a4c0e in handle_segfault (sig=11) at mysqld.cc:2158
#3  <signal handler called>
#4  0x000000000058fd5d in THD::rollback_item_tree_changes (this=0x15ca938)
    at sql_class.cc:942
#5  0x0000000000622705 in Prepared_statement::cleanup_stmt (this=0x15e89e8)
    at sql_prepare.cc:2700
#6  0x0000000000624069 in Prepared_statement::execute (this=0x15e89e8, expanded_query=Variable "expanded_query" is not available.

) at sql_prepare.cc:2938
#7  0x0000000000625c58 in mysql_stmt_execute (thd=0x15ca938, packet_arg=Variable "packet_arg" is not available.
)
    at sql_prepare.cc:2257
#8  0x00000000005c42ae in dispatch_command (command=COM_STMT_EXECUTE, thd=0x15ca938,
    packet=0x15d1569 "", packet_length=10) at sql_parse.cc:1791
#9  0x00000000005c5516 in do_command (thd=0x15ca938) at sql_parse.cc:1614
#10 0x00000000005c6168 in handle_one_connection (arg=Variable "arg" is not available.
) at sql_parse.cc:1228
#11 0x0000003674a062ea in start_thread () from /lib64/libpthread.so.0
#12 0x00000036729cb73d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
(gdb)
[17 Aug 2006 14:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10585

ChangeSet@1.2284, 2006-08-17 16:01:33+02:00, tnurnberg@salvation.intern.azundris.com +1 -0
  temporarily disabled partition test, fails w/ PS protocol, filed as Bug#21658
[17 Aug 2006 14:05] Mikael Ronström
Disabling a test case might be a good idea but it is not a patch to fix the bug.
[17 Aug 2006 14:07] Tatiana Azundris Nuernberg
FWIW, I also see this on 32 bit (SuSE 10.0) and Ubunta 5.10 on 64 bit.
[17 Aug 2006 14:09] Mikael Ronström
Reproduced it on my RHEL 4 normal working system so can use normal debugging tools.
Doesn't look like a partitioning bug though but we'll see what I'll find at closer analysis.
[18 Aug 2006 11:20] Kristian Nielsen
Some more information of the exact cause of this bug:

The crash in THD::rollback_item_tree_changes occurs when it walks the list of Item ** values to restore to old values. The memory in which the list elements are stored has been overwritten.

Using a gdb watchpoint, I found the point where it is overwritten:

(gdb) bt
#0  0x401261f7 in memset () from /lib/tls/libc.so.6
#1  0x08edecb8 in ?? ()
#2  0x0858c884 in _myfree (ptr=0x8edecd0 '\217' <repeats 200 times>..., 
    filename=0x8700553 "my_alloc.c", lineno=353, myflags=0) at safemalloc.c:310
#3  0x0858b8f8 in free_root (root=0x41e77a28, MyFlags=0) at my_alloc.c:353
#4  0x08266dc4 in closefrm (table=0x41e77400, free_share=false)
    at table.cc:1640
#5  0x082f5664 in ha_create_table (thd=0x8edd2c8, path=0x8f8f8f8f "", 
    db=0x8f1db38 "test", table_name=0x8f1d970 "t1", create_info=0x8f0e9d8, 
    update_create_info=false) at handler.cc:2523
#6  0x082d3cfb in rea_create_table (thd=0x8edd2c8, 
    path=0x41e77da0 "./test/t1", db=0x8f1db38 "test", 
    table_name=0x8f1d970 "t1", create_info=0x8f0e9d8, 
    create_fields=@0x8f8f8f8f, keys=2408550287, key_info=0x8f8f8f8f, 
    file=0x8f0ebf8) at unireg.cc:356
#7  0x08354daa in mysql_create_table_internal (thd=0x8edd2c8, 
    db=0x8f1db38 "test", table_name=0x8f1d970 "t1", lex_create_info=0x8f0b5a8, 
    fields=@0x8f0b4c0, keys=@0x8f0b4b4, internal_tmp_table=false, 
    select_field_count=2408550287, use_copy_create_info=true)
    at sql_table.cc:3487
#8  0x08355578 in mysql_create_table (thd=0x8edd2c8, db=0x8f1db38 "test", 
    table_name=0x8f8f8f8f "", create_info=0x8f8f8f8f, fields=@0x8f8f8f8f, 
    keys=@0x8f8f8f8f, internal_tmp_table=false, select_field_count=2408550287, 
---Type <return> to continue, or q <return> to quit---
    use_copy_create_info=true) at sql_table.cc:3564
#9  0x0823240a in mysql_execute_command (thd=0x8edd2c8) at sql_parse.cc:3022
#10 0x08296838 in Prepared_statement::execute (this=0x8f0afa0, 
    expanded_query=0x41e790c0, open_cursor=false) at sql_prepare.cc:2885
#11 0x08296ce9 in mysql_stmt_execute (thd=0x8edd2c8, packet_arg=0x0, 
    packet_length=2408550287) at sql_prepare.cc:2256
#12 0x08234491 in dispatch_command (command=COM_STMT_EXECUTE, thd=0x8edd2c8, 
    packet=0x8f06791 "\001", packet_length=10) at sql_parse.cc:1791
#13 0x08235ac1 in do_command (thd=0x8edd2c8) at sql_parse.cc:1614
#14 0x0823670a in handle_one_connection (arg=0x8edd2c8) at sql_parse.cc:1228
#15 0x40047a13 in start_thread () from /lib/tls/libpthread.so.0
#16 0x401759da in clone () from /lib/tls/libc.so.6

So it seems free_root() is called on the memory allocated by THD::nocheck_register_item_tree_change() to store the list of Item ** values.
Afterwards, rollback_item_tree_changes() is called, accessing invalid memory.
[19 Aug 2006 16:59] Mikael Ronström
The problem was related to using thd->change_item_tree and later cleaning up by using
rollback_item_tree_change, however during a create table the memory addressed by the
change_list is released before the rollback_item_tree_change is called and also overwritten
and thus crashing.

The solution idea is to create an arena before calling fix_fields in fix_partition_func such that
the code believes we are doing a preparation of a prepared statement.
[30 Aug 2006 16:07] Konstantin Osipov
Reviewed by email, approved with a few comments.
[31 Aug 2006 20:41] Mikael Ronström
fixed in 5.1-kt team tree
[1 Sep 2006 7:59] Mikael Ronström
Patch will appear in 5.1.12
[2 Sep 2006 5:22] Jon Stephens
Documented bugfix in 5.1.12 changelog.