Bug #15140 MySQL crashes on RENAME after INSERT DELAYED
Submitted: 22 Nov 2005 16:35 Modified: 12 Jun 2006 8:57
Reporter: Carlos Uldérico Cirello Filho Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S2 (Serious)
Version:5.0.18, 5.0.16-standard OS:Linux (RH 9 (2.4.20-8smp))
Assigned to: Ingo Strüwing CPU Architecture:Any

[22 Nov 2005 16:35] Carlos Uldérico Cirello Filho
Description:
MySQL is crashing on erratic ways. There's no pattern of crashing. 

One of the recomendations of MySQL manual was applied. Several tables had their varchar fields swaped to char fields.

There's few resolved stacks.

0x814e7f8 handle_segfault + 356
0x40049618 _end + 934366152
(nil)
0x82a67e9 btr_search_build_page_hash_index + 509
0x82a4d98 btr_search_info_update_slow + 264
0x829a1a4 btr_cur_search_to_nth_level + 2068
0x827c0ef row_search_for_mysql + 6943
0x81f302b _ZN11ha_innobase10index_readEPcPKcj16ha_rkey_function + 303
0x81a0f1e _Z13join_read_keyP13st_join_table + 242
0x8192ff3 _Z10sub_selectP4JOINP13st_join_tableb + 255
0x8198ff5 _Z20evaluate_join_recordP4JOINP13st_join_tableiPc + 329
0x8192fa2 _Z10sub_selectP4JOINP13st_join_tableb + 174
0x8198ca9 _Z9do_selectP4JOINP4ListI4ItemEP8st_tableP9Procedure + 281
0x818f8b2 _ZN4JOIN4execEv + 4018
0x812b9e3 _ZN30subselect_single_select_engine4execEv + 95
0x8128127 _ZN14Item_subselect4execEv + 19
0x812905b _ZN17Item_in_subselect8val_boolEv + 19
0x80f0307 _ZN4Item15val_bool_resultEv + 15
0x8108c83 _ZN17Item_in_optimizer7val_intEv + 71
0x8198f59 _Z20evaluate_join_recordP4JOINP13st_join_tableiPc + 173
0x8192fa2 _Z10sub_selectP4JOINP13st_join_tableb + 174
0x8198ca9 _Z9do_selectP4JOINP4ListI4ItemEP8st_tableP9Procedure + 281
0x818f8b2 _ZN4JOIN4execEv + 4018
0x81903cd _Z12mysql_selectP3THDPPP4ItemP13st_table_listjR4ListIS1_ES2_jP8st_orderSB_S2_SB_mP13select_resultP18st_select_lex_unitP13st_sel + 305
0x818c6ba _Z13handle_selectP3THDP6st_lexP13select_resultm + 234
0x81621ad _Z21mysql_execute_commandP3THD + 669
0x8168d82 _Z11mysql_parseP3THDPcj + 306
0x8160aa6 _Z16dispatch_command19enum_server_commandP3THDPcj + 1178
0x81605d1 _Z10do_commandP3THD + 129
0x815fab1 handle_one_connection + 569
0x400442b6 _end + 934344806
0x420de407 _end + 968530359

----

0x814e7f8 handle_segfault + 356
0x40049618 _end + 934366152
0x40044dea _end + 934347674
0x4202726c _end + 967780380
0x40044dea _end + 934347674
0x81a9be2 handle_delayed_insert + 426
0x400442b6 _end + 934344806
0x420de407 _end + 968530359

How to repeat:
Situations acknowledged were:

INSERT DELAYED into TABLES (MyISAM, INNODB not checked) with varchar fields.

UPDATE LOW_PRIORITY tables (MyISAM AND INNODB, it seems) with no special kind of fields.

Suggested fix:
Eventually the thread crashes but the server doesn't. When this happens, just try again as one would with deadlocks.

When server crashes, safe_mysqld restarts it by itself. In this situation I see no workarounds.

As I am not acquainted with MySQL internals, there's no much I can suggested for fixing the bug.
[22 Nov 2005 17:03] Valeriy Kravchuk
Thank you for a problem report. Please, describe all the steps you performed to upgrade from 4.1. 

Your report looks very similar to some I saw already... Check http://bugs.mysql.com/bug.php?id=13707, for example. That bug should be fixed in 5.0.16 now available, so, please, try the newer version and inform about the results.
[22 Nov 2005 17:16] Carlos Uldérico Cirello Filho
1. Cleanly shutdown MySQL 4.1.13 thru mysqladmin shutdown
2. Backed up with regular cp -vr
3. Ungziped the MySQL 5.0.15 in the /usr/local
4. Updated the new my.cnf so as to not conflict INNODB files from the older relase.
The changes made were:
tmpdir          = /usr/local/mysql-5.0.15/tmp/
innodb_buffer_pool_size = 512M
innodb_additional_mem_pool_size = 60M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 512M
innodb_log_buffer_size = 128M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50

5. Moved the data dir from one to another.
6. Started MySQL 5.0.15.

As you all can see, I've not fix privileges. And I am still using OLD PASSWORD style.
[22 Nov 2005 17:42] Valeriy Kravchuk
The first problem you got (with INSERT DELAYED on tables with varchar columns) is almost surely a duplicate of http://bugs.mysql.com/bug.php?id=13707.

So, please, try to check both types of your problematic statements with version 5.0.16 and inform about the results.

Can you restore 4.1.x back, make a dump of your data, then install 5.0.x and restore? This is the only known way of upgrade that almost always give you a proper results. Copying just data gives weird problems really often.
[22 Nov 2005 18:10] Carlos Uldérico Cirello Filho
It seems that the insert delayed is not crashing anymore.

But, after the upgrade I've made (which I did just gunziping, besides you advice), every single alter table crashes the server offering as stack trace these 3 seqüences:

On 
(Same table, MyISAM engine) 

ALTER TABLE 'table' ORDER BY field DESC

0x8150650 handle_segfault + 356
0x40038618 _end + 934226664
0x40033dea _end + 934208186
0x4202726c _end + 967710524
0x40033dea _end + 934208186
0x81ac76a handle_delayed_insert + 426
0x400332b6 _end + 934205318
0x420de407 _end + 968460503

ALTER TABLE RENAME (this try failed)
0x8150650 handle_segfault + 356
0x40038618 _end + 934226664
0x40033dea _end + 934208186
0x4202726c _end + 967710524
0x40033dea _end + 934208186
0x81ac76a handle_delayed_insert + 426
0x400332b6 _end + 934205318
0x420de407 _end + 968460503

ALTER TABLE RENAME (this is one worked)
0x8150650 handle_segfault + 356
0x40038618 _end + 934226664
0x40033dea _end + 934208186
0x4202726c _end + 967710524
0x40033dea _end + 934208186
0x81ac76a handle_delayed_insert + 426
0x400332b6 _end + 934205318
0x420de407 _end + 968460503
[23 Nov 2005 9:05] Heikki Tuuri
Hi!

The InnoDB crash is almost certainly this:

http://bugs.mysql.com/bug.php?id=14747

It is fixed in upcoming 5.0.17.

Regards,

Heikki
[23 Nov 2005 10:35] Carlos Uldérico Cirello Filho
Well it seems that it was just an INSERT DELAYED problem. By whatever were the fix in 5.0.16 patch, it solved the INSERT DELAYED problem, but tables that are receiving INSERT DELAYED queries cannot receive simultaneous ALTER TABLE queries. 

Thusly, it's not very much to do with the Engine.
[23 Nov 2005 18:51] Valeriy Kravchuk
Please, send the real SHOW CREATE TABLE for the tables, ALTER TABLE and INSERT DELAYED statements that gives you these crashes on 5.0.16.
[24 Nov 2005 12:07] Carlos Uldérico Cirello Filho
These are the queries that Valeriy Kravchuk has asked.

One VERY IMPORTANT thing: INSERT DELAYED's are not crashing anymore in 5.0.16, they crashed on 5.0.15. 
However, ALTER TABLE RENAME crashes on tables that receieves INSERT DELAYED queries concurrently.

I tested to rename a table with same structure by with same content but that was not processing queries, it DIDN'T crashed. I do believe that's something to do yet with DELAYED problems.

The queries.

CREATE TABLE `log` (
	`id_log` bigint(10) unsigned NOT NULL auto_increment,
	`table` varchar(255) default NULL,  
	`field` varchar(255) default NULL,  
	`OLD` varchar(255) default NULL,  
	`NEW` varchar(255) default NULL,  
	`user` varchar(50) default NULL,  
	`key_field` varchar(25) default NULL,  
	`key_value` varchar(255) default NULL,  
	`remark` varchar(255) default NULL,  
	`time` datetime default NULL,  
	`action` char(1) NOT NULL default 'u',  
	PRIMARY KEY  (`id_log`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

ALTER TABLE `log` RENAME `log_20051124` ;

INSERT DELAYED INTO log (`table`,`field`, `OLD`, `NEW`, `user`, `key_field`, `key_value`, `remark`, `time`, `action`) VALUES ('customer', 'co_id_commercial_20percent', '0', '', 'soroso', 'id_customer', '338', 'Cliente Fernando Antunes Panissi Da Silva Sauro(338)', NOW(), 'u'),('customer', 'email_porks', '0', '', 'soroso', 'id_customer', '338', 'Cliente Fernando Antunes Panissi Da Silva Sauro(338)', NOW(), 'u'),('customer', 'email_porks_second', '0', '', 'soroso', 'id_customer', '338', 'Cliente Fernando Antunes Panissi Da Silva Sauro(338)', NOW(), 'u'),('customer', 'board_id', '0', '', 'soroso', 'id_customer', '338', 'Cliente Fernando Antunes Panissi Da Silva Sauro(338)', NOW(), 'u'),('customer', 'ad_id', '0', '', 'soroso', 'id_customer', '338', 'Cliente Fernando Antunes Panissi Da Silva Sauro(338)', NOW(), 'u')
[25 Nov 2005 16:00] Valeriy Kravchuk
How many rows are there in that log table? If I understood you right, I have to fill the table with some rows, then start INSERT DELAYED and then, immediately, that ALTER TABLE .. RENAME, from the other session? 

Can you create a simplest possible test case, with data dump and something like a shell script to repeat the crash you describe?
[26 Nov 2005 10:17] Valeriy Kravchuk
PHP will be OK for a test case. Just don't forget to use mysqli to work with 5.0.x server.
[29 Nov 2005 10:35] Carlos Uldérico Cirello Filho
I'm having problems to create a sucessful test case. Please, hold on a day more please.
[29 Nov 2005 15:30] Valeriy Kravchuk
Please, send the test case when it will be ready. There is no need to hurry.
[30 Nov 2005 17:08] Carlos Uldérico Cirello Filho
Hi Valeriy Kravchuk,

I've attached the test case!

Make good use of it! I did in Bash... it seemed easier to do in such languagues, does it help?

Regards,

Carlos Uldérico
[4 Dec 2005 14:44] Valeriy Kravchuk
Thank you for a test case. I modified it slightly (other username and database name) and run as follows:

[openxs@Fedora 5.0]$ bash 15140.sh
Starting the test case...
Creating test table - [openxs@Fedora 5.0]$ OK
Inserting delayed lines (each dot is a set of 10 lines)
.................You shall see an error of connection. Keep mysql.err open to see crashing going on.
Renaming table
Sleeping 5s - You'll see some errors from now on...
.ERROR 1146 (42S02) at line 2: Table 'test.test_for_crash' doesn't exist
.ERROR 1146 (42S02) at line 2: Table 'test.test_for_crash' doesn't exist
.ERROR 1146 (42S02) at line 2: Table 'test.test_for_crash' doesn't exist
.ERROR 1146 (42S02) at line 2: Table 'test.test_for_crash' doesn't exist
.ERROR 1146 (42S02) at line 2: Table 'test.test_for_crash' doesn't exist
Undoing name change
You shouldn't be seeing errors anylonger
................................................................................
................................................................................

Then I killed bash from the other terminal. But I had not got any crashes:

[openxs@Fedora 5.0]$ tail -10 var/Fedora.err
051204 17:02:39  InnoDB: Starting shutdown...
051204 17:02:41  InnoDB: Shutdown completed; log sequence number 0 48997821
051204 17:02:41 [Note] /home/openxs/dbs/5.0/libexec/mysqld: Shutdown complete

051204 17:02:41  mysqld ended

051204 17:22:37  mysqld started
051204 17:22:38  InnoDB: Started; log sequence number 0 48997821
051204 17:22:39 [Note] /home/openxs/dbs/5.0/libexec/mysqld: ready for connections.
Version: '5.0.17'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
[openxs@Fedora 5.0]$ cat 15140.sh
#!/bin/bash

echo "Starting the test case...";
bash thread1.sh &
bash thread2.sh &
[openxs@Fedora 5.0]$ uname -a
Linux Fedora 2.4.22-1.2115.nptl #1 Wed Oct 29 15:42:51 EST 2003 i686 i686 i386 GNU/Linux

I used latest 5.0.17-BK build (ChangeSet@1.2041, 2005-12-01 15:10:35-08:00).
[5 Dec 2005 0:57] Carlos Uldérico Cirello Filho
Well I runned it on a RH 9 with 2.4.20-8smp.

I'll try to run on another machine with a different version of Linux and reproduce test case in 5.0.16.

I'll also apply a complete DUMP and RELOAD of data into database as you once told me in the same machine keeping 5.0.16.

I shall send you a more detailed report of the proceedings. I might be facing an ill-done upgrade or a library conflict (perhaps?).

I shall response shortly and set this bug to closed eventually.
[6 Dec 2005 14:04] Carlos Uldérico Cirello Filho
I'm closing this bug because I didn't managed to reproduce the crash on the others scenarios I proposed.

Most likely an ill-done upgrade.

Sorry for troubling.
[23 Dec 2005 21:39] Carlos Uldérico Cirello Filho
I am reopening this bug sharing the discovery I made.

This week, I've upgraded the MySQL from 5.0.16-standard to 5.0.17-standard. It kept crashing. Reading further about statics buildings, I've decided to try 5.0.17-standard-static and it stopped crashing.

As it's been told, my box is a RedHat Linux 9 with 2.4.20-8smp kernel running on a P4 2.6 HT. 

Thus I can firmly assert that MySQL 5.0.15-17 ARE NOT FULLY COMPATIBLE with RH 9.

Thanks.
[26 Dec 2005 15:23] Valeriy Kravchuk
Please, specify, what exact RPM package of 5.0.17 gives you crash. Have you tried to repeat the crash on a fresh copy of 5.0.17? Is it repeatable with your previous test case?

I have RedHat 9 installed now, so I'll be able to repeat the problem in the same exact environment. Just need this additional information.
[26 Dec 2005 15:32] Carlos Uldérico Cirello Filho
Well, I didn't run the test case I sent you last time. Anyway, I DID by hand what the TC would do. INSERT DELAYED on a table, trying RENAME IT right after (at this point, server crashes).

I used mysql-standard-5.0.17-linux-i686-glibc23.tar.gz on RH 9:

# rpm -qa | grep -i kernel
kernel-smp-2.4.20-8
kernel-2.4.20-8
kernel-pcmcia-cs-3.1.31-13
kernel-source-2.4.20-8

# rpm -qa | grep -i glibc
glibc-kernheaders-2.4-8.10
glibc-common-2.3.2-11.9
glibc-2.3.2-11.9
glibc-devel-2.3.2-11.9

# cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.60GHz
stepping        : 9
cpu MHz         : 2598.795
cache size      : 512 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 5190.45

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.60GHz
stepping        : 9
cpu MHz         : 2598.795
cache size      : 512 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 5190.45

(it's a P4 HT).

# free -m
             total       used       free     shared    buffers     cached
Mem:          2015       1943         71          0        234        862
-/+ buffers/cache:        846       1168
Swap:         2047          9       2038

# uname -a
Linux #########.#####.###### 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hdc1            1012M  326M  635M  34% /
/dev/hdc6             4.0G  437M  3.4G  12% /home
none                 1008M     0 1008M   0% /dev/shm
/dev/hdc7             4.0G  1.9G  2.0G  49% /usr
/dev/hdc5              15G   10G  4.1G  72% /var
/dev/hdc2              15G  225M   14G   2% /var/log
/dev/hdc3              15G  6.2G  7.9G  44% /var/lib/mysql
[...]
[26 Dec 2005 15:34] Carlos Uldérico Cirello Filho
I've tried on 5.0.16 a fresh reload of database. Yet it didn't worked. I've not tested it on 5.0.17. Do you really need such information? I may take a day or two to test it for you.
[4 Jan 2006 11:23] Valeriy Kravchuk
Do not spend your time testing on 5.0.17. Now it makes sense to use 5.0.18 for testing, and I'll run your test case on a fresh installation of 5.0.18 on RH9, as soon as download finished.

Coming back to your previous message:

"Anyway, I DID by hand what the TC would do. INSERT DELAYED on a table, trying RENAME IT right after (at this point, server crashes)."

Looks like you managed to find a simpler way to reproduce the problem. Can you, please, send the exact sequence of statements?
[4 Jan 2006 14:29] Valeriy Kravchuk
I've just installed the latest available 5.0.18 on my RH9, and was able to get the crash with your test case:

[openxs@redhat9 mysql-standard-5.0.18-linux-i686-glibc23]$ bash 15140.sh
Starting the test case...
Creating test table - [openxs@redhat9 mysql-standard-5.0.18-linux-i686-glibc23]$
 OK
Inserting delayed lines (each dot is a set of 10 lines)
..................You shall see an error of connection. Keep mysql.err open to see crashing going on.
Renaming table
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
...

In the error log I've got:

060104 15:37:32  InnoDB: Started; log sequence number 0 0
060104 15:37:32 [Note] /home/openxs/mysql-standard-5.0.18-linux-i686-glibc23/bin
/mysqld: ready for connections.
Version: '5.0.18-standard'  socket: '/tmp/mysql.sock'  port: 3306  MySQL Community Edition - Standard (GPL)
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=1
max_connections=100
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 225791 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x8a2b0c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x41f5afd4, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81536e8
0x40042618
0x4003ddea
0x4202726c
0x4003ddea
0x81af17f
0x4003d2b6
0x420de407
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow
instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x89f5998 = HS_for_crash
thd->thread_id=4
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
060104 16:20:23  mysqld restarted
060104 16:20:24  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
060104 16:20:25  InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 0 43655.
InnoDB: Doing recovery: scanned up to log sequence number 0 43655
060104 16:20:25  InnoDB: Started; log sequence number 0 43655
060104 16:20:25 [Note] /home/openxs/mysql-standard-5.0.18-linux-i686-glibc23/bin
/mysqld: ready for connections.
Version: '5.0.18-standard'  socket: '/tmp/mysql.sock'  port: 3306  MySQL Community Edition - Standard (GPL)

So, it really looks like a Red Hat 9-specific bug in MySQL binaries up to the latest 5.0.18 version.
[4 Jan 2006 14:46] Carlos Uldérico Cirello Filho
Well, it seems I wasn't so crazy, was I? :) 

Try static standard version... It'll not crash. It's a nice guess comparing them.

Thank you!
[12 Jun 2006 8:57] Ingo Strüwing
I do not have a Redhat9 system available. I tried on my home system (Debian) and on production.mysql.com (Fedora Core 2). But both have 2.6 kernels. It seems that none of our build machines is Redhat9. So it might be posible that we don't support it officially. Anyway, I don't know of any Redhat 9 system accessible by me.

I fixed a couple of INSERT DELAYED bugs with the patch for Bug# 16218. It is not yet approved and hence not part of any released version nor in any source repository. I suggest you wait until this is done and try again.

The stack backtraces above don't help. The first one comes from a select statement, which is not part of the test case. The other ones don't contain symbols. They can be retrieved with an exactly matching binary at hand. If you want to help, next time after a crash please follow the instructions at the bottom of the backtrace.

Regards, Ingo