Bug #14928 | Random server crashes on query that has DISTINCT & IN syntax | ||
---|---|---|---|
Submitted: | 14 Nov 2005 19:57 | Modified: | 16 Nov 2005 17:02 |
Reporter: | Nathan Goff | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.0.15 | OS: | Linux (Red Hat EWS 3u4) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[14 Nov 2005 19:57]
Nathan Goff
[15 Nov 2005 9:49]
Valeriy Kravchuk
Thank you for a problem report. Would you, please, send the SHOW CREATE TABLE results for the tables used in the problematic query (Build and Sod). How did you perfromed the upgrade (descibe all steps, please)? Your my.cnf file content and hardware description may be of some use too.
[15 Nov 2005 16:42]
Heikki Tuuri
Hi! The stack trace has similarity to the ones at: http://bugs.mysql.com/bug.php?id=14747 Also in that bug it was a 'complex' SQL query that was causing the crash. But in the stack traces there are crashes in three different places. That suggests a memory corruption bug. Can you run a debug version of mysqld inside gdb, so that we would get a detailed stack trace? Build mysql like this: CFLAGS="-O3 -g" CXXFLAGS="-O3 -g" ./configure make Regards, Heikki
[15 Nov 2005 17:26]
Heikki Tuuri
In 5.0.15, btr_search_build_page_hash_index+510 seems to be about the place marked with #### below: } /* Check that the values for hash index build are sensible */ if (n_fields + n_bytes == 0) { return; } #### if (dict_index_get_n_unique_in_tree(index) < n_fields || (dict_index_get_n_unique_in_tree(index) == n_fields && n_bytes > 0)) { return; }
[15 Nov 2005 17:29]
Heikki Tuuri
Note that also in the other bug report DISTINCT was involved. In theory, this might be massive memory corruption that happens to crash mysqld inside the btr0sea.c routines. But we should get a gdb trace to be sure where it exactly crashes.
[15 Nov 2005 23:06]
Nathan Goff
To get a gdb trace would we just run the pre-compiled debug version of 5.0.15? I will work on getting the other requested info.
[15 Nov 2005 23:31]
Heikki Tuuri
Nathan, I guess that a pre-compiled mysqld debug binary can be used in gdb. But what I usually do, I build mysqld from the source like this: CFLAGS="-O3 -g" CXXFLAGS="-O3 -g" ./configure make The mysqld binary is built to directory /sql in the source tree. Then you have the flexibility to add patches to the source if we come up with something. If you are able to crash it under gdb, please do: (gdb) bt full (gdb) info threads and if some of the threads are in an 'interesting' function, do: (gdb) thread <number of the thread> (gdb) bt full etc. Post the WHOLE output here, including all mysqld output. Regards, Heikki
[15 Nov 2005 23:47]
Nathan Goff
SHOW CREATE TABLE Sod; | Sod |CREATE TABLE `Sod` ( `BuildId` int(11) NOT NULL default '0', `PlatformId` varchar(32) NOT NULL default '', `MessageId` int(11) NOT NULL default '0', `FromSite` varchar(32) NOT NULL default '', `UserId` varchar(32) NOT NULL default '', `PublishFromLocation` varchar(255) NOT NULL default '', `Comments` longtext NOT NULL, `PublishDate` int(11) NOT NULL default '0', `AutoChangeLog` set('y','n') NOT NULL default 'n', `IgnoreSodFiles` tinyint(4) NOT NULL default '0', `ArchiveSite` varchar(32) NOT NULL default '', PRIMARY KEY (`BuildId`), KEY `PlatformId` (`PlatformId`), KEY `FromSite` (`FromSite`), KEY `UserId` (`UserId`), KEY `MessageId` (`MessageId`), CONSTRAINT `Sod_ibfk_4` FOREIGN KEY (`BuildId`) REFERENCES `Build` (`BuildId`), CONSTRAINT `Sod_ibfk_5` FOREIGN KEY (`PlatformId`) REFERENCES `Platform` (`PlatformId`), CONSTRAINT `Sod_ibfk_6` FOREIGN KEY (`FromSite`) REFERENCES `Site` (`SiteName`), CONSTRAINT `Sod_ibfk_7` FOREIGN KEY (`UserId`) REFERENCES `DomainUser` (`UserId`), CONSTRAINT `Sod_ibfk_8` FOREIGN KEY (`MessageId`) REFERENCES `Message` (`MessageId`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | SHOW CREATE TABLE Build; | Build |CREATE TABLE `Build` ( `BuildId` int(11) NOT NULL auto_increment, `UnitName` varchar(32) NOT NULL default '', `UserId` varchar(32) NOT NULL default '', `CompileTypeName` varchar(32) NOT NULL default '', `BuildStart` datetime default NULL, `BuildEnd` datetime default NULL, `BuildComments` text, `ConfigSpec` longtext, `SrcViewType` set('static','dynamic') default 'static', `ReleaseName` varchar(32) default NULL, `ProcessId` int(11) default NULL, `BuildLog` longtext, `SodPublish` set('y','n') default 'n', `BuildStored` longtext, `WinUpdateFrom` varchar(128) NOT NULL default '', `WinUpdateTo` varchar(128) NOT NULL default '', `UnixUpdateFrom` varchar(128) NOT NULL default '', `UnixUpdateTo` varchar(128) NOT NULL default '', `IsUpdate` set('y','n') NOT NULL default 'n', `RestartQueue` set('y','n') NOT NULL default 'n', `NotifyUsers` set('y','n') NOT NULL default 'y', `CreateInstall` set('y','n') NOT NULL default 'n', `SiteName` varchar(32) NOT NULL default '', `EmailReportType` set('small','medium','large','xlarge') default NULL, `MainwinVer` varchar(64) NOT NULL default '', `ReleasedBuild` tinyint(4) NOT NULL default '0', PRIMARY KEY (`BuildId`), KEY `CompileTypeName_index` (`CompileTypeName`), KEY `UserId_index` (`UserId`), KEY `UnitName_index` (`UnitName`), KEY `BuildId_UnitName` (`BuildId`,`UnitName`), KEY `BuildId` (`BuildId`,`UserId`), CONSTRAINT `0_973` FOREIGN KEY (`CompileTypeName`) REFERENCES `CompileType` (`CompileTypeName`), CONSTRAINT `0_974` FOREIGN KEY (`UnitName`) REFERENCES `Unit` (`UnitName`), CONSTRAINT `0_975` FOREIGN KEY (`UserId`) REFERENCES `DomainUser` (`UserId`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
[15 Nov 2005 23:53]
Nathan Goff
my.cnf # Example MySQL config file for large systems. # # This is for a large system with memory = 512M where the system runs mainly # MySQL. # # You can copy this file to # /etc/my.cnf to set global options, # mysql-data-dir/my.cnf to set server-specific options (in this # installation this directory is /usr/local/mysql/data) or # ~/.my.cnf to set user-specific options. # # In this file, you can use all long options that a program supports. # If you want to know which options a program supports, run the program # with the "--help" option. # The following options will be passed to all MySQL clients [client] #password = your_password port = 3306 socket = /tmp/mysql.sock # Here follows entries for some specific programs # The MySQL server [mysqld] user = gbsadmin datadir=/gbs/mysql_data lower_case_table_names = 0 replicate-do-db = gbs port = 3306 socket = /tmp/mysql.sock skip-locking key_buffer = 256M max_allowed_packet = 32M table_cache = 256 sort_buffer_size = 1M read_buffer_size = 1M read_rnd_buffer_size = 4M myisam_sort_buffer_size = 64M thread_cache = 8 query_cache_size= 16M # Try number of CPU's*2 for thread_concurrency thread_concurrency = 8 # Don't listen on a TCP/IP port at all. This can be a security enhancement, # if all processes that need to connect to mysqld run on the same host. # All interaction with mysqld must be made via Unix sockets or named pipes. # Note that using this option without enabling named pipes on Windows # (via the "enable-named-pipe" option) will render mysqld useless! # #skip-networking # Replication Master Server (default) # binary logging is required for replication log-bin # required unique id between 1 and 2^32 - 1 # defaults to 1 if master-host is not set # but will not function as a master if omitted server-id = 4 # Replication Slave (comment out master section to use this) # # To configure this host as a replication slave, you can choose between # two methods : # # 1) Use the CHANGE MASTER TO command (fully described in our manual) - # the syntax is: # # CHANGE MASTER TO MASTER_HOST=<host>, MASTER_PORT=<port>, # MASTER_USER=<user>, MASTER_PASSWORD=<password> ; # # where you replace <host>, <user>, <password> by quoted strings and # <port> by the master's port number (3306 by default). # # Example: # # CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306, # MASTER_USER='joe', MASTER_PASSWORD='secret'; # # OR # # 2) Set the variables below. However, in case you choose this method, then # start replication for the first time (even unsuccessfully, for example # if you mistyped the password in master-password and the slave fails to # connect), the slave will create a master.info file, and any later # change in this file to the variables' values below will be ignored and # overridden by the content of the master.info file, unless you shutdown # the slave server, delete master.info and restart the slaver server. # For that reason, you may want to leave the lines below untouched # (commented) and instead use CHANGE MASTER TO (see above) # # required unique id between 2 and 2^32 - 1 # (and different from the master) # defaults to 2 if master-host is set # but will not function as a slave if omitted #server-id = 2 # # The replication master for this slave - required #master-host = <hostname> # # The username the slave will use for authentication when connecting # to the master - required #master-user = <username> # # The password the slave will authenticate with when connecting to # the master - required #master-password = <password> # # The port the master is listening on. # optional - defaults to 3306 #master-port = <port> # # binary logging - not required for slaves, but recommended #log-bin # Point the following paths to different dedicated disks #tmpdir = /tmp/ #log-update = /path-to-dedicated-directory/hostname # Uncomment the following if you are using BDB tables #bdb_cache_size = 64M #bdb_max_lock = 100000 # Uncomment the following if you are using InnoDB tables innodb_data_home_dir=/gbs/mysql_data innodb_data_file_path=ibdata1:2000M;ibdata2:10M:autoextend innodb_log_group_home_dir=/gbs/mysql_data innodb_log_arch_dir=/gbs/mysql_data # You can set .._buffer_pool_size up to 50 - 80 % # of RAM but beware of setting memory usage too high #innodb_buffer_pool_size = 256M #innodb_additional_mem_pool_size = 20M # Set .._log_file_size to 25 % of buffer pool size innodb_log_file_size=250M innodb_log_files_in_group=2 #innodb_log_buffer_size = 8M #innodb_flush_log_at_trx_commit = 1 #innodb_lock_wait_timeout = 50 [mysqldump] quick max_allowed_packet = 32M [mysql] no-auto-rehash # Remove the next comment character if you are not familiar with SQL #safe-updates [isamchk] key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M [myisamchk] key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M [mysqlhotcopy] interactive-timeout
[16 Nov 2005 0:00]
Nathan Goff
Hardware: Dell PowerEdge 750 3.4 Ghz, 1MB Cache, 800Mhz FSB 2GB DDR, 400Mhz 160GB SATA, 7.2K RPM
[16 Nov 2005 0:06]
Nathan Goff
Upgrade process: We were running 4.1.11. Our datadir has always been outside the mysql tree. Here are the steps we took at all 8 of our production sites that run MySQL and replicate from a master database at one site. 1. Copy mysql-standard-5.0.15-linux-i686-glibc23.tar.gz to /usr/local 2. Shutdown current 4.1 server: service mysql stop 3. Unzip/untar 5.0 and create mysql link to new version in /usr/local 4. Update my.cnf from running as root to running as gbsadmin 5. chown/chgrp /usr/local/mysql and datadir location to gbsadmin/sddbuild instead of root 6. Start 5.0 server: service mysql start 7. Run mysql_fix_privilege_tables 8. Restart server: service mysql restart 9. Reset replication to new 5.0 master
[16 Nov 2005 0:34]
Nathan Goff
Good news! I don't think this is random. I am now able to reproduce it, but not with a known set of queries. It is reproducable with a series of clicks on our PHP gui. I haven't figured out the exact order that causes it, but I can click away and it will crash it. We use XML HTTP requests to dynamically populate dropdowns, so there are rapid queries going to the database. That is where we are seeing the crash. I need some help running mysql in gdb. I can get mysql compiled debug no problem, but how exactly do I bring it up in gbd. We have used MySQL for years and a lot of other open source software, but we have never had to debug something like this. It normally just works :) Any help you can provide on detailed command line commands to run to get mysqld running in gdb would be very helpful.
[16 Nov 2005 0:35]
Nathan Goff
Also...I did read the manual on gdb and mysqld, but it isn't very clear and has a lot of options. I am not sure what you want from it, so I don't know which way to run it.
[16 Nov 2005 1:08]
Nathan Goff
Disregard my previous messages about gdb. I figured it out. The help page wasn't as unclear as I thought once I understood gbd a little. I got it to crash in gdb and ran all the things you wanted me to run. I have no idea what you mean by "interesting" threads, but I have the debugger open still so if you see anything you want more info on send me the gdb commands and I can run them. I have attached a file with the gdb output in it.
[16 Nov 2005 1:09]
Nathan Goff
crashed mysqld in gdb, here is the output
Attachment: gdb_crash (application/octet-stream, text), 16.21 KiB.
[16 Nov 2005 7:42]
Heikki Tuuri
Nathan, thank you. We are now studying the gdb trace. Regards, Heikki
[16 Nov 2005 8:18]
Heikki Tuuri
Nathan, if gdb gives the crash on the right line, then 'index' probably is a garbage pointer, but not NULL. (Note that in the gdb trace some variables and function parameters are shown wrong because the compiler has optimized the code.) btr_search_drop_page_hash_index(): " if (block->index == NULL) { mem_analyze_corruption((byte*)block); ut_a(block->index != NULL); } while (!page_rec_is_supremum(rec)) { /* FIXME: in a mixed tree, not all records may have enough ordering fields: */ offsets = rec_get_offsets(rec, block->index, offsets, n_fields + (n_bytes > 0), &heap); " rec_get_offsets(): " ulint line) /* in: line number where called */ { ulint n; ulint size; ut_ad(rec); ut_ad(index); ut_ad(heap); if (UNIV_LIKELY(index->table->comp)) { #### crashes on this line " Please do: (gdb) print index (gdb) print *index (gdb) print index->table (gdb) print *index->table (gdb) frame 1 (gdb) print *block Since the crash is so easy to repeat, I believe we will know in 3 days what the bug is. The above looks like either block or index is corrupt. That suggests memory corruption caused by a MySQL or InnoDB bug. Regards, Heikki
[16 Nov 2005 11:34]
Heikki Tuuri
Vadim's latest research shows that: http://bugs.mysql.com/bug.php?id=14747 is almost certainly the same bug as this. block->index suddenly becomes NULL in the middle of dropping a page hash index and causes a crash. Regards, Heikki
[16 Nov 2005 11:46]
Heikki Tuuri
Nathan, this could be the other interesting thread: 14 Thread -1608717392 (LWP 29689) 0x08301edd in rec_copy_prefix_to_buf (rec=0x8a27c48 "", index=0x8a27c48, n_fields=1783, buf=0x0, buf_size=0xa01cd7f0) at rem0rec.c:1129 The bug looks like some other thread is suddenly causing block->index to become a NULL pointer. The other thread doing active work is the suspect. It is interesting to know what the other thread is doing. Regards, Heikki
[16 Nov 2005 14:22]
Heikki Tuuri
Nathan, this is the same bug as: http://bugs.mysql.com/bug.php?id=14747 A patch may be available in 30 minutes in that other bug report. Thank you for helping us to find this elusive bug. --Heikki
[16 Nov 2005 17:02]
Nathan Goff
No problem...thanks for helping out! Do you still want me to run those other gdb commands?
[16 Nov 2005 17:28]
Marko Mäkelä
Nathan, thanks, no need to run any more gdb commands. We're pretty sure that the bug you found is a duplicate of Bug #14747, which was just fixed. In case the patch attached to Bug #14747 does not work for you, please reopen this bug.