Bug #4552 | mysql crash from load data replication | ||
---|---|---|---|
Submitted: | 14 Jul 2004 21:49 | Modified: | 7 Aug 2004 14:11 |
Reporter: | Dathan Pattishall | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 4.0.18 | OS: | Linux (Linux dbuser1.gbxsc.friendster.c) |
Assigned to: | Guilhem Bichot | CPU Architecture: | Any |
[14 Jul 2004 21:49]
Dathan Pattishall
[15 Jul 2004 0:13]
Guilhem Bichot
Dathan, my dear customer, "it killed all the slaves" is not sufficient info for me to have an idea of what happened. Please, could you copy-paste here the error message(s) that was possibly printed into the slave's error log or SHOW SLAVE STATUS? Thanks! Guilhem
[15 Jul 2004 0:17]
Dathan Pattishall
This was the msg mysql left 040713 3:18:52 Slave I/O thread: connected to master 'dbslave@dbmstr1:3306', replication started in log 'db-bin.078' at position 515 mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=134217728 read_buffer_size=4190208 max_used_connections=46 max_connections=300 threads_connected=11 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 4430669 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. this is a 64 bit sever with 8 GB of ram
[15 Jul 2004 0:20]
Dathan Pattishall
Additionally to fix the problem I deleted dbslave-bin.* and relay-log.info and incremented the master.info to past the load data contiguous blocks of data, to the next mysql statement.
[17 Jul 2004 23:50]
Guilhem Bichot
Testing now, with an InnoDB table of the suggested structure, loading 14771608 random rows in it... (takes time)
[18 Jul 2004 0:50]
Guilhem Bichot
Hi Dathan. So I generated a file with 14771608 rows, and loaded it in an InnoDB table having the same structure as your 'photos', on the master. It took quite a while, and finally the _master_ crashed when it was almost done with the LOAD DATA; the file to load was 470 MB and when the master crashed, the slave's temporary file (for replication of LOAD DATA INFILE) was 435 MB. Master said this (it's long to read, but interesting): after a while of LOAD DATA it started to print: 040718 0:22:27 InnoDB: WARNING: over 4 / 5 of the buffer pool is occupied by InnoDB: lock heaps or the adaptive hash index! Check that your InnoDB: transactions do not set too many row locks. InnoDB: Your buffer pool size is 16 MB. Maybe you should make InnoDB: the buffer pool bigger? InnoDB: Starting the InnoDB Monitor to print diagnostics, including InnoDB: lock heap and hash index sizes. ===================================== 040718 0:22:41 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 16 seconds ---------- SEMAPHORES ---------- <I cut, there are many rows> ---------------------------- END OF INNODB MONITOR OUTPUT ============================ 040718 0:37:38 InnoDB: ERROR: over 9 / 10 of the buffer pool is occupied by InnoDB: lock heaps or the adaptive hash index! Check that your InnoDB: transactions do not set too many row locks. InnoDB: Your buffer pool size is 16 MB. Maybe you should make InnoDB: the buffer pool bigger? InnoDB: We intentionally generate a seg fault to print a stack trace InnoDB: on Linux! 040718 0:37:38InnoDB: Assertion failure in thread 196621 in file buf0lru.c line 238 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. See section 6.1 of InnoDB: http://www.innodb.com/ibman.php about forcing recovery. InnoDB: Thread 81926 stopped in file os0file.c line 2904 mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=8388600 read_buffer_size=131072 max_used_connections=1 max_connections=100 threads_connected=2 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 225791 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd=0x8a3e520 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Cannot determine thread, fp=0x44009138, backtrace may not be correct. Stack range sanity check OK, backtrace follows: 0x81124b8 0x4002d4ec 0x829dd5e 0x829947d 0x8299e6d 0x829ff5e 0x82a02af 0x82997d4 0x827fd25 0x8260e34 0x82612b8 0x82613e7 0x826155a 0x8261667 0x8262b4f 0x8194008 0x8156556 0x81a4048 0x81a377f 0x8123d5e 0x8125a54 0x81209f8 0x81203c1 0x811f993 0x40028600 0x40308a37 New value of fp=(nil) failed sanity check, terminating stack trace! Please read http://www.mysql.com/doc/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved stack trace is much more helpful in diagnosing the problem, so please do resolve it Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at 0x8a84090 = LOAD DATA INFILE "/tmp/photos.txt" INTO TABLE photos thd->thread_id=3 The manual page at http://www.mysql.com/doc/en/Crashing.html contains information that should help you find out what is causing the crash. Sure, my buffer pool is 16MB, whereas your slave has 1.7 GB. But it shows that your slave's crash *could* have been caused by an assertion failure in InnoDB. Please, are you sure the error log snippet you posted is complete? It looks unusually short (it does not show a stack trace, which is usually the case when SIGSEGV). And which package do you use (standard, max, etc, built by us?) ? Thanks for any precision you could bring. Guilhem
[19 Jul 2004 20:23]
Dathan Pattishall
This is a package built by mysql. This was all that I got out of the error log. What I can do is try to duplcate the problem and enable core. But, that is not likly to happen since I ran out of servers to test on.
[7 Aug 2004 14:11]
Guilhem Bichot
Can't repeat, unfortunately. If this happens again, please mark it here. It may also be a good idea to upgrade to 4.0.20.