| Bug #31459 | Random crashes (stack trace inside) | ||
|---|---|---|---|
| Submitted: | 8 Oct 2007 17:19 | Modified: | 13 Sep 2008 7:52 |
| Reporter: | Samuel Vogel | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Server: General | Severity: | S1 (Critical) |
| Version: | 5.0.45 | OS: | Linux (2.6.18-4-686-bigmem) |
| Assigned to: | | CPU Architecture: | Any |
[8 Oct 2007 17:19]
Samuel Vogel
[8 Oct 2007 20:25]
Samuel Vogel
There is a typo in my submission. We did not upgrade to 5.0.24, but to 5.0.45 of course. One thing I do notice is that the crashes do not occur at all when I run mysqld with "--debug". It also seems to me that the slave startup on mysqld start does not work reliably. One time I start mysqld the slave starts fine, the next time it complains about not finding the relay_log. Strange.
[10 Oct 2007 9:55]
Samuel Vogel
I tried something different: This Server was acting as a slave before, so I removed all the Slave stuff from it. Also mydns DNS Server was running, but I relocated it to a different server. Unfortunately I still get the same behavior:
# grep ' mysqld' /var/log/syslog
Oct 10 07:06:29 h1314631 mysqld[29908]: 071010 7:06:29 [ERROR] /usr/sbin/mysqld: Table './blackcrawer@1-db/nuke_nsnst_config' is marked as crashed and last (automatic?) repair failed
Oct 10 07:06:29 h1314631 mysqld[29908]: 071010 7:06:29 [ERROR] /usr/sbin/mysqld: Table './blackcrawer@1-db/nuke_nsnst_config' is marked as crashed and last (automatic?) repair failed
Oct 10 07:06:34 h1314631 mysqld[29908]: 071010 7:06:34 [ERROR] /usr/sbin/mysqld: Table './blackcrawer@1-db/nuke_nsnst_config' is marked as crashed and last (automatic?) repair failed
Oct 10 07:06:34 h1314631 mysqld[29908]: 071010 7:06:34 [ERROR] /usr/sbin/mysqld: Table './blackcrawer@1-db/nuke_nsnst_config' is marked as crashed and last (automatic?) repair failed
Oct 10 11:06:51 h1314631 mysqld_safe[13279]: Number of processes running now: 0
Oct 10 11:06:51 h1314631 mysqld_safe[13281]: restarted
Oct 10 11:06:51 h1314631 mysqld[13284]: 071010 11:06:51 [Warning] The syntax for replication startup options is deprecated and will be removed in MySQL 5.2. Please use 'CHANGE MASTER' instead.
Oct 10 11:06:51 h1314631 mysqld[13284]: 071010 11:06:51 InnoDB: Database was not shut down normally!
Oct 10 11:06:51 h1314631 mysqld[13284]: InnoDB: Starting crash recovery.
Oct 10 11:06:51 h1314631 mysqld[13284]: InnoDB: Reading tablespace information from the .ibd files...
Oct 10 11:16:13 h1314631 mysqld[13284]: InnoDB: Restoring possible half-written data pages from the doublewrite
Oct 10 11:16:13 h1314631 mysqld[13284]: InnoDB: buffer...
Oct 10 11:16:14 h1314631 mysqld[13284]: 071010 11:16:14 InnoDB: Starting log scan based on checkpoint at
Oct 10 11:16:14 h1314631 mysqld[13284]: InnoDB: log sequence number 0 1371471155.
Oct 10 11:16:14 h1314631 mysqld[13284]: InnoDB: Doing recovery: scanned up to log sequence number 0 1371471155
Oct 10 11:16:15 h1314631 mysqld[13284]: 071010 11:16:15 InnoDB: Started; log sequence number 0 1371471155
Oct 10 11:16:16 h1314631 mysqld[13284]: 071010 11:16:16 [Note] /usr/sbin/mysqld: ready for connections.
Oct 10 11:16:16 h1314631 mysqld[13284]: Version: '5.0.45-Debian_1~bpo.1-debug' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Debian etch distribution
The issue doesn't appear at all when I run with --debug. What am I supposed to do?
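For anyone tracking similar restarts, here is a minimal Python sketch that measures how long each mysqld_safe restart takes to reach "ready for connections" in a syslog excerpt like the one above. The function names (`parse_line`, `restart_durations`) and the `year` default are assumptions made for illustration; syslog lines carry no year, and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches the syslog prefix seen in the excerpt, e.g.
# "Oct 10 11:06:51 h1314631 mysqld_safe[13281]: restarted"
SYSLOG_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2}) \S+ "
    r"(?P<proc>[\w.-]+)\[\d+\]: (?P<msg>.*)$"
)


def parse_line(line, year=2007):
    """Return (timestamp, process, message), or None if the line does not match."""
    m = SYSLOG_RE.match(line.strip())
    if m is None:
        return None
    # The year is an assumption supplied by the caller; syslog omits it.
    ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
    return ts, m.group("proc"), m.group("msg")


def restart_durations(lines, year=2007):
    """Return a list of (restart_time, seconds_until_ready) tuples."""
    results = []
    restart_at = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        ts, proc, msg = parsed
        if proc == "mysqld_safe" and msg == "restarted":
            restart_at = ts
        elif restart_at is not None and "ready for connections" in msg:
            results.append((restart_at, (ts - restart_at).total_seconds()))
            restart_at = None
    return results
```

Applied to the excerpt above, the restart at 11:06:51 reaches "ready for connections" at 11:16:16, which is 565 seconds, nearly all of it spent in InnoDB crash recovery.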
[10 Oct 2007 19:13]
Samuel Vogel
My assumption about the problem not appearing with "--debug" was wrong. It just crashed on me two times with debug. I gzipped the debug trace from the last output and it is available here: http://alpha.kilu.de/mysqld.trace.gz
This is how MySQLd was started:
/usr/sbin/mysqld --basedir=/usr --datadir=/data/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock --debug=d,info,error,query,general,where:O,/data/mysqld.trace --myisam-recover
I really hope somebody will acknowledge this problem!
[10 Oct 2007 19:14]
Samuel Vogel
Changed the name of the Bug.
[10 Oct 2007 20:26]
Samuel Vogel
Now I got a Signal 6 again:
Crash info: http://alpha.kilu.de/mysqlss.gz
Resolved Stack Trace:
0x81df50d handle_segfault + 797
0xb7fb6410 _end + -1350797504
0xb7d0ffb9 _end + -1353575703
0x854c807 safe_mutex_lock + 343
0x8271db0 _Z18mysql_print_statusv + 512
0x81dcf63 signal_hand + 595
0xb7f76240 _end + -1351060112
0xb7db14ae _end + -1352914978
The debug output: http://alpha.kilu.de/mysqld.trace.gz
If any more info is needed, please say so.
Regards,
Samy
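A note for readers of the resolved stack trace: `_Z18mysql_print_statusv` is a C++ mangled name. The sketch below decodes only the simplest form of the Itanium C++ ABI scheme (a single length-prefixed name followed by `v` for an empty parameter list). The helper name `demangle_simple` is hypothetical, and it is not a substitute for a real demangler such as `c++filt`.

```python
import re


def demangle_simple(symbol):
    """Decode names of the form _Z<len><name>v; return anything else unchanged."""
    m = re.fullmatch(r"_Z(\d+)(\w+)", symbol)
    if m is None:
        return symbol
    length = int(m.group(1))
    rest = m.group(2)
    # Exactly <length> characters of name, then a lone "v" (no parameters).
    if len(rest) == length + 1 and rest[length:] == "v":
        return rest[:length] + "()"
    return symbol
```

Under these assumptions the frame decodes to `mysql_print_status()`, while unmangled C symbols such as `safe_mutex_lock` pass through unchanged.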
[11 Oct 2007 11:07]
Samuel Vogel
I compiled MySQL Enterprise Edition 5.0.46. The crashes still appear, but I do notice that the InnoDB recovery is much faster with the Enterprise edition. As you can see in my logs, it used to take about 10 minutes, and the Enterprise edition does it in approx. 3 minutes. Does anybody here have any suggestions for me?
[12 Oct 2007 23:44]
Samuel Vogel
I can actually somewhat reproduce the crashes now. When I run mysqldump, the server crashes about one minute after mysqldump is started. It does this every single time. So it seems that the problem may somehow be load related, even though it is 2 am right now and not very many people are online at this time! I would really appreciate it if some dev would show up here ;)
[28 Oct 2007 13:29]
Valeriy Kravchuk
Thank you for a detailed problem report. Please, send the results of:
SHOW GLOBAL STATUS;
SHOW GLOBAL VARIABLES;
from your server, and the results of the following Linux commands:
uname -a
free
[28 Oct 2007 16:36]
Samuel Vogel
Hey,
Thanks for the reply, I will submit the requested information.
I have to admit, though, that the problem hardly appears at all anymore since I checked the slow query log and removed the user who was causing almost all of the slow queries by not using indexes, thereby driving MySQL up to almost 100% CPU usage. He was using all MyISAM tables, if that helps.
The output of the MySQL commands can be found in the files.
# uname -a
Linux h1314631 2.6.18-4-686-bigmem #1 SMP Thu May 10 00:23:00 UTC 2007 i686 GNU/Linux
# free -m
             total       used       free     shared    buffers     cached
Mem:          4058       3832        225          0          5       1979
-/+ buffers/cache:       1847       2210
Swap:         2055        155       1900
The Server does not ever go under the line of about 130M of free Ram. Not even during the peak times. It seems like this is some kind of magic barrier.
I guess this problem won't be an easy one to solve, because I do not have a reproducible test case.
Regards,
Samy
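As a reading aid for the `free -m` output above, here is a small Python sketch of how the `-/+ buffers/cache` row relates to the `Mem:` row. The function name `buffers_cache_row` is made up for illustration. It assumes the usual interpretation for this older `free` output format, where buffers and page cache are counted as reclaimable.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Return (used_by_applications, free_for_applications) in the input unit."""
    used_by_apps = used - buffers - cached
    free_for_apps = free + buffers + cached
    return used_by_apps, free_for_apps


# Values in MB taken from the Mem: row of the output above.
apps_used, apps_free = buffers_cache_row(used=3832, free=225, buffers=5, cached=1979)
```

With the values above this gives 1848 MB used and 2209 MB free, each 1 MB off the 1847 and 2210 that `free` printed, presumably because `free` rounds each figure separately. Read this way, roughly 2.2 GB was still available to applications when the snapshot was taken.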
[28 Oct 2007 16:37]
Samuel Vogel
output of the requested MySQL commands
Attachment: mysql output.txt (text/plain), 30.98 KiB.
[24 Jan 2008 12:26]
Sergey Kostyliov
Samuel, could you please say whether you used the `replicate-rewrite-db' option in your first "slave" setup? To me, your symptoms for the first setup look like our http://bugs.mysql.com/bug.php?id=30056
[24 Jan 2008 15:31]
Samuel Vogel
No, I did not use "replicate-rewrite-db" in my setup.
[17 Feb 2008 15:45]
Valeriy Kravchuk
Please, try to repeat the crashes with a newer version, 5.0.51a at least, and inform us about the results.
[21 Feb 2008 18:17]
Samuel Vogel
Unfortunately I do not work for the same project anymore. We did track down the problem, though: one of our users was running an ancestor research program which was doing extensive un-indexed joins. We shut him down and the problem almost disappeared. We could not find a solution for the remaining crashes (once a day or so).
[11 Aug 2008 17:03]
David Carr
Not sure whether this helps but the error starts to occur when the text gets to 2056 characters long (or thereabouts). I don't want to confuse the issue, but there's definitely something very strange occurring, because I can't recreate this behaviour on my old schema exactly. It's hard to say exactly when it works and when it doesn't, but it seems to depend on which client I create the procedure in and the character encoding of the schema, database and client. Sorry that I can't pin it down any more. Let me know if you need anything else. The previous example still holds for me though.
[13 Aug 2008 7:52]
Sveta Smirnova
David, thank you for the feedback.
> Not sure whether this helps but the error starts to occur when the text gets
> to 2056 characters long (or thereabouts)
Do you have the table definition and the query with which the error happens?
You say "it seems to depend on which client I create the procedure in and the character encoding of the schema, database and client." Do you have statistics on which clients (versions, first of all) and character encodings you use?
Also, please try the current version, 5.0.67, and check whether the problem still exists.
[13 Sep 2008 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
