Bug #21468 | InnoDB crash during recovery with corrupted data pages: XA bug? | ||
---|---|---|---|
Submitted: | 7 Aug 2006 3:19 | Modified: | 18 Jun 2010 12:47 |
Reporter: | Arjen Lentz | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.0.20,5.0.21,5.0.24,5.0.26 | OS: | Linux (Linux) |
Assigned to: | Heikki Tuuri | CPU Architecture: | Any |
Tags: | backport_050030SP1 |
[7 Aug 2006 3:19]
Arjen Lentz
[25 Sep 2006 14:37]
Heikki Tuuri
log0log.c in 5.0.25: /************************************************************************ Checks that there is enough free space in the log to start a new query step. Flushes the log buffer or makes a new checkpoint if necessary. NOTE: this function may only be called if the calling thread owns no synchronization objects! */ void log_check_margins(void) /*===================*/ { loop: log_flush_margin(); log_checkpoint_margin(); #ifdef UNIV_LOG_ARCHIVE log_archive_margin(); #endif /* UNIV_LOG_ARCHIVE */ mutex_enter(&(log_sys->mutex)); if (log_sys->check_flush_or_checkpoint) { mutex_exit(&(log_sys->mutex)); goto loop; } mutex_exit(&(log_sys->mutex)); }
[25 Sep 2006 14:39]
Heikki Tuuri
/******************************************************************** Tries to establish a big enough margin of free space in the log buffer, such that a new log entry can be catenated without an immediate need for a flush. */ static void log_flush_margin(void) /*==================*/ { ibool do_flush = FALSE; log_t* log = log_sys; dulint lsn; mutex_enter(&(log->mutex)); if (log->buf_free > log->max_buf_free) { if (log->n_pending_writes > 0) { /* A flush is running: hope that it will provide enough free space */ } else { do_flush = TRUE; lsn = log->lsn; } } mutex_exit(&(log->mutex)); if (do_flush) { log_write_up_to(lsn, LOG_NO_WAIT, FALSE); } } /******************************************************************** Tries to establish a big enough margin of free space in the log groups, such that a new log entry can be catenated without an immediate need for a checkpoint. NOTE: this function may only be called if the calling thread owns no synchronization objects! */ static void log_checkpoint_margin(void) /*=======================*/ { log_t* log = log_sys; ulint age; ulint checkpoint_age; ulint advance; dulint oldest_lsn; ibool sync; ibool checkpoint_sync; ibool do_checkpoint; ibool success; loop: sync = FALSE; checkpoint_sync = FALSE; do_checkpoint = FALSE; mutex_enter(&(log->mutex)); if (log->check_flush_or_checkpoint == FALSE) { mutex_exit(&(log->mutex)); return; } oldest_lsn = log_buf_pool_get_oldest_modification(); age = ut_dulint_minus(log->lsn, oldest_lsn); if (age > log->max_modified_age_sync) { /* A flush is urgent: we have to do a synchronous preflush */ sync = TRUE; advance = 2 * (age - log->max_modified_age_sync); } else if (age > log->max_modified_age_async) { /* A flush is not urgent: we do an asynchronous preflush */ advance = age - log->max_modified_age_async; } else { advance = 0; } checkpoint_age = ut_dulint_minus(log->lsn, log->last_checkpoint_lsn); ...
[25 Sep 2006 14:49]
Heikki Tuuri
Arjen, I am sorry for the delay in replying. I do not see how it can crash in that function. At the time of the rollback, all log data strcutures should already be initialized. Please show the START of the .err log. It can give us a hint why mysqld crashed in the first place. Regards, Heikki
[22 Oct 2006 20:38]
Mark Leith
Verifying after receiving this after starting MySQL on an internal instance, after the disk space had filled because of a 2.5Gb general query log. o rm'd the general query log o logged in to mysql and issued FLUSH LOGS (which hung) o quit, and restarted MySQL Got the following: 061022 22:25:37 mysqld started 061022 22:25:37 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... 061022 22:25:37 InnoDB: Starting log scan based on checkpoint at InnoDB: log sequence number 0 1461296194. InnoDB: Doing recovery: scanned up to log sequence number 0 1461296598 InnoDB: Transaction 0 13217510 was in the XA prepared state. InnoDB: 4 transaction(s) which must be rolled back or cleaned up InnoDB: in total 3 row operations to undo InnoDB: Trx id counter is 0 13218048 061022 22:25:37 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: Last MySQL binlog file position 0 772403203, file name /opt/mysqlnetwork-0.7.3/mysql/data/mysql-network.000006 InnoDB: Starting in background the rollback of uncommitted transactions 061022 22:25:37 InnoDB: Rolling back trx with id 0 13217548, 1 rows to undo 061022 22:25:37 InnoDB: Started; log sequence number 0 1461296598 061022 22:25:37 [Note] Recovering after a crash using /opt/mysqlnetwork-0.7.3/mysql/data/mysql-network 061022 22:25:37 [Note] Starting crash recovery... 061022 22:25:37 InnoDB: Starting recovery for XA transactions... 061022 22:25:37 InnoDB: Transaction 0 13217510 in prepared state after recovery 061022 22:25:37 InnoDB: Transaction contains changes to 1 rows 061022 22:25:37 InnoDB: 1 transactions in prepared state after recovery 061022 22:25:37 [Note] Found 1 prepared transaction(s) in InnoDB mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=0 read_buffer_size=131072 max_used_connections=0 max_connections=100 threads_connected=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 217599 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd=(nil) Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Cannot determine thread, fp=0xbffb054c, backtrace may not be correct. Stack range sanity check OK, backtrace follows: 0x8177318 0x8687c8 0x8292c1a 0x8292c1a 0x82922a5 0x82e2de6 0x82e53a0 0x821d5ec 0x8214b85 0x81e7c05 0x81e793c 0x8179918 0x8177825 0x708e23 0x81014c1 New value of fp=(nil) failed sanity check, terminating stack trace! Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved stack trace is much more helpful in diagnosing the problem, so please do resolve it The manual page at http://www.mysql.com/doc/en/Crashing.html contains information that should help you find out what is causing the crash. 061022 22:25:37 mysqld ended [root@net-sup1 data]# ../bin/resolve_stack_dump -s /tmp/mysqld.sym -n /tmp/mysqld.stack 0x8177318 handle_segfault + 356 0x8687c8 (?) 0x8292c1a que_thr_step + 446 0x8292c1a que_thr_step + 446 0x82922a5 que_run_threads + 45 0x82e2de6 trx_general_rollback_for_mysql + 158 0x82e53a0 trx_rollback_for_mysql + 32 0x821d5ec _Z24innobase_rollback_by_xidP5xid_t + 68 0x8214b85 _Z10ha_recoverP7st_hash + 453 0x81e7c05 _ZN9MYSQL_LOG7recoverEP11st_io_cacheP28Format_description_log_event + 257 0x81e793c _ZN9MYSQL_LOG4openEPKc + 744 0x8179918 _Z22init_server_componentsv + 516 0x8177825 main + 417 0x708e23 (?) 0x81014c1 _start + 33
[23 Oct 2006 13:53]
Mark Leith
Hi Heikki, OK, so I grabbed the 5.0.24a "standard Linux" debug binary, and started that against the crashing datadir, and guess what... It "fixed" the issue. [root@net-sup1 bin]# gdb mysqld GNU gdb Red Hat Linux (6.3.0.0-1.63rh) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1". (gdb) set args --datadir=/opt/mysqlnetwork-0.7.3/mysql/data/ --user=mysql --basedir=/tmp/mysql-debug-5.0.24a-linux-i686-glibc23/ (gdb) run Starting program: /tmp/mysql-debug-5.0.24a-linux-i686-glibc23/bin/mysqld --datadir=/opt/mysqlnetwork-0.7.3/mysql/data/ --user=mysql --basedir=/tmp/mysql-debug-5.0.24a-linux-i686-glibc23/ [Thread debugging using libthread_db enabled] [New Thread -1208043840 (LWP 30450)] [New Thread -1219683408 (LWP 30453)] [New Thread -1230173264 (LWP 30454)] [New Thread -1240663120 (LWP 30455)] [New Thread -1251152976 (LWP 30456)] 061023 15:39:36 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... 061023 15:39:37 InnoDB: Starting log scan based on checkpoint at InnoDB: log sequence number 0 1461296194. InnoDB: Doing recovery: scanned up to log sequence number 0 1461296598 InnoDB: Transaction 0 13217510 was in the XA prepared state. InnoDB: 4 transaction(s) which must be rolled back or cleaned up InnoDB: in total 3 row operations to undo InnoDB: Trx id counter is 0 13218048 061023 15:39:37 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: Last MySQL binlog file position 0 772403203, file name /opt/mysqlnetwork-0.7.3/mysql/data/mysql-network.000006 [New Thread -1266422864 (LWP 30457)] InnoDB: Starting in background the rollback of uncommitted transactions 061023 15:39:38 InnoDB: Rolling back trx with id 0 13217548, 1 rows to undo [New Thread -1276912720 (LWP 30458)] [New Thread -1287402576 (LWP 30459)] [New Thread -1297892432 (LWP 30460)] 061023 15:39:38 InnoDB: Started; log sequence number 0 1461296598 061023 15:39:38 InnoDB: Starting recovery for XA transactions... 061023 15:39:38 InnoDB: Transaction 0 13217510 in prepared state after recovery 061023 15:39:38 InnoDB: Transaction contains changes to 1 rows 061023 15:39:38 InnoDB: 1 transactions in prepared state after recovery 061023 15:39:38 [Note] Found 1 prepared transaction(s) in InnoDB 061023 15:39:38 [Note] ignore xid 'MySQLXid\1\0\0\0\37bc\2\0\0\0\0' 061023 15:39:38 [Warning] Found 1 prepared XA transactions [New Thread -1316774992 (LWP 30461)] InnoDB: Rolling back of trx id 0 13217548 completed 061023 15:39:38 InnoDB: Rolling back trx with id 0 13217518, 1 rows to undo 061023 15:39:38 [Note] /tmp/mysql-debug-5.0.24a-linux-i686-glibc23/bin/mysqld: ready for connections. Version: '5.0.24a-debug' socket: '/tmp/mysql.sock' port: 3306 MySQL Community Edition - Debug (GPL) InnoDB: Rolling back of trx id 0 13217518 completed 061023 15:39:38 InnoDB: Rolling back trx with id 0 13217517, 1 rows to undo InnoDB: Rolling back of trx id 0 13217517 completed 061023 15:39:38 InnoDB: Rollback of non-prepared transactions completed [Thread -1266422864 (zombie) exited] [New Thread -1316975696 (LWP 30466)] [New Thread -1317176400 (LWP 30520)] [Thread -1317176400 (LWP 30520) exited] [New Thread -1317176400 (LWP 30522)] [Thread -1317176400 (LWP 30522) exited] [New Thread -1317377104 (LWP 30523)] 061023 15:41:51 [Note] /tmp/mysql-debug-5.0.24a-linux-i686-glibc23/bin/mysqld: Normal shutdown Program received signal SIGUSR1, User defined signal 1. [Switching to Thread -1208043840 (LWP 30450)] 0x006db7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 (gdb) (gdb) quit The program is running. Exit anyway? (y or n) y [root@net-sup1 bin]# Now I can start the server "normally" without any issues. I'm going to leave this server running with the general query log going again - in the hopes that it fills the disk again and recreates this bug (again). I hate when debug binary fix issues that you are trying to debug further :)
[28 Oct 2006 11:56]
MySQL Verification Team
seen after mysql died due to out of disk space. then cleaned up, and saw this crash
Attachment: crash_on_startup_after_freeing_disk_space.txt (plain/text, text), 3.93 KiB.
[28 Oct 2006 12:00]
MySQL Verification Team
after starting mysql-5.0-BK -debug and specifying the datadir of the 5.0.26, it worked fine (as Leith confirmed). 061028 14:03:13 InnoDB: Rollback of non-prepared transactions completed 061028 14:03:13 InnoDB: Started; log sequence number 17 1586165220 061028 14:03:13 InnoDB: Starting recovery for XA transactions... 061028 14:03:13 InnoDB: Transaction 0 12713584 in prepared state after recovery 061028 14:03:13 InnoDB: Transaction contains changes to 1 rows 061028 14:03:13 InnoDB: 1 transactions in prepared state after recovery 061028 14:03:13 [Note] Found 1 prepared transaction(s) in InnoDB 061028 14:03:13 [Note] rollback xid 'MySQLXid\1\0\0\0\354\0\0\0\0\0\0\0' 061028 14:03:13 [Note] /home/sbester/server/5.0/mysql-5.0-bk/bin/mysqld: ready for connections. Version: '5.0.28-debug-log' socket: '/tmp/mysql.sock' port: 3306 yes
[8 Nov 2006 15:24]
Heikki Tuuri
All, maybe the reason why it did not crash in the second try was that MySQL 'ignored' a prepared transaction. 061023 15:39:38 InnoDB: Transaction 0 13217510 in prepared state after recovery 061023 15:39:38 InnoDB: Transaction contains changes to 1 rows 061023 15:39:38 InnoDB: 1 transactions in prepared state after recovery 061023 15:39:38 [Note] Found 1 prepared transaction(s) in InnoDB 061023 15:39:38 [Note] ignore xid 'MySQLXid\1\0\0\0\37bc\2\0\0\0\0' This might be a bug in XA/InnoDB code. Regards, Heikki
[23 Nov 2006 11:41]
Heikki Tuuri
Probably the same bug as http://bugs.mysql.com/bug.php?id=23843
[23 Nov 2006 12:15]
Heikki Tuuri
This bug is probably the same as in the other bug report. No InnoDB 'session' object when we roll back a prepared transaction in XA recovery.
[23 Nov 2006 12:40]
Heikki Tuuri
Now it is able to roll back a prepared transaction in XA recovery. heikki@127:~/mysql-5.0/sql$ gdb mysqld GNU gdb 6.3-debian Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1". (gdb) run Starting program: /home/heikki/mysql-5.0/sql/mysqld [Thread debugging using libthread_db enabled] [New Thread 1075726944 (LWP 14162)] 061123 14:39:43 [Warning] Changed limits: max_open_files: 1024 max_connections: 886 table_cache: 64 [New Thread 1129249712 (LWP 14165)] [New Thread 1137638320 (LWP 14166)] [New Thread 1146026928 (LWP 14167)] [New Thread 1154415536 (LWP 14168)] 061123 14:39:44 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... 061123 14:39:44 InnoDB: Starting log scan based on checkpoint at InnoDB: log sequence number 0 189019. InnoDB: Doing recovery: scanned up to log sequence number 0 189019 InnoDB: Transaction 0 2653 was in the XA prepared state. InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 0 row operations to undo InnoDB: Trx id counter is 0 3072 InnoDB: Last MySQL binlog file position 0 15099, file name ./1.000001 [New Thread 1167584176 (LWP 14169)] [New Thread 1175972784 (LWP 14170)] [New Thread 1184361392 (LWP 14171)] [New Thread 1192750000 (LWP 14172)] 061123 14:39:44 InnoDB: Started; log sequence number 0 189019 InnoDB: Starting in background the rollback of uncommitted transactions 061123 14:39:44 InnoDB: Rollback of non-prepared transactions completed 061123 14:39:44 InnoDB: Starting recovery for XA transactions... 061123 14:39:44 InnoDB: Transaction 0 2653 in prepared state after recovery 061123 14:39:44 InnoDB: Transaction contains changes to 1 rows 061123 14:39:44 InnoDB: 1 transactions in prepared state after recovery 061123 14:39:44 [Note] Found 1 prepared transaction(s) in InnoDB 061123 14:39:44 [Note] rollback xid 'MySQLXid\1\0\0\0]\0\0\0\0\0\0\0' [Thread 1167584176 (zombie) exited] [New Thread 1216601008 (LWP 14173)] [New Thread 1216797616 (LWP 14174)] 061123 14:39:44 [Note] /home/heikki/mysql-5.0/sql/mysqld: ready for connections. Version: '5.0.32-debug-log' socket: '/home/heikki/bugsocket' port: 3307 Source distribution
[23 Nov 2006 13:57]
Heikki Tuuri
A patch over 5.0.32: heikki@127:~/mysql-5.0/innodb$ svn diff -r 1043:1050 Index: trx/trx0roll.c =================================================================== --- trx/trx0roll.c (revision 1043) +++ trx/trx0roll.c (revision 1050) @@ -129,9 +129,23 @@ } trx->op_info = "rollback"; + + /* If we are doing the XA recovery of prepared transactions, then + the transaction object does not have an InnoDB session object, and we + set a dummy session that we use for all MySQL transactions. */ + if (trx->sess == NULL) { + /* Open a dummy session */ + + if (!trx_dummy_sess) { + trx_dummy_sess = sess_open(); + } + + trx->sess = trx_dummy_sess; + } + err = trx_general_rollback_for_mysql(trx, FALSE, NULL); - + trx->op_info = ""; return(err); Index: trx/trx0trx.c =================================================================== --- trx/trx0trx.c (revision 1043) +++ trx/trx0trx.c (revision 1050) @@ -1601,7 +1601,21 @@ ut_a(trx); trx->op_info = "committing"; + + /* If we are doing the XA recovery of prepared transactions, then + the transaction object does not have an InnoDB session object, and we + set the dummy session that we use for all MySQL transactions. */ + if (trx->sess == NULL) { + /* Open a dummy session */ + + if (!trx_dummy_sess) { + trx_dummy_sess = sess_open(); + } + + trx->sess = trx_dummy_sess; + } + trx_start_if_not_started(trx); mutex_enter(&kernel_mutex);
[24 Nov 2006 4:49]
Dag Stenstad
I hit this bug on 5.0.22 (Debian Sarge, Backports.org, 5.0.22-1bpo1). I tried 5.0.24 and 5.0.27 without success. Compiling 5.0.27 with this patch repaired the problem, and I was able to use 5.0.22 again.
[29 Nov 2006 13:14]
Heikki Tuuri
The patches should actually be wrapped in mutex_enter(&kernel_mutex) ... mutex_exit(&kernel_mutex). But they will work also without that. This is now fixed in latest Innobase Oy 5.0 and 5.1 svn trees.
[20 Dec 2006 10:31]
Joerg Bruehe
Fix is included in the 5.0.32 release build already, but not yet merged to main trees - please reset status after mentioning it for 5.0.32.
[21 Dec 2006 14:00]
Joerg Bruehe
Patch is now pushed to main 5.0 tree and to 5.1, will be in 5.1.15.
[21 Dec 2006 19:33]
Paul DuBois
Noted in 5.0.32, 5.1.15 changelogs. InnoDB crashed while performing XA recovery of prepared transactions.
[5 May 2010 15:10]
Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 2:24]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 6:01]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:30]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:57]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:49]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[17 Jun 2010 12:04]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:48]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:31]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)