Bug #60720 Mysqld crashes during rollback of uncommited transactions
Submitted: 1 Apr 2011 9:26 Modified: 1 May 2011 11:06
Reporter: Ingar Pedersen Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.1.49 OS:Windows (server 2003)
Assigned to: CPU Architecture:Any
Tags: crash, failing assertion, rollback

[1 Apr 2011 9:26] Ingar Pedersen
Description:
This morning our MySql server had crashed for unknown reason. When trying to restart the service it continued to crash. The error log attached below.

Tried starting with innodb_force_recovery=4 and 3, and it started fine, but with =2 the crash came back (this is the log below).

We have run hardware diagnostics etc and found no problem whatsoever.

errorlog:

110401 11:13:00 [Note] Plugin 'FEDERATED' is disabled.
InnoDB: Log scan progressed past the checkpoint lsn 273 3840732943
110401 11:13:01  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 273 3845975552
InnoDB: Doing recovery: scanned up to log sequence number 273 3846852649
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo
InnoDB: Trx id counter is 7 3419280128
110401 11:13:04  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 1059994, file name .\KILLKENNY-bin.000001
InnoDB: Starting in background the rollback of uncommitted transactions
110401 11:13:05  InnoDB: Rolling back trx with id 7 3419279839, 1 rows to undo
110401 11:13:05  InnoDB: Assertion failure in thread 4580 in file .\fsp\fsp0fsp.c line 1593
InnoDB: Failing assertion: frag_n_used > 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html
InnoDB: about forcing recovery.
110401 11:13:05 - mysqld got exception 0xc0000005 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=135266304
read_buffer_size=65536
max_used_connections=0
max_threads=800
threads_connected=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 393146 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
005DED2B    mysqld.exe!fsp_free_page()[fsp0fsp.c:1593]
005DF679    mysqld.exe!fseg_free_page_low()[fsp0fsp.c:3025]
005DF7DD    mysqld.exe!fseg_free_page()[fsp0fsp.c:3116]
00611ECC    mysqld.exe!btr_page_free_low()[btr0btr.c:466]
00611EFE    mysqld.exe!btr_page_free()[btr0btr.c:486]
00613991    mysqld.exe!btr_compress()[btr0btr.c:2210]
005D85AF    mysqld.exe!btr_cur_compress_if_useful()[btr0cur.c:2426]
005DAD80    mysqld.exe!btr_cur_pessimistic_delete()[btr0cur.c:2639]
00626F49    mysqld.exe!row_undo_ins_remove_clust_rec()[row0uins.c:89]
0062724B    mysqld.exe!row_undo_ins()[row0uins.c:305]
0061DE08    mysqld.exe!row_undo()[row0undo.c:280]
0061DE7F    mysqld.exe!row_undo_step()[row0undo.c:327]
005ED385    mysqld.exe!que_thr_step()[que0que.c:1254]
005ED953    mysqld.exe!que_run_threads_low()[que0que.c:1319]
005EDACA    mysqld.exe!que_run_threads()[que0que.c:1357]
005C6AFF    mysqld.exe!trx_rollback_or_clean_all_without_sess()[trx0roll.c:540]
7C82FB24    ntdll.dll!CsrNewThread()
3B3D0000    
3BD10304    
1C68009D    
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
InnoDB: Thread 3320 stopped in file .\os\os0sync.c line 574
InnoDB: Thread 7420 stopped in file .\os\os0sync.c line 271

How to repeat:
Repeats every time on our production server, but not on the development.
[1 Apr 2011 10:01] Valeriy Kravchuk
I do not know why server crashed initially (upload entire error log if you want), but this crash happened because of assertion failure in the fsp_free_page() function. The page in question can not be freed as its internal structure is corrupted. Thus you have to do forced recovery.

The original reason for the problem can be memory corruption or disk corruption. It can be a bug also.
[1 Apr 2011 10:56] Ingar Pedersen
We also concluded that the original reason was a corruption of some sort, however it now also crashes during the recovery so it is impossible to get the database running again. How do we recover from the forced recovery that crash the server?
[1 Apr 2011 11:06] Valeriy Kravchuk
You had written:

"Tried starting with innodb_force_recovery=4 and 3, and it started fine"

So, to recover start with the smallest value of innodb_force_recovery, dump all your data and then re-initiate the instance and load data back. 

If this does not work and you can not make a dump of current data, you need some backup. What else?
[1 May 2011 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".