Bug #53499 purge thread is active during shutdown, assert buf/buf0buf.c line 4115
Submitted: 7 May 2010 16:13 Modified: 24 Jun 2010 21:08
Reporter: Mikhail Izioumtchenko
Status: Closed
Category: MySQL Server: InnoDB Plugin storage engine Severity: S2 (Serious)
Version: mysql-trunk-innodb OS: Any
Assigned to: Sunny Bains CPU Architecture: Any

[7 May 2010 16:13] Mikhail Izioumtchenko
Description:
Start mysqld, start a workload, kill mysqld, restart mysqld with innodb-purge-threads=1, then do a normal shutdown at once. The shutdown asserts in buf/buf0buf.c line 4115.
Investigation showed that the asserted condition holds in the core dump,
so there must be a race; indeed, the purge thread was active at that moment.

How to repeat:
See the description. This sequence of events occurs very rarely in my tests
at the moment, so I'm not sure about the reproducibility.

Suggested fix:
quiesce the purge thread before the shutdown somehow
[8 May 2010 0:42] Sunny Bains
The bug occurs because the function logs_empty_and_mark_files_at_shutdown() only waits
for the state of the master thread and does not check the purge thread. The fix is to
check the state of the purge sub-system too.
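
To illustrate the difference between the two checks, here is a minimal model in C. It is only a sketch under assumptions: the names (thread_type, server_state, quiesced_master_only, quiesced_all) are hypothetical and are not the InnoDB functions or data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical thread classes, loosely modelled on the master and
purge threads. These are NOT the real InnoDB definitions. */
enum thread_type { THREAD_MASTER, THREAD_PURGE, THREAD_TYPE_COUNT };

/* Number of threads of each class that are running (not suspended). */
struct server_state {
        int n_active[THREAD_TYPE_COUNT];
};

/* The incomplete check: only the master thread is examined. */
static bool quiesced_master_only(const struct server_state *s)
{
        return s->n_active[THREAD_MASTER] == 0;
}

/* The check the fix asks for: every thread class must be idle. */
static bool quiesced_all(const struct server_state *s)
{
        for (int t = 0; t < THREAD_TYPE_COUNT; t++) {
                if (s->n_active[t] != 0) {
                        return false;
                }
        }
        return true;
}
```

With the master idle and one purge thread still running, quiesced_master_only() reports that shutdown may proceed while quiesced_all() does not, which is the window in which the assert can fire.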

-sunny
[10 May 2010 19:11] Mikhail Izioumtchenko
The fix is incomplete: I've seen the shutdown hang with innodb-purge-threads=1.

 #1  0x00000000009375fa in os_thread_sleep (tm=100000)
     at /spare2/mizioumt/ctc/mysql_src_c55/storage/innobase/os/os0thread.c:286
 #2  0x0000000000921b4b in logs_empty_and_mark_files_at_shutdown ()
     at /spare2/mizioumt/ctc/mysql_src_c55/storage/innobase/log/log0log.c:3091
 #3  0x0000000000881081 in innobase_shutdown_for_mysql ()
     at /spare2/mizioumt/ctc/mysql_src_c55/storage/innobase/srv/srv0start.c:1969

It loops endlessly there because the SRV_WORKER thread is not suspended.

Indeed, srv_purge_thread() just exits on shutdown without suspending itself:

                /* If there are very few records to purge or the last
                purge didn't purge any records then wait for activity.
                We peek at the history len without holding any mutex
                because in the worst case we will end up waiting for
                the next purge event. */
                if (trx_sys->rseg_history_len < srv_purge_batch_size
                    || n_total_purged == 0) {

                        os_event_t      event;

                        event = srv_suspend_thread();

                        os_event_wait(event);
                }

                /* Check for shutdown and whether we should do purge at all. */
                if (srv_force_recovery >= SRV_FORCE_NO_BACKGROUND
                    || srv_shutdown_state != 0
                    || srv_fast_shutdown) {

                        break;
                }
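
Why the hang follows from exiting without suspension can be shown with a small single-threaded model. This is a sketch under assumptions: worker_slots, worker_suspend, the two exit functions and shutdown_wait are hypothetical names, and the poll is bounded only so that the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping: how many worker threads count as active. */
struct worker_slots {
        int n_active;
};

/* Mark the calling worker as suspended, in the spirit of
srv_suspend_thread(). */
static void worker_suspend(struct worker_slots *w)
{
        assert(w->n_active > 0);
        w->n_active--;
}

/* Exit path as in the excerpt: break out of the loop and return
while still being counted as active. */
static void worker_exit_without_suspend(struct worker_slots *w)
{
        (void) w;       /* nothing updates the bookkeeping */
}

/* Assumed remedy: suspend first, then return. */
static void worker_exit_with_suspend(struct worker_slots *w)
{
        worker_suspend(w);
}

/* Shutdown poll. Returns true if the workers went idle within
max_polls attempts. */
static bool shutdown_wait(const struct worker_slots *w, int max_polls)
{
        for (int i = 0; i < max_polls; i++) {
                if (w->n_active == 0) {
                        return true;
                }
                /* the real loop sleeps here (os_thread_sleep) */
        }
        return false;
}
```

In this model the poll never succeeds after worker_exit_without_suspend(), however many times it retries, and succeeds on the first attempt once the worker has gone through worker_exit_with_suspend().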
[11 May 2010 3:02] Jimmy Yang
Ok to push.
[12 May 2010 18:39] Mikhail Izioumtchenko
There's still a minor glitch in the fix: srv_suspend_thread() should be called while holding the kernel mutex, but it is not (mysql-trunk-innodb r3092),
so the ut_ad assert in srv_suspend_thread() fires when UNIV_DEBUG is on.
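
The precondition can be modelled as follows. This is a sketch under assumptions, not the InnoDB code: struct kernel and all function names here are hypothetical, and the ownership flag stands in for a real mutex so that the check can be shown without threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state: an ownership flag in place of the kernel mutex,
plus the count of active threads that the mutex protects. */
struct kernel {
        bool mutex_held;
        int n_active;
};

static void kernel_mutex_enter(struct kernel *k)
{
        assert(!k->mutex_held);
        k->mutex_held = true;
}

static void kernel_mutex_exit(struct kernel *k)
{
        assert(k->mutex_held);
        k->mutex_held = false;
}

/* Suspend the caller. Returns false where a debug build would
assert, i.e. when the caller does not hold the mutex. */
static bool suspend_thread_checked(struct kernel *k)
{
        if (!k->mutex_held) {
                return false;
        }
        k->n_active--;
        return true;
}

/* The calling convention being asked for: take the mutex, suspend,
release the mutex. */
static bool suspend_with_mutex(struct kernel *k)
{
        kernel_mutex_enter(k);
        bool ok = suspend_thread_checked(k);
        kernel_mutex_exit(k);
        return ok;
}
```

Calling suspend_thread_checked() directly fails the check and leaves the count untouched, while suspend_with_mutex() passes it and releases the mutex afterwards.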
[15 Jun 2010 8:09] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:marko.makela@oracle.com-20100511104500-c6kzd0bg5s42p8e9) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:25] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:marko.makela@oracle.com-20100511104500-c6kzd0bg5s42p8e9) (pib:16)