Bug #25645 | Assertion failure in file srv0srv.c; adaptive hash hang in a complex query | ||
---|---|---|---|
Submitted: | 16 Jan 2007 10:19 | Modified: | 18 Jun 2010 23:01 |
Reporter: | Eric ten Hoeve | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.0.26 | OS: | Linux (linux Red Hat) |
Assigned to: | Vasil Dimov | CPU Architecture: | Any |
Tags: | assertion failure, innodb, Semaphore, srv0srv.c |
[16 Jan 2007 10:19]
Eric ten Hoeve
[16 Jan 2007 10:49]
Valeriy Kravchuk
Thank you for a problem report. It can be a duplicate of bug #19081. Please, send the results of: uname -a (to check is it 64-bit Linux) and your my.cnf file content.
[16 Jan 2007 12:01]
Eric ten Hoeve
uname -a: Linux dbhost 2.6.17-1.2139_FC5 #1 SMP Fri Jun 23 12:40:11 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux my.cnf: [client] port=3306 socket=/var/lib/mysql/mysql.sock [mysqld] #innodb_force_recovery = 4 skip-slave-start slave-skip-errors=1396 skip-bdb log-bin = /var/lib/mysql/mysql-bin.log log-error = /var/lib/mysql/mysql.err relay-log = /var/lib/mysql/mysql-relay.log log-slow-queries = /var/lib/mysql/mysql-slow-query.log pid-file = /var/lib/mysql/mysql.pid #binlog-do-db = dd ddtemp server-id = 1 port=3306 log-slow-queries log-warnings max_allowed_packet=32M default-character-set=latin1 thread_concurrency=4 max_connections=2000 max_user_connections=300 max_connect_errors=10000 interactive_timeout=3600 wait_timeout=1800 connect_timeout=120 skip-external-locking query-cache-type = 1 query-cache-size = 128M default-character-set = latin1 default_storage_engine = InnoDB read_buffer_size=8M sort_buffer_size=16M key_buffer_size=128M table_cache=2000 tmp_table_size=67108864 datadir=/var/lib/mysql innodb_data_file_path = ibdata1:100M:autoextend innodb_buffer_pool_size = 4000M innodb_additional_mem_pool_size=20M innodb_log_file_size=500M innodb_log_buffer_size=16M innodb_flush_log_at_trx_commit=0 innodb_flush_method=O_DSYNC innodb_file_per_table innodb_open_files=2048 innodb_data_home_dir=/var/lib/mysql innodb_log_group_home_dir=/var/lib/mysql innodb_log_files_in_group=2 innodb_lock_wait_timeout=120 [mysqld_safe] datadir=/var/lib/mysql open_files_limit=16384 [mysqldump] set-variable = max_allowed_packet=32M [mysqr]
[16 Jan 2007 13:11]
Heikki Tuuri
Hi! Please post the entire .err log, gzipped if it is large. This might be 'thread thrashing', which is fixed in 5.0.30. Regards, Heikki
[16 Jan 2007 13:44]
Eric ten Hoeve
Hello Heikki, attached is the gzipped err log Regards, Eric
[16 Jan 2007 15:53]
Heikki Tuuri
The deadlock comes in the call of this function: dict0dict.c in 5.0.23: /************************************************************************ Decrements the count of open MySQL handles to a table. */ void dict_table_decrement_handle_count( /*==============================*/ dict_table_t* table) /* in: table */ { mutex_enter(&(dict_sys->mutex)); ut_a(table->n_mysql_handles_opened > 0); table->n_mysql_handles_opened--; mutex_exit(&(dict_sys->mutex)); } It is probably the call row_prebuilt_free() below that causes the call to dict_table_decrement_handle_count(). ha_innodb.cc: /********************************************************************** Closes a handle to an InnoDB table. */ int ha_innobase::close(void) /*====================*/ /* out: 0 */ { DBUG_ENTER("ha_innobase::close"); row_prebuilt_free((row_prebuilt_t*) innobase_prebuilt); my_free((gptr) upd_buff, MYF(0)); free_share(share); /* Tell InnoDB server that there might be work for utility threads: */ srv_active_wake_master_thread(); DBUG_RETURN(0); } Hmm... we could add to the start of the above function: innobase_release_temporary_latches(current_thd); The join query is on 4 tables. But the printout says that MySQL has 5 tables in use. Maybe MySQL opens some table several times, and at the query end closes the table handle before calling ::external_lock() on the rest of tables: ::external_lock() in ha_innodb.cc: /* MySQL is releasing a table lock */ trx->n_mysql_tables_in_use--; prebuilt->mysql_has_locked = FALSE; /* If the MySQL lock count drops to zero we know that the current SQL statement has ended */ if (trx->n_mysql_tables_in_use == 0) { trx->mysql_n_tables_locked = 0; prebuilt->used_in_HANDLER = FALSE; /* Release a possible FIFO ticket and search latch. Since we may reserve the kernel mutex, we have to release the search system latch first to obey the latching order. */ innobase_release_stat_resources(trx); if (!(thd->options & (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))) { if (trx->active_trans != 0) { innobase_commit(thd, TRUE); } } else { if (trx->isolation_level <= TRX_ISO_READ_COMMITTED && trx->global_read_view) { /* At low transaction isolation levels we let each consistent read set its own snapshot */ read_view_close_for_mysql(trx); } } } if (trx->n_mysql_tables_in_use == 0) { trx->mysql_n_tables_locked = 0; prebuilt->used_in_HANDLER = FALSE; /* Release a possible FIFO ticket and search latch. Since we may reserve the kernel mutex, we have to release the search system latch first to obey the latching order. */ innobase_release_stat_resources(trx); if (!(thd->options & (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))) { if (trx->active_trans != 0) { innobase_commit(thd, TRUE); } } else { if (trx->isolation_level <= TRX_ISO_READ_COMMITTED && trx->global_read_view) { /* At low transaction isolation levels we let each consistent read set its own snapshot */ read_view_close_for_mysql(trx); } } } That could explain the hang. But it is not clear if this is the reason.
[16 Jan 2007 16:02]
Heikki Tuuri
We could also move the call: innobase_release_stat_resources(trx); outside the 'if' in ::external_lock(). That would add more safety that whatever MySQL does at a query end, there would be no risk of a hang on the btr search latch.
[23 Jan 2007 2:35]
Carl Dunham
Just for clarification, the status of this bug is "Patch Pending". Does that mean a root cause has been identified and fixed in a patch, or in code "pending" being made into a patch? Also, is this 19081, or something different? We are seeing this on an amd64 system running 5.0.30 as well. If you need more information, logs, etc., we would be more than happy to provide it. This is biting us about twice a week now...
[23 Jan 2007 14:22]
Heikki Tuuri
Carl, the two suggestions I made above are the 'pending patch'. I need to test them, and another developer needs to approve them. I am very optimistic that either of the patches solves this bug. If you compile mysqld yourself, you can test the above patches. Bug #19081 is probably an entirely different bug. Regards, Heikki
[12 Apr 2007 14:31]
Heikki Tuuri
Assigning this to Vasil.
[13 Apr 2007 10:36]
Eric ten Hoeve
[root@dbhost mysql]# ls -l /var/lib/mysql total 20080672 -rw-r--r-- 1 root root 0 Mar 1 16:39 binlog.txt -rw-rw---- 1 mysql mysql 148 Apr 12 21:02 dbhost-relay-bin.000001 -rw-rw---- 1 mysql mysql 98 Apr 12 21:02 dbhost-relay-bin.000002 -rw-rw---- 1 mysql mysql 117 Apr 13 02:25 dbhost-relay-bin.000003 -rw-rw---- 1 mysql mysql 117 Apr 13 10:45 dbhost-relay-bin.000004 -rw-rw---- 1 mysql mysql 1768 Apr 13 10:45 dbhost-relay-bin.index lrwxrwxrwx 1 mysql mysql 10 Oct 4 2006 dd -> /mysql/dd/ drwxrwxr-x 2 mysql mysql 3252224 Apr 13 12:30 ddtemp drwxrwxr-x 2 mysql mysql 4096 Jul 12 2006 ddvast -rw-r--r-- 1 mysql mysql 31 Jun 19 2006 ibbackup_binlog_info -rw-r----- 1 mysql mysql 18215862272 Apr 13 12:30 ibdata1 -rw-r----- 1 mysql mysql 524288000 Apr 13 12:30 ib_logfile0 -rw-r----- 1 mysql mysql 524288000 Apr 13 12:30 ib_logfile1 -rw-r--r-- 1 root root 312605 Mar 1 16:46 log.zip -rw-rw---- 1 mysql mysql 74 Apr 10 12:37 master.info drwx--x--x 2 mysql mysql 4096 Apr 13 00:27 mysql -rw-rw---- 1 mysql mysql 996626413 Apr 12 21:02 mysql-bin.000371 -rw-rw---- 1 mysql mysql 68617586 Apr 13 02:24 mysql-bin.000372 -rw-rw---- 1 mysql mysql 132938389 Apr 13 10:44 mysql-bin.000373 -rw-rw---- 1 mysql mysql 39935096 Apr 13 12:30 mysql-bin.000374 -rw-rw---- 1 mysql mysql 128 Apr 13 10:45 mysql-bin.index -rw-rw---- 1 mysql mysql 49296 Apr 13 12:30 mysql.err -rw-rw---- 1 mysql mysql 55088 Apr 12 21:02 mysql.err-old -rw-rw---- 1 mysql mysql 6 Apr 13 10:45 mysql.pid -rw-rw---- 1 mysql mysql 11845837 Apr 13 12:24 mysql-slow-query.log -rw-r----- 1 root root 3676911 Feb 11 00:39 mysql-slow-query.log_pre34enterpr -rw-r----- 1 mysql mysql 20578407 Jan 18 09:35 mysql-slow-query.old srwxrwxrwx 1 mysql mysql 0 Apr 13 10:45 mysql.sock -rw-rw---- 1 mysql mysql 31 Apr 13 10:45 relay-log.info drwxr-xr-x 2 mysql mysql 4096 Jan 9 14:37 test drwxr-xr-x 2 mysql mysql 12288 Apr 4 06:30 tmp We have a software raid 1+0 setup and we do have a faulty disk. Also deleting the table from our replication server (which runs 5.38 enterprise) succeeds as expected.
[13 Apr 2007 10:43]
Eric ten Hoeve
sorry, the last entry was for another bug. Please disregard the entry.
[13 Apr 2007 12:46]
Heikki Tuuri
I am pretty sure that these two changes fix the problem, if this is a bug in MySQL/InnoDB. I do not know if this bug is from MySQL closing the table handle in an abnormal order, or is there something else wrong in the hardware/OS/mysqld. Regards, Heikki
[12 Jul 2007 17:28]
Timothy Smith
Queued to 5.0- and 5.1-maint team trees
[19 Jul 2007 15:47]
Bugs System
Pushed into 5.1.21-beta
[19 Jul 2007 15:49]
Bugs System
Pushed into 5.0.48
[23 Jul 2007 20:03]
Paul DuBois
Noted in 5.0.48, 5.1.21 changelogs. Several InnoDB assertion failures were corrected.
[30 Aug 2007 20:21]
Igor Green
we had mysql 5.0.45 crashed on us with similar problems pretty frequently some time ago. I checked out the latest mysql code from bitkeeper and it seems to be a little better, but mysql still crashes several times a day. Now the error is: ---------------------------- END OF INNODB MONITOR OUTPUT ============================ InnoDB: ###### Diagnostic info printed to the standard error stream InnoDB: Error: semaphore wait has lasted > 600 seconds InnoDB: We intentionally crash the server, because it appears to be hung. 070830 7:07:52InnoDB: Assertion failure in thread 138661376 in file srv0srv.c line 2095 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html InnoDB: about forcing recovery. 070830 7:07:52 - mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. --- I can provide more info about the crash.
[5 Sep 2007 14:56]
Heikki Tuuri
Updating the synopsis. This bug often occurs when a complex query calls ::close() while it still holds the adaptive hash latch.
[5 Sep 2007 14:57]
Heikki Tuuri
See http://bugs.mysql.com/bug.php?id=20358
[5 May 2010 15:17]
Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 2:54]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 5:57]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:26]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:54]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 15:13]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[17 Jun 2010 12:00]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:40]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:25]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)