Bug #46669 mysql freezes during ALTER TABLE
Submitted: 12 Aug 2009 5:50 Modified: 24 Aug 2009 7:31
Reporter: Tom Keyser Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.0.82-percona-highperf-b16-log OS:Linux (Centos 5.3)
Assigned to: CPU Architecture:Any
Tags: ALTER TABLE, freeze, innodb

[12 Aug 2009 5:50] Tom Keyser
Description:
I have a hive of 16000 table in the mysql server. performance was suffering an I decided to optimize the tables. I wrote a sql script for each database to alter table for each table for the existing engine (InnoDB).

I was also changing the inno table to be table per file.

All other traffic to the mysql server was stopped.

During the process of running the first script everything stopped. PhpmyAdmin showed that the process was "renaming the tmp table" it never completed.

If other query requests come in at the same time this is happening they all wait on "Opening tables"

This appears to be a mutex deadlock problem.

How to repeat:
I can repeat it it happens every time I run one of my scripts, then again each database has between 400-1000 tables. It usually happens after about 150 table have been converted, but that is only an estimate.

Aug 11 18:22:55 localhost mysqld[8304]: 090811 18:22:55 [Warning] 'user' entry 'root@localhost.localdomain' ignored in --skip-name-resolve mode.
Aug 11 18:22:55 localhost mysqld[8304]: 090811 18:22:55 [Note] /usr/sbin/mysqld: ready for connections.
Aug 11 18:22:55 localhost mysqld[8304]: Version: '5.0.82-percona-highperf-b16-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Percona High Performance Edition (GPL)
Aug 11 18:42:22 localhost mysqld[8304]: InnoDB: Warning: a long semaphore wait:
Aug 11 18:42:23 localhost mysqld[8304]: --Thread 1133726016 has waited at buf0buf.c line 2150 for 241.00 seconds the semaphore:
Aug 11 18:42:23 localhost mysqld[8304]: Mutex at 0x2aaaeb2360b8 created file buf0buf.c line 549, lock var 1
Aug 11 18:42:23 localhost mysqld[8304]: waiters flag 1
Aug 11 18:42:23 localhost mysqld[8304]: InnoDB: Warning: a long semaphore wait:
Aug 11 18:42:23 localhost mysqld[8304]: --Thread 1097812288 has waited at buf0lru.c line 120 for 242.00 seconds the semaphore:
Aug 11 18:42:23 localhost mysqld[8304]: Mutex at 0x2aac72b94420 created file buf0buf.c line 498, lock var 1
Aug 11 18:42:23 localhost mysqld[8304]: waiters flag 1
Aug 11 18:42:23 localhost mysqld[8304]: InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
Aug 11 18:42:23 localhost mysqld[8304]: InnoDB: Pending preads 0, pwrites 0
Aug 11 18:42:37 localhost mysqld[8304]:
Aug 11 18:42:37 localhost mysqld[8304]: =====================================
Aug 11 18:42:37 localhost mysqld[8304]: 090811 18:42:37 INNODB MONITOR OUTPUT
Aug 11 18:42:37 localhost mysqld[8304]: =====================================
Aug 11 18:42:37 localhost mysqld[8304]: Per second averages calculated from the last 20 seconds
Aug 11 18:42:37 localhost mysqld[8304]: ----------
Aug 11 18:42:37 localhost mysqld[8304]: BACKGROUND THREAD
Aug 11 18:42:37 localhost mysqld[8304]: ----------
Aug 11 18:42:37 localhost mysqld[8304]: fsync callers: 2488 buffer pool, 3 other, 1015 checkpoint, 4163 log aio, 9815 log sync, 0 archive
Aug 11 18:42:37 localhost mysqld[8304]: ----------
Aug 11 18:42:37 localhost mysqld[8304]: SEMAPHORES
Aug 11 18:42:37 localhost mysqld[8304]: ----------
Aug 11 18:42:37 localhost mysqld[8304]: OS WAIT ARRAY INFO: reservation count 6465, signal count 6443
Aug 11 18:42:37 localhost mysqld[8304]: --Thread 1133726016 has waited at buf0buf.c line 2150 for 256.00 seconds the semaphore:
Aug 11 18:42:37 localhost mysqld[8304]: Mutex at 0x2aaaeb2360b8 created file buf0buf.c line 549, lock var 1
Aug 11 18:42:37 localhost mysqld[8304]: waiters flag 1
Aug 11 18:42:37 localhost mysqld[8304]: --Thread 1097812288 has waited at buf0lru.c line 120 for 256.00 seconds the semaphore:
Aug 11 18:42:37 localhost mysqld[8304]: Mutex at 0x2aac72b94420 created file buf0buf.c line 498, lock var 1
Aug 11 18:42:37 localhost mysqld[8304]: waiters flag 1
Aug 11 18:42:37 localhost mysqld[8304]: Mutex spin waits 0, rounds 105222, OS waits 501
Aug 11 18:42:37 localhost mysqld[8304]: RW-shared spins 11611, OS waits 5677; RW-excl spins 402, OS waits 251
Aug 11 18:42:37 localhost mysqld[8304]: --------
Aug 11 18:42:37 localhost mysqld[8304]: FILE I/O
Aug 11 18:42:37 localhost mysqld[8304]: --------
Aug 11 18:42:37 localhost mysqld[8304]: I/O thread 0 state: waiting for i/o request (insert buffer thread)
Aug 11 18:42:37 localhost mysqld[8304]: I/O thread 1 state: waiting for i/o request (log thread)
Aug 11 18:42:37 localhost mysqld[8304]: I/O thread 2 state: waiting for i/o request (read thread)
Aug 11 18:42:37 localhost mysqld[8304]: I/O thread 3 state: complete io for buf page (write thread) ev set
Aug 11 18:42:37 localhost mysqld[8304]: Pending normal aio reads: 0, aio writes: 58,
Aug 11 18:42:37 localhost mysqld[8304]:  ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Aug 11 18:42:37 localhost mysqld[8304]: Pending flushes (fsync) log: 0; buffer pool: 0
Aug 11 18:42:37 localhost mysqld[8304]: 132350 OS file reads, 34113 OS file writes, 17629 OS fsyncs
Aug 11 18:42:37 localhost mysqld[8304]: 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
Aug 11 18:42:37 localhost mysqld[8304]: -------------------------------------
Aug 11 18:42:37 localhost mysqld[8304]: INSERT BUFFER AND ADAPTIVE HASH INDEX
Aug 11 18:42:37 localhost mysqld[8304]: -------------------------------------
Aug 11 18:42:37 localhost mysqld[8304]: Ibuf: size 1, free list len 0, seg size 2,
Aug 11 18:42:37 localhost mysqld[8304]: 0 inserts, 0 merged recs, 0 merges
Aug 11 18:42:37 localhost mysqld[8304]: Hash table size 12750011, used cells 14013, node heap has 22 buffer(s)
Aug 11 18:42:37 localhost mysqld[8304]: 0.00 hash searches/s, 0.00 non-hash searches/s
Aug 11 18:42:37 localhost mysqld[8304]: ---
Aug 11 18:42:37 localhost mysqld[8304]: LOG
Aug 11 18:42:37 localhost mysqld[8304]: ---
Aug 11 18:42:37 localhost mysqld[8304]: Log sequence number 5 2466223470
Aug 11 18:42:37 localhost mysqld[8304]: Log flushed up to   5 2466220594
Aug 11 18:42:37 localhost mysqld[8304]: Last checkpoint at  5 1757976554
Aug 11 18:42:53 localhost mysqld[8304]: InnoDB: ###### Diagnostic info printed to the standard error stream

Suggested fix:
Analyze the copy to temp and rename processes and find the mutex deadlock
[12 Aug 2009 6:16] Tom Keyser
I have been told that the tmp file wont rename until certain type of commands stop. is it possible that mysql has lost track and has the lock on and is waiting for an event that wont happen?
[12 Aug 2009 7:07] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

According to http://dev.mysql.com/doc/refman/5.0/en/optimize-table.html: "For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE, which rebuilds the table to update index statistics and free unused space in the clustered index. " So this command will create temporary table and you have to wait when it finishes. Regarding to stalled queries this is true only for queries which use affected tables.
[12 Aug 2009 7:28] Tom Keyser
Yes, it is a bug.. after 2-3 minutes of waiting for the ALTER TABLE to finish renaming the file.. Its a bug.. And it does stop the other processes and they are not touching the table.. This wait time occurs even when their are no other processes servicing any requests for 2-3 minutes... I have all traffic into the machine blocked for mysql!
[12 Aug 2009 7:30] Tom Keyser
This machine is dedicated to mysql only! there are not http websites running on this machine.
[12 Aug 2009 7:32] Tom Keyser
I am not using "optimize" I am using direct "ALTER TABLE" commands
[12 Aug 2009 7:55] Tom Keyser
example of long wait.. no other traffic to mysql except phpmyadmin ( this just happened while I was reading your response)

Attachment: Image2.jpg (image/jpeg, text), 51.05 KiB.

[12 Aug 2009 7:58] Tom Keyser
How long do you think I should wait, before I kill mysqld?

Attachment: Image2.jpg (image/jpeg, text), 51.05 KiB.

[12 Aug 2009 7:59] Tom Keyser
Is this long enough?

Attachment: Image1.jpg (image/jpeg, text), 51.56 KiB.

[12 Aug 2009 8:10] Sveta Smirnova
Thank  you for the feedback.

> Is this long enough?

This depends from size of the table. Please provide output of SHOW CREATE TABLE ratioterms and SHOW TABLE STATUS LIKE 'ratioterms'
[12 Aug 2009 8:10] Tom Keyser
Notice my phpmyadmin screen never completed loading. And its now been stuck for 1233 seconds.

Attachment: Image3.jpg (image/jpeg, text), 155.83 KiB.

[12 Aug 2009 8:14] Tom Keyser
In that last image I also started one of my web servers to show you that their request are all blocked and none of them are trying to open the table being renamed. I have 1400 blogs being services by this mysql servers.

I think I have waited long enough having everything down while I prove to you this is a BUG! unless you thing that somehow phpmyadmin has this kind of power over mysql. If thats the case we are all in trouble.
[12 Aug 2009 8:17] Tom Keyser
I will get what you asked for after I kill mysqld and do a crash recovery. It wont respond to my request.
[12 Aug 2009 8:20] Tom Keyser
mysql> use timeout10;
Database changed
mysql> SHOW CREATE TABLE ratioterms;
+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table      | Create Table                                                                                                                                                                                                                                                                                                                                             |
+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ratioterms | CREATE TABLE `ratioterms` (
  `term_id` bigint(20) NOT NULL auto_increment,
  `name` varchar(200) NOT NULL default '',
  `slug` varchar(200) NOT NULL default '',
  `term_group` bigint(10) NOT NULL default '0',
  PRIMARY KEY  (`term_id`),
  UNIQUE KEY `slug` (`slug`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=49 DEFAULT CHARSET=utf8 |
+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SHOW TABLE STATUS LIKE 'ratioterms';
+------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+------------------------+
| Name       | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation       | Checksum | Create_options | Comment                |
+------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+------------------------+
| ratioterms | InnoDB |      10 | Compact    |   48 |            341 |       16384 |               0 |        32768 |         0 |             49 | 2009-08-12 00:48:04 | NULL        | NULL       | utf8_general_ci |     NULL |                | InnoDB free: 210944 kB |
+------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+------------------------+
1 row in set (0.00 sec)

mysql>
[12 Aug 2009 8:31] Tom Keyser
FYI, after 7 hours of work, I just finished using "alter table" on 16000 tables using 30 sql scripts, because it got locked up so many times. Each lock up required my to kill mysqld and do a crash recovery. That was about 40 lockups.

Another point that you should keep in mind while considering this issue is that I was moving all the tables back into the large centralized InnoDB file from single file InnoDB tables. The same issue happened a couple days ago when I was running the same process to move all the tables to single files.

I cant confirm that this problem exists on tables that already exist in the central InnoDB file, if they are not being moved out to single files.
[12 Aug 2009 18:15] Sveta Smirnova
Thank you for the feedback.

I can not repeat situation when ALTER TABLE and SHOW DATABASES lock each other using our developer sources.

Please try our current version 5.0.84 and if problem still exists provide your configuration file.
[23 Aug 2009 21:58] Tom Keyser
I have confirmed that this behavior does not exist in 5.0.84

Thank you for your assistance and patience.
[24 Aug 2009 7:31] Sveta Smirnova
Thank you for the feedback.

Closed as "Can't repeat" because last comment.

You can also open bug report in Percona's bugs tracker in case it this bug is in their binaries.