Bug #26079 max_binlog_size + innodb = not make new binlog and hang server
Submitted: 5 Feb 2007 12:33 Modified: 31 Mar 2007 23:00
Reporter: T M Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.1.14(16), 5.1.17BK OS:Linux (RHEL4 AS x86_64, suse9.3x86)
Assigned to: Andrei Elkin CPU Architecture:Any
Tags: hang, innodb

[5 Feb 2007 12:33] T M
Description:
Hello,
from mysql 5.1.12+, when I use max_binlog_size, mysql hang when rotate binlog, but only if you write to innodb tables. I use actualy Redhat4AS,x86_64 mysql_huge config for example(but it no depends). If I dont use max_binlog_size, we reached 1GB binlog in 5hours, due to heavy traffic and then mysql hang again. Hang mean not crashed, mysql is runnig and accepting new connection, but dont handle this connection. Has anyone experience with this? Last version, which was allright, was 5.1.11. I tried actual dev, and it is same. I also tried it on Centos 4.4 i386 machine.

 

How to repeat:
Make test table (id int autoincrement primary key, text varchar(255)) engine=INNODB... 
Set:
log-bin=log
max_binlog_size= 5000 (smallest value is 4096) and try to fill this table. When mysql can make log.00002, it hangs.
[5 Feb 2007 13:02] Heikki Tuuri
Hi!

I have observed a binlog hang with our special 5.1-zip binary. I need to check if I can repeat it with a plain 5.1.14. All threads end up waiting for the binlog mutex if I run a multithreaded stress test.

Please post SHOW INNODB STATUS\G and SHOW PROCESSLIST during the hang.

Regards,

Heikki
[5 Feb 2007 13:11] MySQL Verification Team
Updating status regarding Heikki question.
[5 Feb 2007 18:16] T M
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
070205 19:13:51 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 18 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 3, signal count 3
Mutex spin waits 1, rounds 20, OS waits 0
RW-shared spins 6, OS waits 3; RW-excl spins 1, OS waits 0
------------
TRANSACTIONS
------------
Trx id counter 0 185957
Purge done for trx's n:o < 0 181541 undo n:o < 0 0
History list length 3
Total number of lock structs in row lock hash table 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 0, not started, process no 5986, OS thread id 1158191456
MySQL thread id 1, query id 108 localhost root
SHOW INNODB STATUS
---TRANSACTION 0 185956, not started, process no 5986, OS thread id 1158457696
mysql tables in use 1, locked 1
MySQL thread id 2, query id 107 localhost root update
INSERT INTO test (text) VALUES ('WJgADyfIeT')
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
36 OS file reads, 208 OS file writes, 206 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2,
0 inserts, 0 merged recs, 0 merges
Hash table size 17393, used cells 0, node heap has 1 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 0 68713738
Log flushed up to   0 68713738
Last checkpoint at  0 68713738
0 pending log writes, 0 pending chkp writes
209 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 22350052; in additional pool allocated 678912
Dictionary memory allocated 38200
Buffer pool size   512
Free buffers       481
Database pages     30
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 30, created 0, written 3
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 5986, id 1157658976, state: waiting for server activity
Number of rows inserted 100, updated 0, deleted 0, read 1
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

1 row in set, 1 warning (0.00 sec)

mysql> SHOW PROCESSLIST\G;
*************************** 1. row ***************************
     Id: 1
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: SHOW PROCESSLIST
*************************** 2. row ***************************
     Id: 2
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 173
  State: update
   Info: INSERT INTO test (text) VALUES ('WJgADyfIeT')
2 rows in set (0.00 sec)
[12 Feb 2007 11:17] Heikki Tuuri
Starting program: /home/heikki/mysql-5.1/sql/mysqld
[Thread debugging using libthread_db enabled]
[New Thread 1075693344 (LWP 9588)]
[New Thread 1084083120 (LWP 9589)]
070212 12:21:21 [Warning] Changed limits: max_open_files: 1024  max_connections: 886  table_cache: 64
[Thread 1084083120 (zombie) exited]
[New Thread 1084083120 (LWP 9590)]
[New Thread 1204808624 (LWP 9591)]
[New Thread 1213197232 (LWP 9592)]
[New Thread 1221585840 (LWP 9593)]
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
070212 12:21:21  InnoDB: Setting file ./ibdata1 size to 10 MB
InnoDB: Database physically writes the file full: wait...
070212 12:21:21  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 20 MB
InnoDB: Database physically writes the file full: wait...
070212 12:21:22  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 20 MB
InnoDB: Database physically writes the file full: wait...
[New Thread 1230396336 (LWP 9594)]
[New Thread 1238784944 (LWP 9595)]
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
[New Thread 1249291184 (LWP 9596)]
070212 12:21:23  InnoDB: Started; log sequence number 0 0
[New Thread 1257880496 (LWP 9597)]
070212 12:21:23 [Note] /home/heikki/mysql-5.1/sql/mysqld: ready for connections.
Version: '5.1.16-beta-debug-log'  socket: '/home/heikki/bugsocket'  port: 3307  Source distribution
070212 12:21:23 [Note] SCHEDULER: Loaded 0 events
[New Thread 1258077104 (LWP 9601)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1075693344 (LWP 9588)]
0x40178a27 in select () from /lib/tls/libc.so.6
(gdb) info threads
  11 Thread 1258077104 (LWP 9601)  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  10 Thread 1257880496 (LWP 9597)  0x4003f7ef in do_sigwait ()
   from /lib/tls/libpthread.so.0
  9 Thread 1249291184 (LWP 9596)  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  8 Thread 1238784944 (LWP 9595)  0x40178a27 in select ()
   from /lib/tls/libc.so.6
  7 Thread 1230396336 (LWP 9594)  0x40178a27 in select ()
   from /lib/tls/libc.so.6
  6 Thread 1221585840 (LWP 9593)  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  5 Thread 1213197232 (LWP 9592)  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  4 Thread 1204808624 (LWP 9591)  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  3 Thread 1084083120 (LWP 9590)  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
* 1 Thread 1075693344 (LWP 9588)  0x40178a27 in select ()
   from /lib/tls/libc.so.6
(gdb) thread 12
Thread ID 12 not known.
(gdb) thread 11
[Switching to thread 11 (Thread 1258077104 (LWP 9601))]#0  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
(gdb) bt
#0  0x4003c295 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x0862d73f in safe_cond_wait (cond=0x891b2f0, mp=0x891b2b0,
    file=0x874bbbc "log.cc", line=3217) at thr_mutex.c:202
#2  0x0830eb6b in MYSQL_BIN_LOG::new_file_impl (this=0x891af00,
    need_lock=false) at log.cc:3217
#3  0x0830e95b in MYSQL_BIN_LOG::new_file_without_locking (this=0x891af00)
    at log.cc:3170
#4  0x0831066f in MYSQL_BIN_LOG::rotate_and_purge (this=0x891af00, flags=1)
    at log.cc:3886
#5  0x08310bb2 in MYSQL_BIN_LOG::write (this=0x891af00, thd=0x8f5d900,
    cache=0x8f9d9e0, commit_event=0x891b9e0) at log.cc:4043
#6  0x0830a2b9 in binlog_end_trans (thd=0x8f5d900, trx_data=0x8f9d9e0,
    end_ev=0x891b9e0, all=false) at log.cc:1500
#7  0x0830a547 in binlog_commit (hton=0x89455b8, thd=0x8f5d900, all=false)
    at log.cc:1573
#8  0x0835fbdd in ha_commit_one_phase (thd=0x8f5d900, all=false)
    at handler.cc:763
#9  0x0835fa42 in ha_commit_trans (thd=0x8f5d900, all=false) at handler.cc:733
#10 0x0835ff85 in ha_autocommit_or_rollback (thd=0x8f5d900, error=0)
    at handler.cc:871
#11 0x082eb9be in mysql_insert (thd=0x8f5d900, table_list=0x8fa5fa8,
    fields=@0x8fce3cc, values_list=@0x8fce3f0, update_fields=@0x8fce3e4,
    update_values=@0x8fce3d8, duplic=DUP_ERROR, ignore=false)
    at sql_insert.cc:683
#12 0x082832b9 in mysql_execute_command (thd=0x8f5d900) at sql_parse.cc:3462
#13 0x082f8e0a in Prepared_statement::execute (this=0x8fcde68,
    expanded_query=0x4afcb320, open_cursor=false) at sql_prepare.cc:2998
#14 0x082f7237 in mysql_stmt_execute (thd=0x8f5d900,
    packet_arg=0x9007a41 "%\004", packet_length=10) at sql_prepare.cc:2297
#15 0x0827f42e in dispatch_command (command=COM_STMT_EXECUTE, thd=0x8f5d900,
    packet=0x9007a41 "%\004", packet_length=10) at sql_parse.cc:1818
#16 0x0827ecb3 in do_command (thd=0x8f5d900) at sql_parse.cc:1628
#17 0x0827dd2c in handle_one_connection (arg=0x8f5d900) at sql_parse.cc:1234
#18 0x40039b63 in start_thread () from /lib/tls/libpthread.so.0
#19 0x4017f18a in clone () from /lib/tls/libc.so.6
(gdb)
[12 Feb 2007 11:18] Heikki Tuuri
I am able to repeat this bug very easily. Just run ibtest3 (attached) for 30 seconds. My my.cnf is attached.
[12 Feb 2007 11:24] Heikki Tuuri
Heikki's my.cnf

Attachment: my26079.cnf (application/octet-stream, text), 1.32 KiB.

[12 Feb 2007 11:25] Heikki Tuuri
Heikki's ibtest3 Perl test program

Attachment: ibtest3 (text/plain), 4.07 KiB.

[12 Feb 2007 11:28] Heikki Tuuri
The reason for the hang is that log rotate is called by the same thread that has prepared a transaction. Then prepared_xids == 1, and it will never become 0.

3203      /*
3204        if binlog is used as tc log, be sure all xids are "unlogged",
(gdb)
3205        so that on recover we only need to scan one - latest - binlog file
3206        for prepared xids. As this is expected to be a rare event,
3207        simple wait strategy is enough. We're locking LOCK_log to be sure no
3208        new Xid_log_event's are added to the log (and prepared_xids is not
3209        increased), and waiting on COND_prep_xids for late threads to
3210        catch up.
3211      */
3212      if (prepared_xids)
3213      {
3214        tc_log_page_waits++;
(gdb)
3215        pthread_mutex_lock(&LOCK_prep_xids);
3216        while (prepared_xids)
3217          pthread_cond_wait(&COND_prep_xids, &LOCK_prep_xids);
3218        pthread_mutex_unlock(&LOCK_prep_xids);
3219      }
3220
3221      /* Reuse old name if not binlog and not update log */
3222      new_name_ptr= name;
3223
3224      /*
(gdb) print prepared_xids
$1 = 1
(gdb)
[7 Mar 2007 8:16] MySQL Verification Team
heikki, is this bug verified or in progress ?  seems 'open' state is not correct here.
[8 Mar 2007 14:34] Heikki Tuuri
Shane,

this looks like a MySQL server bug in binlogging.

Regards,

Heikki
[8 Mar 2007 15:24] MySQL Verification Team
this testcase caused my 5.1.17BK to hang.  Started with --max_binlog_size=10M --log-bin

--------
delimiter ;
drop table if exists `bug26079`;
drop procedure if exists `sp26079`;
create table `bug26079`(`a` varchar(255),`b` varchar(255),`c` varchar(255))engine=innodb;
delimiter //
create procedure `sp26079`()
begin
        declare i int default 0;
        while(i < 1000000) do
                insert into `bug26079`(`a`,`b`,`c`) values ('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa','bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb','cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc');
                set i=i+1;
        end while;
        

end //

delimiter ;

#now open 4 mysql client connections and run:
call `sp26079`();
-------

mysql> show processlist;
+----+------+------+------+---------+------+--------+--------------------------------
| Id | User | Host | db   | Command | Time | State  | Info
+----+------+------+------+---------+------+--------+--------------------------------
|  1 | root |      | test | Query   |  104 | update | insert into `bug26079`(`a`,`b`,
|  2 | root |      | test | Query   |  104 | update | insert into `bug26079`(`a`,`b`,
|  3 | root |      | test | Query   |  104 | update | insert into `bug26079`(`a`,`b`,
|  4 | root |      | test | Query   |  104 | update | insert into `bug26079`(`a`,`b`,
|  8 | root |      | test | Query   |    0 | NULL   | show processlist

-rw-rw----  1 sbester users 10485819 2007-03-08 17:09 www-bin.000001
-rw-rw----  1 sbester users       17 2007-03-08 16:56 www-bin.index
-rw-rw----  1 sbester users     2222 2007-03-08 16:56 www.err
-rw-rw----  1 sbester users        6 2007-03-08 16:56 www.pid
[12 Mar 2007 15:53] Reinis Rozitis
Can confirm this also hapens on Suse (SLES and OpenSuse 10.x) with 5.1.16

The only way to operate is to switch binlog off at all (which is pretty unsafe) or to manually (or with script) execute FLUSH LOGS (the problem is you have flush before the server reaches max_binlog_size which does not happen in consistent way - so you end up doing logrotate each hour or even more often).
[14 Mar 2007 17:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/21906

ChangeSet@1.2477, 2007-03-14 17:00:22+02:00, aelkin@dsl-hkibrasgw1-ff1ec100-106.dhcp.inet.fi +5 -0
  Bug #26079 max_binlog_size + innodb = not make new binlog and hang server
  
  Hanging is at binlog_commit by a thread that is going to rotate binlog.
  The thread incremented MYSQL_BIN_LOG::prepared_xids itself and is waiting
  for the value dropped to zero which can not happend because itself it
  is decrementing only after storages commit.
  
  The solution exploits the fact that the waiter does not have any duty to
  perform inside of binlog_commit except rotate_and_purge.
  The latter is anyway scheduled for execution at TL_LOG::unlog.
[20 Mar 2007 8:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/22329

ChangeSet@1.2477, 2007-03-20 10:50:10+02:00, aelkin@dsl-hkibras1-ff1dc300-249.dhcp.inet.fi +5 -0
  Bug #26079 max_binlog_size + innodb = not make new binlog and hang server
  
  There was hanging at binlog_commit by a thread executing autocommit query.
  
  The hang appeared to be due to an overly condtion for early return 
  from binlog_commit introduced by bug#20265 fix.
  
  Fixed with reverting the logic back to 5.0 version.
[20 Mar 2007 9:41] Guilhem Bichot
approved, asked Andrei to check with Mats and Serg about two lines he changes.
[22 Mar 2007 20:09] Mads Martin Joergensen
Fixed in 5.1.17
[31 Mar 2007 23:00] Paul DuBois
Noted in 5.1.17 changelog.

The server could hang during binary log rotation.