Bug #36795 | Backup crash in backup::Mem_allocator::free at kernel.cc:911 | ||
---|---|---|---|
Submitted: | 19 May 2008 8:44 | Modified: | 26 Aug 2008 20:04 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Backup | Severity: | S1 (Critical) |
Version: | 6.0-backup | OS: | Any |
Assigned to: | Øystein Grøvlen | CPU Architecture: | Any |
[19 May 2008 8:44]
Philip Stoev
[19 May 2008 18:34]
Philip Stoev
To reproduce, please use the second test case from bug 34547, available at: http://bugs.mysql.com/file.php?id=9350

Please place the .txt files in mysql-test and the .test files in mysql-test/t. Then run:

$ perl ./mysql-test-run.pl --stress --stress-init-file=bug34547_2_init.txt --stress-test-file=bug34547_2_run.txt --stress-test-duration=60 --stress-threads=5 --skip-ndb --mysqld=--skip-innodb

This test will crash in one of several ways, each connected with memory management. Ideally, the test should run without issues until bug 34547 is observed at server shutdown.
[29 Jul 2008 11:19]
Øystein Grøvlen
I have seen similar core dumps while experimenting with a way to reproduce Bug#36792.
[31 Jul 2008 7:33]
Øystein Grøvlen
When running the test case supplied, I either get the mentioned seg fault, or the following assert:

mysqld: kernel.cc:1138: bstream_byte* bstream_alloc(long unsigned int): Assertion `Backup_restore_ctx::mem_alloc' failed.

#0 0x0000003f1100b122 in pthread_kill () from /lib64/libpthread.so.0
#1 0x0000000000a32320 in my_write_core (sig=6) at stacktrace.c:307
#2 0x000000000064ba79 in handle_segfault (sig=6) at mysqld.cc:2654
#3 <signal handler called>
#4 0x0000003f10430045 in raise () from /lib64/libc.so.6
#5 0x0000003f10431ae0 in abort () from /lib64/libc.so.6
#6 0x0000003f10429756 in __assert_fail () from /lib64/libc.so.6
#7 0x0000000000a9cd53 in bstream_alloc (size=<value optimized out>) at kernel.cc:1138
#8 0x0000000000ab34c1 in bstream_open_wr (s=0x837dc78, block_size=29137, offset=6) at stream_v1_transport.c:837
#9 0x0000000000aacbb2 in backup::Output_stream::init (this=0x837dc70) at stream.cc:289
#10 0x0000000000aad2b3 in backup::Output_stream::open (this=0x837dc70) at stream.cc:352
#11 0x0000000000a9eaed in Backup_restore_ctx::prepare_for_backup (this=0x47b75be0, location={str = 0x8425550 "/home/og136792/mysql/shared/mysql-6.0-backup-clean/mysql-test/var/backup7", length = 73}, query=<value optimized out>, with_compression=false) at kernel.cc:503
#12 0x0000000000aa0400 in execute_backup_command (thd=0x841a638, lex=0x841c090) at kernel.cc:144
#13 0x00000000006575e4 in mysql_execute_command (thd=0x841a638) at sql_parse.cc:2172
#14 0x000000000065df74 in mysql_parse (thd=0x841a638, inBuf=0x8425150 "BACKUP DATABASE test TO \"/home/og136792/mysql/shared/mysql-6.0-backup-clean/mysql-test/var/backup7\"", length=99, found_semicolon=0x47b77028) at sql_parse.cc:5800
#15 0x000000000065ecac in dispatch_command (command=COM_QUERY, thd=0x841a638, packet=<value optimized out>, packet_length=99) at sql_parse.cc:1050
#16 0x000000000065fc07 in do_command (thd=0x841a638) at sql_parse.cc:723
#17 0x0000000000650b91 in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1153
#18 0x0000003f110062e7 in start_thread () from /lib64/libpthread.so.0
#19 0x0000003f104ce3bd in clone () from /lib64/libc.so.6
[31 Jul 2008 8:06]
Øystein Grøvlen
The following test script reproduces the assert:

======
CREATE DATABASE backup_concurrent;
USE backup_concurrent;

CREATE TABLE t (
  t1 INTEGER NOT NULL,
  t2 CHAR(36),
  PRIMARY KEY (t1)
);

connect (backup1,localhost,root,,);
USE backup_concurrent;
send BACKUP DATABASE backup_concurrent TO 'backup1';

# Second backup should fail because another backup is running
connection default;
--error ER_BACKUP_RUNNING
BACKUP DATABASE backup_concurrent TO 'backup2';

INSERT INTO t VALUES (1, 'test');
BACKUP DATABASE backup_concurrent TO 'backup3';
[31 Jul 2008 8:26]
Øystein Grøvlen
The assert fails because Backup_restore_ctx::mem_alloc is static. The failing backup sets mem_alloc to null when it terminates, so the next time the running backup tries to allocate memory, the assert fails.

I suggest that we make mem_alloc non-static. That way, concurrent backups will not interfere with each other. This requires that bstream_alloc be able to find the right Backup_restore_ctx to use. I suggest fixing that by changing the static is_running flag to a static pointer to the Backup_restore_ctx of the currently running backup. If the pointer is null, no backup is currently running.
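The proposal above can be illustrated with a small, self-contained C++ sketch. It is not the actual patch: Allocator, Context, sketch_alloc and sketch_free are hypothetical stand-ins for backup::Mem_allocator, Backup_restore_ctx, bstream_alloc and its matching free routine, and the sketch models the hand-over in a single thread, leaving out whatever locking the real server uses. It only shows the shape of the idea: each context owns its allocator, and a static pointer both marks "an operation is running" and tells the allocation hook which context to use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for backup::Mem_allocator: counts live blocks.
class Allocator {
public:
  void* alloc(std::size_t n) { ++live_; return std::malloc(n); }
  void free(void* p) { if (p) { --live_; std::free(p); } }
  int live() const { return live_; }
private:
  int live_ = 0;
};

// Hypothetical stand-in for Backup_restore_ctx.
class Context {
public:
  // Replaces a static is_running flag: null means nothing is running.
  static Context* current_op;

  // Claims the "running" slot; returns false if another context holds it.
  bool begin() {
    if (current_op != nullptr) return false;
    current_op = this;
    mem_alloc_ = new Allocator();
    return true;
  }

  // Releases only this context's own state, so a rejected context
  // cannot tear down the allocator of the one that is still running.
  void end() {
    if (current_op == this) current_op = nullptr;
    delete mem_alloc_;
    mem_alloc_ = nullptr;
  }

  Allocator* mem_alloc() { return mem_alloc_; }
  ~Context() { end(); }

private:
  Allocator* mem_alloc_ = nullptr;  // non-static: one allocator per context
};

Context* Context::current_op = nullptr;

// Allocation hooks locate the allocator through the current context.
void* sketch_alloc(std::size_t n) {
  assert(Context::current_op && Context::current_op->mem_alloc());
  return Context::current_op->mem_alloc()->alloc(n);
}

void sketch_free(void* p) {
  assert(Context::current_op && Context::current_op->mem_alloc());
  Context::current_op->mem_alloc()->free(p);
}
```

In this model, a second Context whose begin() is rejected can call end() without affecting the first one: current_op still points at the first context, so a later sketch_alloc keeps working. With a static allocator pointer, the same end() call would have cleared state shared with the running operation, which is the failure described above.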
[1 Aug 2008 13:30]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/50809

2674 Oystein Grovlen 2008-08-01
Bug#36795 Concurrency issues when starting backups in parallel.

Race condition on Backup_restore_ctx::mem_alloc since it is static. The failing backup will set mem_alloc to null when terminating. The next time the running backup wants to allocate memory, an assert will fail.

Makes mem_alloc non-static. That way, concurrent backups will not interfere. This requires that it is possible for bstream_alloc to find the right Backup_restore_ctx to use. Fixes that by changing the static is_running flag to a static pointer to the Backup_restore_ctx of the currently running backup. If the pointer is null, it means that no backup is currently running.
[6 Aug 2008 11:28]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/50989

2676 Oystein Grovlen 2008-08-06
Bug#36795 Concurrency issues when starting backups in parallel.

Race condition on Backup_restore_ctx::mem_alloc since it is static. The failing backup will set mem_alloc to null when terminating. The next time the running backup wants to allocate memory, an assert will fail.

Makes mem_alloc non-static. That way, concurrent backups will not interfere. This requires that it is possible for bstream_alloc to find the right Backup_restore_ctx to use. Fixes that by changing the static is_running flag to a static pointer, current_op, to the Backup_restore_ctx of the currently running backup. If the pointer is null, it means that no backup is currently running.
[7 Aug 2008 15:45]
Chuck Bell
Patch approved on the following condition:

Please add:

SET DEBUG_SYNC= 'reset';

to the end of your backup_concurrent test. This command is required to place the debug sync facility in a stable state. Without it, any test that follows and uses DEBUG_SYNC could be skipped (not fail) -- see backup_ddl_blocker:

main.backup_concurrent [ pass ] 483
main.backup_ddl_blocker [ skipped ] Query 'SELECT ('$value' LIKE 'ON %' ) AS debug_sync' failed, required functionality not supported
[8 Aug 2008 9:39]
Jørgen Løland
Good to push
[8 Aug 2008 11:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/51184

2678 Oystein Grovlen 2008-08-08
Bug#36795 Concurrency issues when starting backups in parallel.

Race condition on Backup_restore_ctx::mem_alloc since it is static. The failing backup will set mem_alloc to null when terminating. The next time the running backup wants to allocate memory, an assert will fail.

Makes mem_alloc non-static. That way, concurrent backups will not interfere. This requires that it is possible for bstream_alloc to find the right Backup_restore_ctx to use. Fixes that by changing the static is_running flag to a static pointer, current_op, to the Backup_restore_ctx of the currently running backup. If the pointer is null, it means that no backup is currently running.
[8 Aug 2008 11:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/51185

2678 Oystein Grovlen 2008-08-08
Bug#36795 Concurrency issues when starting backups in parallel.

Race condition on Backup_restore_ctx::mem_alloc since it is static. The failing backup will set mem_alloc to null when terminating. The next time the running backup wants to allocate memory, an assert will fail.

Makes mem_alloc non-static. That way, concurrent backups will not interfere. This requires that it is possible for bstream_alloc to find the right Backup_restore_ctx to use. Fixes that by changing the static is_running flag to a static pointer, current_op, to the Backup_restore_ctx of the currently running backup. If the pointer is null, it means that no backup is currently running.
[8 Aug 2008 11:35]
Øystein Grøvlen
Pushed up to revision 2678.
[26 Aug 2008 12:33]
Øystein Grøvlen
Pushed into main for 6.0.7. Documentation input: Server crashed when starting a new Backup or Restore command while a Backup or Restore was ongoing.
[26 Aug 2008 20:04]
Paul DuBois
Noted in 6.0.7 changelog.
[13 Sep 2008 22:39]
Bugs System
Pushed into 6.0.6-alpha (revid:oystein.grovlen@sun.com-20080808111737-t4tpz8zwgnr3a79l) (version source revid:hakan@mysql.com-20080716105246-eg0utbybp122n2w9) (pib:3)