Bug #65518 | Test rpl.rpl_innodb_bug28430 fails on certain hardware | ||
---|---|---|---|
Submitted: | 5 Jun 2012 8:55 | Modified: | 30 Jan 2013 18:38 |
Reporter: | Andreas Mock | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Tests | Severity: | S3 (Non-critical) |
Version: | 5.5.25 | OS: | Linux (Scientific Linux release 6.2) |
Assigned to: | CPU Architecture: | Any | |
Tags: | test failure rpl_innodb_bug28430 RHEL |
[5 Jun 2012 8:55]
Andreas Mock
[5 Jun 2012 9:02]
Andreas Mock
machine information
Attachment: machine-info.txt (text/plain), 44.04 KiB.
[5 Jun 2012 9:43]
Andreas Mock
Output of test run
Attachment: mysql-test.out (application/octet-stream, text), 36.85 KiB.
[5 Jun 2012 16:54]
Sveta Smirnova
Thank you for the report. Backtrace shows issue is in fsync operation. Please check disk on problematic machine.
[5 Jun 2012 18:12]
Andreas Mock
Hi Sveta, I have two identical machines. I can reproduce test failure on both. On a third machine which has different hardware the test doesn't fail. For what kind of things concerning disks shall I watch out? /var/log/messages doesn't show up an error message. Hints welcome. Best regards Andreas
[5 Jun 2012 18:37]
Sveta Smirnova
Thank you for the feedback. I want you to ensure: 1. Disks have not damages. Any utility which your OS has can be used to check this. On Linux you can use fsck or any other utility your OS has. 2. If disks are OK please inform us about model of disks.
[5 Jun 2012 19:33]
Andreas Mock
* Booted in rescue mode * Made forced filesystem check on all filesystems * => no errors reported * hard drives are two Seagate ST3000DM001-9YN166 * disk setup: - partitions are linux raid - raid mirror of the installed two disks - LVM volume groups on top of raid mirror - logical volumes sourced by volume groups - file system ext4 How do you know from stack trace that fsync was the cause of timeout? Best regards Andreas
[5 Jun 2012 19:42]
Sveta Smirnova
Thank you for the feedback. Will try to find similar disks to test. Regarding your question. After last successful query there is a call of stored procedure which updates InnoDB table and information about thread which does writing job in the trace: ... Thread 14 (Thread 0x7f1d18c77700 (LWP 20456)): #0 0x0000003a3ee0ebad in fsync () from /lib64/libpthread.so.0 #1 0x000000000097023f in os_file_fsync (file=9) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/os/os0file.c:2058 #2 os_file_flush_func (file=9) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/os/os0file.c:2156 #3 0x0000000000922654 in pfs_os_file_flush_func (space_id=<value optimized out>) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/include/os0file.ic:381 #4 fil_flush (space_id=<value optimized out>) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/fil/fil0fil.c:4685 #5 0x000000000095c5d6 in log_write_up_to (lsn=<value optimized out>, wait=<value optimized out>, flush_to_disk=1) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/log/log0log.c:1514 #6 0x00000000008b5d02 in trx_prepare_off_kernel (trx=<value optimized out>) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/trx/trx0trx.c:2005 #7 0x00000000008b7993 in trx_prepare_for_mysql (trx=0x7f1d0400e598) at /opt/data/install/mysql/mysql-5.5.25/storage/innobase/trx/trx0trx.c:2041 ....
[5 Jun 2012 20:03]
Andreas Mock
How do you know that the cited thread dump is the thread causing the problem? Sorry, but I'm not used to read dumps with several threads. Which instance is causing the timeout? Locking code? Mutex code? From which instance is the sigabrt being sent?
[8 Jun 2012 12:40]
Andreas Mock
Can reproduce with 5.5.22 on same hardware. Third instance of same machine type.
[30 Jan 2013 18:38]
Sveta Smirnova
Thank you for the feedback. Actually we have such machines and this was reported as bug #46029. Setting as duplicate.