| Bug #75782 | Failing assertion: trx->read_only && trx->auto_commit && trx->isolation_level==1 | ||
|---|---|---|---|
| Submitted: | 5 Feb 2015 3:23 | Modified: | 20 May 2015 18:32 |
| Reporter: | Roel Van de Paar | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: InnoDB storage engine | Severity: | S6 (Debug Builds) |
| Version: | 5.7.5-m15 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[5 Feb 2015 3:23]
Roel Van de Paar
Stopping the while() loop a few times with CTRL+C and restarting may also aid reproduction.
[5 Feb 2015 3:31]
Roel Van de Paar
Tarball with all scripts required
Attachment: bug_75782.tar.gz (application/gzip, text), 93.78 KiB.
[9 Feb 2015 20:27]
Sveta Smirnova
Thank you for the report.
Verified as described.
To repeat report I had to modify paths in the scripts provided to reflect them my environment, change export LD_LIBRARY_PATH=$(cat ${SCRIPT_PWD}/1422600023_mybase)/lib to export LD_LIBRARY_PATH=$(cat ${SCRIPT_PWD}/1422600023_mybase)/lib:$LD_LIBRARY_PATH in 1422600023_run_pquery , so my custom LD_LIBRARY_PATH still works and re-compile pquery, taken from http://bazaar.launchpad.net/~percona-core/percona-qa/trunk/files
Also I had to run test for 3 hours before crash occurred.
[9 Feb 2015 20:30]
Sveta Smirnova
Posted by developer: I have different back trace: 2015-02-09 21:19:27 0x50c39940 InnoDB: Assertion failure in thread 1354996032 in file ha_innodb.cc line 3537 InnoDB: Failing assertion: !trx_is_started(trx) InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 20:19:27 UTC - mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. Attempting to collect some information that could help diagnose the problem. As this is a crash and something is definitely wrong, the information collection process might fail. key_buffer_size=8388608 read_buffer_size=131072 max_used_connections=152 max_threads=151 thread_count=152 connection_count=152 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68128 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x2aaab83c6000 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 50c39010 thread_stack 0x40000 /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(my_print_stacktrace+0x35)[0x13c15e0] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(handle_fatal_signal+0x405)[0xbc523c] /lib64/libpthread.so.0[0x2aca340aebe0] /lib64/libc.so.6(gsignal+0x35)[0x2aca3530b285] /lib64/libc.so.6(abort+0x110)[0x2aca3530cd30] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld[0x15df5f7] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld[0x13e8fc5] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld[0xc25f4e] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_PP13st_plugin_intPvEijS4_+0x333)[0x10e9fc5] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z28ha_start_consistent_snapshotP3THD+0x34)[0xc25f90] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z11trans_beginP3THDj+0x3d0)[0x11b1383] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z21mysql_execute_commandP3THD+0x4da3)[0x10be064] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x54a)[0x10c16db] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcm+0xfd8)[0x10b7045] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(_Z10do_commandP3THD+0x362)[0x10b5b57] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(handle_connection+0x1b9)[0x11db03c] /home/ssmirnov/blade12/build/mysql-trunk/bin/mysqld(pfs_spawn_thread+0x15a)[0x175019c] /lib64/libpthread.so.0[0x2aca340a677d] /lib64/libc.so.6(clone+0x6d)[0x2aca353aec1d] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (2aaab83cf020): START TRANSACTION WITH CONSISTENT SNAPSHOT Connection ID (thread ID): 83905 Status: NOT_KILLED The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. Writing a core file
[19 Feb 2015 7:11]
Naga Satyanarayana Bodapati
Posted by developer: From the bug reporter stack trace, it looks like 1. attachable trx is started 2. trx->is_dd_trx is set 3. trx returns to pool without clearing the is_dd_trx flag <--- Bug probably here 4. Some other thread(dict stats thread in this case) requesting trx, got the trx which is earlier used for attachable trx 5. Assertion raised because of left over flag trx->is_dd_trx
[19 Feb 2015 7:32]
Naga Satyanarayana Bodapati
Posted by developer:
I was wrong, We do have a assertion to make sure trx->is_dd_trx is set to false before returning the trx to pool.
in
trx_free(trx_t*& trx)
{
...
ut_ad(trx->is_dd_trx == false);
...
}
So there is some mystery around this
[23 Feb 2015 9:46]
Naga Satyanarayana Bodapati
On Ctrl-C pquery is crashing frequently :) (gdb) bt #0 0x00007f5b490d4cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f5b490d80d8 in __GI_abort () at abort.c:89 #2 0x00007f5b49111f24 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7f5b492206c8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175 #3 0x00007f5b4911e1fe in malloc_printerr (ptr=<optimized out>, str=0x7f5b492207b0 "double free or corruption (!prev)", action=1) at malloc.c:4996 #4 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840 #5 0x00007f5b49cd6a3b in my_once_free () from /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18 #6 0x00007f5b49cd5ebd in my_end () from /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18 #7 0x00007f5b49cb1f07 in mysql_server_end () from /usr/lib/x86_64-linux-gnu/libmysqlclient.so.18 #8 0x0000000000403674 in executor (number=261, qlist=...) at pquery.cpp:83 #9 0x00007f5b49a31bf0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #10 0x00007f5b48e88182 in start_thread (arg=0x7f5a4672c700) at pthread_create.c:312 #11 0x00007f5b4919900d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
[24 Feb 2015 12:42]
Naga Satyanarayana Bodapati
Posted by developer: Am not able to reproduce :(
[24 Feb 2015 12:45]
Naga Satyanarayana Bodapati
I have few questions to the bug reporter: 1) Is it possible to compile pquery with gcc-4.4? 2) Is it possible to fix the pquery crash(not important, it may help) 3) Is it possible to get minimum sequence of queries in .sql to reproduce the crash? (similar to RQG simplifier)
[24 Feb 2015 12:46]
Naga Satyanarayana Bodapati
Setting the status to Need Feedback to get attention
[24 Feb 2015 18:19]
Sveta Smirnova
Naga, I waited 3 hours for crash to occur. Regarding to q.1 - yes, it is possible: just download sources from Launchpad repo, modify paths and compile the tool. Building is not automated, but it is possible. Ping me if you need help.
[25 Feb 2015 5:41]
Naga Satyanarayana Bodapati
Sveta, gcc-4.4 doesn't support the option -std=gnu++11 I've tried on my laptop and pquery completes its run successfully. At the end it prints summary Also, the assert you have got is different from the assert reported by bug reporter.
[25 Mar 2015 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[27 Mar 2015 5:46]
Naga Satyanarayana Bodapati
Using the sql provided and pumping that SQL using multiple threads, we are able to reproduce the bug.
[27 Mar 2015 6:13]
Roel Van de Paar
Very nice Naga! Very appreciated too. We are in the process of building 3 new pquery binaries (oracle/maria/percona), so there is no more lib issues (and hopefully no more of the issues you saw). More on this on the Percona blog when it's released.
[1 Apr 2015 0:21]
Roel Van de Paar
Small update, not directly related to bug, pquery binaries are as good as ready, and we also resolved a memory bug, which may have been the one you were seeing. So hopefully reproduction issues like the one seen in this bug are a thing of the past.
[20 May 2015 18:32]
Daniel Price
Posted by developer: Fixed as of the upcoming 5.7.8, 5.8.0 releases, and here's the changelog entry: Attempting to kill statements that use attachable transactions caused subsequent statements to function improperly, resulting in assertion failures. A "max_statement_time" timeout on such statements could produce a similar outcome. Thanks for the bug report.

Description: 2015-02-05 14:07:57 0x7f76274ac700 InnoDB: Assertion failure in thread 140145442080512 in file trx0trx.cc line 1270 InnoDB: Failing assertion: trx->read_only && trx->auto_commit && trx->isolation_level == 1 [...] (gdb) bt #0 0x00007fe20efbd771 in pthread_kill () from /lib64/libpthread.so.0 #1 0x0000000000e15d58 in my_write_core (sig=6) at /bzr/mysql-5.7-percona-patches_dbg/mysys/stacktrace.c:247 #2 0x0000000000822044 in handle_fatal_signal (sig=6) at /bzr/mysql-5.7-percona-patches_dbg/sql/signal_handler.cc:219 #3 <signal handler called> #4 0x00007fe20dbc15c9 in raise () from /lib64/libc.so.6 #5 0x00007fe20dbc2cd8 in abort () from /lib64/libc.so.6 #6 0x00000000010d4ccf in ut_dbg_assertion_failed (expr=0x14e8448 "trx->read_only && trx->auto_commit && trx->isolation_level == 1", file=0x14e7870 "/bzr/mysql-5.7-percona-patches_dbg/storage/innobase/trx/trx0trx.cc", line=1270) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/ut/ut0dbg.cc:68 #7 0x00000000010c1818 in trx_start_low (trx=0x7fe1e9c00c90, read_write=true) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/trx/trx0trx.cc:1269 #8 0x00000000010c55e0 in trx_start_internal_low (trx=0x7fe1e9c00c90) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/trx/trx0trx.cc:3012 #9 0x0000000001197483 in dict_stats_fetch_from_ps (table=0x7fe1d711d598) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/dict/dict0stats.cc:2919 #10 0x0000000001197cb1 in dict_stats_update (table=0x7fe1ef3b0c98, stats_upd_option=DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/dict/dict0stats.cc:3201 #11 0x0000000000f084ae in dict_stats_init (table=0x7fe1ef3b0c98) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/include/dict0stats.ic:173 #12 0x0000000000f0e859 in ha_innobase::open (this=0x7fe1d5ede420, name=0x7fe1d72c9390 "./test/t3", mode=2, test_if_locked=2) at /bzr/mysql-5.7-percona-patches_dbg/storage/innobase/handler/ha_innodb.cc:5098 #13 0x0000000000880baf in handler::ha_open (this=0x7fe1d5ede420, table_arg=0x7fe1d721ec10, name=0x7fe1d72c9390 "./test/t3", mode=2, test_if_locked=2) at /bzr/mysql-5.7-percona-patches_dbg/sql/handler.cc:2471 #14 0x0000000000c179e0 in open_table_from_share (thd=0x7fe1d701c000, share=0x7fe1d72c9020, alias=0x7fe1d7028ae8 "t3", db_stat=39, prgflag=44, ha_open_flags=0, outparam=0x7fe1d721ec10, is_create_table=false) at /bzr/mysql-5.7-percona-patches_dbg/sql/table.cc:2378 #15 0x0000000000acafb7 in open_table (thd=0x7fe1d701c000, table_list=0x7fe1d734e020, ot_ctx=0x7fe20f572a40) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_base.cc:3417 #16 0x0000000000acd770 in open_and_process_table (thd=0x7fe1d701c000, lex=0x7fe1d701e370, tables=0x7fe1d734e020, counter=0x7fe1d701e420, flags=0, prelocking_strategy=0x7fe20f572b40, has_prelocking_list=false, ot_ctx=0x7fe20f572a40) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_base.cc:4903 #17 0x0000000000ace4fe in open_tables (thd=0x7fe1d701c000, start=0x7fe20f572b00, counter=0x7fe1d701e420, flags=0, prelocking_strategy=0x7fe20f572b40) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_base.cc:5357 #18 0x0000000000acf745 in open_normal_and_derived_tables (thd=0x7fe1d701c000, tables=0x7fe1d7028558, flags=0) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_base.cc:6085 #19 0x0000000000b4ecca in execute_sqlcom_select (thd=0x7fe1d701c000, all_tables=0x7fe1d7028558) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_parse.cc:5004 #20 0x0000000000b47775 in mysql_execute_command (thd=0x7fe1d701c000) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_parse.cc:2488 #21 0x0000000000b4fcc1 in mysql_parse (thd=0x7fe1d701c000, parser_state=0x7fe20f573e60) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_parse.cc:5427 #22 0x0000000000b44814 in dispatch_command (command=COM_QUERY, thd=0x7fe1d701c000, packet=0x7fe1d7033011 "SELECT t1.c1,t3.c1 FROM t1 NATURAL RIGHT JOIN t3 WHERE t1.c1 <=> 5;", packet_length=68) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_parse.cc:1250 #23 0x0000000000b4354d in do_command (thd=0x7fe1d701c000) at /bzr/mysql-5.7-percona-patches_dbg/sql/sql_parse.cc:834 #24 0x0000000000c4d49e in handle_connection (arg=0x7fe1f0fffee0) at /bzr/mysql-5.7-percona-patches_dbg/sql/conn_handler/connection_handler_per_thread.cc:298 #25 0x0000000000e39538 in pfs_spawn_thread (arg=0x7fe1f17f70d0) at /bzr/mysql-5.7-percona-patches_dbg/storage/perfschema/pfs.cc:2137 #26 0x00007fe20efb8df3 in start_thread () from /lib64/libpthread.so.0 #27 0x00007fe20dc821ad in clone () from /lib64/libc.so.6 How to repeat: This bug is extremely hard to reproduce. The attached scripts will make it semi-easy, but expect to see some hacky methods employed (you'll see lot of unrelated errors both in the error log and in pquery output...) $ vi {epoch}_mybase # Update base path in this file (the only change required!) $ ./{epoch}_init # Initializes the data dir $ ./{epoch}_start # Starts mysqld $ ./{epoch}_cl # To check mysqld is up $ while(true); do ./1422600023_run_pquery; done # This runs the testcase in a loop. Expect to see lots of unrelated errors (Error 1040: Too many connections/double free or corruption/trying to destroy uninialized mutex/invalid next size/segmentation faults/Aborts/... and so on). Ignore them all. ** Watch/wait ** for the occurence of "Error 2002: Can't connect to local MySQL server through socket '/dev/shm/1422600023/socket.sock' (111)" - maybe quickly, maybe after a while - it will scroll very fast over the screen (much faster then other original errors): =================== Error 2002: Can't connect to local MySQL server through socket '/dev/shm/1422600023/socket.sock' (111) * PQUERY: Unable to continue [2], exiting Executing testcase ./1422600023.sql against mysqld with socket /dev/shm/1422600023/socket.sock using pquery... Infile is ./1422600023.sql Database is test Starting with 300 threads Query limit per thread is 100000 User is root Socket is /dev/shm/1422600023/socket.sock Error 2002: Can't connect to local MySQL server through socket '/dev/shm/1422600023/socket.sock' (111) =================== - This is your sign that the mysqld server is gone/has crashed. Press CTRL+C and review error log/gdb to confirm. $ vi /dev/shm/{epoch}/error.log.out # Verify the error log $ ./{epoch}_gdb # Brings you to a gdb prompt attached to correct mysqld & generated core $ ./{epoch}_parse_core # Create {epoch}_STD.gdb and {epoch}_FULL.gdb; standard and full var gdb stack traces etc.