Bug #75199 | MySQL crashed because of flush_all | ||
---|---|---|---|
Submitted: | 13 Dec 2014 7:15 | Modified: | 6 Aug 2015 17:08 |
Reporter: | zhai weixiang (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Memcached | Severity: | S3 (Non-critical) |
Version: | 5.7.5, 5.7.7 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | memcached |
[13 Dec 2014 7:15]
zhai weixiang
[22 Dec 2014 6:22]
zhai weixiang
My configure file: [mysqld_safe] pid-file=/u01/my575/run/mysqld.pid #malloc-lib=/u01/mysql/lib/libjemalloc.so [mysql] port=13406 prompt=\\u@\\d \\r:\\m:\\s> default-character-set=utf8 no-auto-rehash skip-insecure-warning [client] port=13406 socket=/u01/my575/run/mysql.sock [mysqld] #dir basedir=/u01/my575 datadir=/u01/my575/data tmpdir=/u01/my575/tmp lc_messages_dir=/u01/my575/share log-error=/u01/my575/log/alert.log slow_query_log_file=/u01/my575/log/slow.log general_log_file=/u01/my575/log/general.log socket=/u01/my575/run/mysql.sock core-file innodb_doublewrite=1 innodb_stats_persistent=0 innodb_data_home_dir=/u01/my575/data innodb_log_group_home_dir=/u01/my575/data innodb_data_file_path = ibdata1:4G;ibdata2:16M:autoextend innodb_buffer_pool_size=80G innodb_buffer_pool_instances=8 innodb_log_files_in_group=4 innodb_log_file_size=1G innodb_log_buffer_size=80M innodb_flush_log_at_trx_commit=2 innodb_max_dirty_pages_pct_lwm=10 innodb_max_dirty_pages_pct=60 innodb_thread_concurrency=32 innodb_read_io_threads=8 innodb_write_io_threads=8 innodb_open_files=60000 innodb_file_format=Barracuda innodb_file_per_table=1 innodb_flush_method=O_DIRECT innodb_change_buffering=all innodb_adaptive_flushing=ON innodb_flush_neighbors=0 innodb_adaptive_hash_index=OFF innodb_old_blocks_time=1000 innodb_stats_on_metadata=ON innodb_read_ahead_threshold=0 innodb_use_native_aio=1 innodb_lock_wait_timeout=5 innodb_rollback_on_timeout=0 innodb_purge_threads=1 innodb_strict_mode=1 transaction-isolation=READ-COMMITTED innodb_disable_sort_file_cache=ON innodb_io_capacity_max=6000 innodb_io_capacity=2000 innodb_lru_scan_depth=8194 innodb_undo_tablespaces=20 metadata_locks_hash_instances=256 metadata_locks_cache_size=32768 innodb_sync_array_size=16 innodb_print_all_deadlocks innodb_checksum_algorithm=CRC32 #myisam myisam_sort_buffer_size=64M concurrent_insert=2 delayed_insert_timeout=300 #binlog log-bin=/u01/my575/log/mysql-bin server_id=3017441834 binlog_cache_size=32K max_binlog_cache_size=2G max_binlog_size=500M binlog-format=ROW sync_binlog=1000 log-slave-updates=1 expire_logs_days=0 port=13406 skip-name-resolve skip-ssl max_connections=21000 max_user_connections=20200 max_connect_errors=65536 max_allowed_packet=128M connect_timeout=8 net_read_timeout=30 net_write_timeout=60 back_log=1024 default-storage-engine=INNODB character-set-server=utf8 lower_case_table_names=1 skip-external-locking open_files_limit=65536 safe-user-create local-infile=1 sql_mode="STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE" performance_schema=ON performance_schema_instrument='%=ON' log_slow_admin_statements=1 long_query_time=1 slow_query_log=1 general_log=0 log_error_verbosity=3 query_cache_type=0 query_cache_limit=1M query_cache_min_res_unit=1K table_definition_cache=2000 table_open_cache_instances = 128 table_open_cache=16000 thread_stack=512K thread_cache_size=256 read_rnd_buffer_size=128K sort_buffer_size=256K join_buffer_size=128K read_buffer_size=128K #replication master-info-file=/u01/my575/log/master.info relay-log=/u01/my575/log/relaylog relay_log_info_file=/u01/my575/log/relay-log.info relay-log-index=/u01/my575/log/mysqld-relay-bin.index slave_load_tmpdir=/u01/my575/tmp slave_type_conversions="ALL_NON_LOSSY" slave_net_timeout=4 skip-slave-start sync_master_info=1000 sync_relay_log_info=1000 rpl_stop_slave_timeout=300 slave_checkpoint_group=1024 slave_checkpoint_period=300 slave_parallel_workers=8 #slave_pr_mode=TABLE slave_pending_jobs_size_max=1073741824 slave_rows_search_algorithms='TABLE_SCAN,INDEX_SCAN,HASH_SCAN' slave_sql_verify_checksum=OFF master_verify_checksum=OFF gtid_mode=OFF enforce_gtid_consistency=OFF master_info_repository=TABLE relay_log_info_repository=TABLE #memcached plugin daemon_memcached_option='-p13407' daemon_memcached_enable_binlog=1 innodb_api_enable_binlog=1 innodb_api_enable_mdl=1 daemon_memcached_r_batch_size=5 daemon_memcached_w_batch_size=5
[22 Dec 2014 6:26]
zhai weixiang
Please ignore skip-insecure-warning :)
[20 Jan 2015 8:14]
MySQL Verification Team
Hello Zhai Weixiang , Thank you for the report. I could not reproduce this issue at my end with provided conf file, and with test case on latest trunk build(5.7.6) . // // mysql cli session mysql> select * from demo_test; +-------+------+------+------+------+ | c1 | c2 | c3 | c4 | c5 | +-------+------+------+------+------+ | test1 | t1 | 10 | 3 | 0 | +-------+------+------+------+------+ 1 row in set (0.00 sec) mysql> delete from demo_test; Query OK, 1 row affected (0.01 sec) [root@cluster-repo ~]# telnet x.x.x.x 13407 Trying x.x.x.x... Connected to x.x.x.x. Escape character is '^]'. set test1 10 0 2 t1 STORED flush_all OK set test1 10 0 2 t1 STORED flush_all OK flush_all OK flush_all OK flush_all OK Could you please tell us about your build whether it is official build from http://dev.mysql.com/downloads/mysql/ or you compiled yourself? if yes, then exact cmake build details to help us repeat this issue? Thanks, Umesh
[20 Jan 2015 8:16]
MySQL Verification Team
Also, you mentioned default settings but still could you please provide below query output? Just want to ensure whether data written through the memcached on disk (innodb_only, the default); to store the data in memory only, as in the traditional memcached (cache-only); or both (caching). select * from innodb_memcache.cache_policies;
[20 Jan 2015 9:38]
zhai weixiang
root@innodb_memcache 05:37:38>select * from cache_policies; +--------------+-------------+-------------+---------------+--------------+ | policy_name | get_policy | set_policy | delete_policy | flush_policy | +--------------+-------------+-------------+---------------+--------------+ | cache_policy | innodb_only | innodb_only | innodb_only | innodb_only | +--------------+-------------+-------------+---------------+--------------+ 1 row in set (0.00 sec) root@innodb_memcache 05:37:44>select * from config_options; +---------------------+-------+ | name | value | +---------------------+-------+ | separator | | | | table_map_delimiter | . | +---------------------+-------+ 2 rows in set (0.00 sec) root@innodb_memcache 05:37:49>select * from containers; +------+-----------+-----------+-------------+---------------+-------+------------+--------------------+------------------------+ | name | db_schema | db_table | key_columns | value_columns | flags | cas_column | expire_time_column | unique_idx_name_on_key | +------+-----------+-----------+-------------+---------------+-------+------------+--------------------+------------------------+ | aaa | test | demo_test | c1 | c2 | c3 | c4 | c5 | PRIMARY | +------+-----------+-----------+-------------+---------------+-------+------------+--------------------+------------------------+ 1 row in set (0.00 sec) cmake . \ -DSYSCONFDIR="$dest_dir" \ -DCMAKE_INSTALL_PREFIX="$dest_dir" \ -DMYSQL_DATADIR="$dest_dir/data" \ -DENABLED_PROFILING=1 \ -DWITH_EXTRA_CHARSETS=all \ -DDEFAULT_CHARSET=utf8 \ -DDEFAULT_COLLATION=utf8_general_ci \ -DWITH_SSL=bundled \ -DWITH_ZLIB=bundled \ -DWITH_PARTITION_STORAGE_ENGINE=1 \ -DWITH_INNOBASE_STORAGE_ENGINE=1 \ -DWITH_ARCHIVE_STORAGE_ENGINE=1 \ -DWITH_BLACKHOLE_STORAGE_ENGINE=1 \ -DWITH_PERFSCHEMA_STORAGE_ENGINE=1 \ -DENABLED_LOCAL_INFILE=1 \ -DWITH_EMBEDDED_SERVER=0 \ -DINSTALL_LAYOUT=STANDALONE \ -DCOMMUNITY_BUILD=1 -DWITH_INNODB_MEMCACHED=ON \ -DMYSQL_SERVER_SUFFIX="$server_suffix" -DWITH_BOOST=../
[20 Jan 2015 9:39]
zhai weixiang
I can still repeat the crash.. Let me know if you need more information :)
[20 Jan 2015 12:44]
MySQL Verification Team
Thank you for the details, Zhai. I tried it but with no luck(will share my steps just in case am missing anything).. so, will check internally and shall get back to you if more info required.
[20 Jan 2015 12:44]
MySQL Verification Team
test results
Attachment: 75199.txt (text/plain), 244.46 KiB.
[20 Jan 2015 15:24]
zhai weixiang
Funny ! I can't repeat the crash if debug is enabled!!
[20 Jan 2015 18:52]
MySQL Verification Team
it appears you cannot run "flush_all" concurrently with any selects, else you end up with missing table and crash: InnoDB: Error: trying to access tablespace 3891 page no. 3, InnoDB: but the tablespace does not exist or is just being dropped. InnoDB: Error: trying to access tablespace 3891 page no. 3, InnoDB: but the tablespace does not exist or is just being dropped. InnoDB: Error: Unable to read tablespace 3891 page no 3 into the buffer pool after 100 attempts InnoDB: Assertion failure in thread 140736594200320 in file buf0buf.cc line 2646
[21 Jan 2015 1:55]
zhai weixiang
Aha, Shane, It seems you found another bug !! :-)
[9 Apr 2015 14:07]
MySQL Verification Team
Thank you Shane, Zhai. I could not repeat this with 5.7.7. Also, surely fixed after Bug #72435 and per change log: When calling the "memcached" "flush_all" command, "InnoDB" attempts to initialize a connection and a transaction. If the transaction is in "TRX_STATE_NOT_STARTED" state, "InnoDB" would fail to set "CONN_DATA->CRSR_TRX" to NULL, resulting in a serious error. If you can provide more information, feel free to add it to this bug and change the status back to 'Open'. Thanks, Umesh
[16 Apr 2015 8:09]
zhai weixiang
Hi, Umesh, I guess I found the root cause. It seems assert is not defined on my platform. I think this is depending on the behavior of compiling order. I removed the assert and eveything works well. diff --git a/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c b/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c index 46fce0c..30c2490 100644 --- a/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c +++ b/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c @@ -810,9 +810,9 @@ have_conn: } else { /* Write cursor transaction exists. Reuse this transaction.*/ - assert(ib_cb_trx_start(conn_data->crsr_trx, + ib_cb_trx_start(conn_data->crsr_trx, engine->trx_level, - true, false, NULL)); + true, false, NULL); } err = innodb_api_begin(
[16 Apr 2015 8:10]
zhai weixiang
oh, oh ...only two option: Can't repeat and Closed ..... How can I reopen this bug ?
[16 Apr 2015 8:19]
MySQL Verification Team
Hello Zhai, I think you can change status to "open" from the status select box. If I understood correctly, this is no longer repeatable now after removing assert. Please feel free to add info which would help us trigger this issue in newer builds. or you agree to close the bug for now? Please let us know. Thanks, Umesh
[16 Apr 2015 10:16]
zhai weixiang
I have run the case on two production machine and both crashed. But since you can't repeat the crash , I believe this is platform specified bug. The root cause is that assert is not defined.so the function ib_cb_trx_start inside assert is not called which should start the trx and set trx->state to active. Quoted from the manual page : http://man7.org/linux/man-pages/man3/assert.3.html If the macro NDEBUG was defined at the moment <assert.h> was last included, the macro assert() generates no code, and hence does nothing at all. Otherwise, the macro assert() prints an error message to standard error and terminates the program by calling abort(3) if expression is false (i.e., compares equal to zero). The purpose of this macro is to help programmers find bugs in their programs. The message "assertion failed in file foo.c, function do_bar(), line 1287" is of no help at all to a user. Besides, generally we should call the function first and then assert the return value. :) To help you repeat the case, a little more information: $uname -a Linux $HOST 2.6.32-220.23.2.ali878.el6.x86_64 #1 SMP Mon Jan 28 17:12:52 CST 2013 x86_64 x86_64 x86_64 GNU/Linux Compile flagļ¼ COMMON_FLAGS="-O3 -g -fexceptions -fno-omit-frame-pointer -fno-strict-aliasing" CFLAGS="$COMMON_FLAGS" CXXFLAGS="$COMMON_FLAGS" CC=/opt/rh/devtoolset-2/root/usr/bin/gcc CXX=/opt/rh/devtoolset-2/root/usr/bin/g++ export CC CFLAGS CXX CXXFLAGS gcc/g++ version opt/rh/devtoolset-2/root/usr/bin/gcc --version gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15) $/opt/rh/devtoolset-2/root/usr/bin/g++ --version g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
[11 Jun 2015 13:25]
MySQL Verification Team
Thank you Zhai. Verified as described. Thanks, Umesh
[11 Jun 2015 13:36]
MySQL Verification Team
Confirmed 5.7.7 build is affected
[11 Jun 2015 13:37]
MySQL Verification Team
test results
Attachment: 75199.txt (text/plain), 291.29 KiB.
[15 Jul 2015 9:29]
ADITYA ANANTHAPADMANABHA
Hi, The crash reproduced by umesh is a different bug ,caused by a bogus assertion which is fixed as part of Bug#75790. So the issue reported still not reproducible.
[17 Jul 2015 1:59]
zhai weixiang
Hi, Aditya, I can always repeat the crash .... OK, Let's put aside the test case. But according to the manual, the behavior of assert function is not fixed, I think calling a important function that may affect the code logic inside assert() is not a correct code style.
[17 Jul 2015 4:25]
ADITYA ANANTHAPADMANABHA
I completely agree with you. Will make a patch to remove the function from assert and assert on return value .
[6 Aug 2015 17:08]
Daniel Price
Posted by developer: Fixed as of the upcoming 5.6.27, 5.7.9, 5.8.0 releases, and here's the changelog entry: A memcached flush_all command raised an assertion. A function that starts a transaction was called from within assertion code. Thank you for the bug report.