Bug #97593 mysql 5.6 crashes at fill_trx_row
Submitted: 12 Nov 2019 9:19    Modified: 16 Dec 2019 13:54
Reporter: jeff.ding jeff.ding
Status: Can't repeat           Impact on me: None
Category: MySQL Server         Severity: S2 (Serious)
Version: 5.6                   OS: CentOS
Assigned to:                   CPU Architecture: x86

[12 Nov 2019 9:19] jeff.ding jeff.ding
Description:
MySQL 5.6 occasionally crashes in fill_trx_row when querying transaction state from information_schema. The query is like:

"SELECT trx_state, count(*) as count FROM information_schema.innodb_trx WHERE trx_started < date_sub(now(), interval 1 second) GROUP BY trx_state"

How to repeat:
Query transaction state from information_schema with a query like:

"SELECT trx_state, count(*) as count FROM information_schema.innodb_trx WHERE trx_started < date_sub(now(), interval 1 second) GROUP BY trx_state"

This will crash the MySQL server occasionally.

Suggested fix:
The function 'fetch_data_into_cache_low' should call trx_mutex_enter(trx) before 'add_trx_relevant_locks_to_cache'. I found that this issue is already fixed in MySQL 8.0.
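A paraphrased sketch of the proposed ordering inside fetch_data_into_cache_low (modeled on the 8.0 code as I read it; not a verbatim excerpt from either tree, and the surrounding per-transaction loop is elided):

    trx_mutex_enter(trx);  /* take trx->mutex first, as 8.0 does */

    /* with the mutex held, the lock rows and the trx fields read
       below come from one consistent snapshot */
    if (!add_trx_relevant_locks_to_cache(cache, trx, &requested_lock_row)) {
        cache->is_truncated = TRUE;
        trx_mutex_exit(trx);
        return;
    }

    /* ... allocate the I_S row, then ... */
    fill_trx_row(row, trx, requested_lock_row, cache);

    trx_mutex_exit(trx);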
[12 Nov 2019 12:44] MySQL Verification Team
Hi Mr. jeff.ding,

Thank you for your bug report.

However, we need additional info from you. First, I have run your query 10 (ten) times and it never crashed our server running the latest 5.6.

Hence, the questions:

* Are you using the latest available 5.6? If not, please try that one first of all.

* What are the exact conditions under which you get the crash? We need a fully repeatable test case in order to verify this report. That means a test case that would ALWAYS crash a server.
[13 Nov 2019 1:42] jeff.ding jeff.ding
Hi, Sinisa Milivojevic,

The crash happened only once, but I captured the core stack:

#0  0x00007fe395bc79b1 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000006aff8e in handle_fatal_signal (sig=6)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/signal_handler.cc:321
#2  <signal handler called>
#3  0x00007fe3949d91f7 in raise () from /lib64/libc.so.6
#4  0x00007fe3949da8e8 in abort () from /lib64/libc.so.6
#5  0x0000000000af9af3 in fill_trx_row (row=0x7fc9645f8af8, trx=trx@entry=0x7fc24f7fc168, requested_lock_row=requested_lock_row@entry=0x0,
    cache=cache@entry=0x156ac00 <trx_i_s_cache_static>)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/storage/innobase/trx/trx0i_s.cc:488
#6  0x0000000000afa78b in fetch_data_into_cache_low (cache=cache@entry=0x156ac00 <trx_i_s_cache_static>, trx_list=<optimized out>,
    read_write=true) at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/storage/innobase/trx/trx0i_s.cc:1331
#7  0x0000000000afae30 in fetch_data_into_cache (cache=0x156ac00 <trx_i_s_cache_static>)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/storage/innobase/trx/trx0i_s.cc:1357
#8  trx_i_s_possibly_fetch_data_into_cache (cache=cache@entry=0x156ac00 <trx_i_s_cache_static>)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/storage/innobase/trx/trx0i_s.cc:1386
#9  0x0000000000a16eb5 in trx_i_s_common_fill_table (thd=thd@entry=0x7fc5023f0000, tables=tables@entry=0x7fc78656d6a0)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/storage/innobase/handler/i_s.cc:1340
#10 0x0000000000795052 in do_fill_table (join_table=0x7fc4e86120b8, table_list=0x7fc78656d6a0, thd=0x7fc5023f0000)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_show.cc:8012
#11 get_schema_tables_result (join=join@entry=0x7fc3a7dea1d0, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_show.cc:8113
#12 0x0000000000773745 in JOIN::prepare_result (this=this@entry=0x7fc3a7dea1d0, columns_list=columns_list@entry=0x7fc87c5e5370)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_select.cc:824
#13 0x0000000000719f87 in JOIN::exec (this=0x7fc3a7dea1d0)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_executor.cc:116
#14 0x000000000077440d in mysql_execute_select (free_join=true, select_lex=0x7fc5023f2c80, thd=0x7fc5023f0000)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_select.cc:1114
#15 mysql_select (thd=thd@entry=0x7fc5023f0000, tables=0x7fc78656d6a0, wild_num=0, fields=..., conds=0x7fc78656e058,
    order=order@entry=0x7fc5023f2e50, group=group@entry=0x7fc5023f2d88, having=0x0, select_options=2684619520,
    result=result@entry=0x7fc3a7dea1a8, unit=unit@entry=0x7fc5023f2620, select_lex=select_lex@entry=0x7fc5023f2c80)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_select.cc:1235
#16 0x0000000000774665 in handle_select (thd=thd@entry=0x7fc5023f0000, result=result@entry=0x7fc3a7dea1a8,
    setup_tables_done_option=setup_tables_done_option@entry=0)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_select.cc:111
#17 0x0000000000597518 in execute_sqlcom_select (thd=thd@entry=0x7fc5023f0000, all_tables=<optimized out>)
    at /home/admin/35_20191015101601795_107527411_code/rpm_workspace/sql/sql_parse.cc:6188

At the time of this crash, trx->lock.wait_lock (0x7fc8deb583c8) was not NULL while requested_lock_row was 0x0 (which means trx->lock.que_state was TRX_QUE_RUNNING). So I checked and found that trx->lock.wait_lock and trx->lock.que_state are modified under trx->mutex (please see the function lock_cancel_waiting_and_release). When we query transaction state from information_schema, fetch_data_into_cache_low accesses these two variables without acquiring trx->mutex, which I think is what makes the server crash.
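For context, the abort at trx0i_s.cc:488 comes from a consistency check in fill_trx_row of roughly this shape (paraphrased from the 5.6 tree, not a verbatim excerpt):

    /* requested_lock_row was derived earlier from trx->lock.que_state,
       without holding trx->mutex; trx->lock.wait_lock is re-read here.
       If another thread changed the lock state in between, the two
       disagree and ut_a() aborts the server. */
    if (trx->lock.wait_lock != NULL) {
        ut_a(requested_lock_row != NULL);  /* this is the check that fired */
        row->trx_wait_started = (ib_time_t) trx->lock.wait_started;
    } else {
        ut_a(requested_lock_row == NULL);
        row->trx_wait_started = 0;
    }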

I checked fetch_data_into_cache_low in MySQL 8.0: there it calls trx_mutex_enter(trx) before add_trx_relevant_locks_to_cache. Please check this :)
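To make the window concrete, here is a minimal, self-contained C++ analogue of the race (a toy model, not MySQL code; the names wait_lock, que_state and the mutex only mirror the trx_t fields discussed above). The writer flips both fields together under the mutex, the way the lock subsystem does; a reader that skips the mutex can observe wait_lock set while que_state still says RUNNING, exactly the inconsistent pair that trips the assertion. With the lock_guard enabled in the reader, as in the 8.0 fix, the assert can never fire:

    #include <atomic>
    #include <cassert>
    #include <mutex>
    #include <thread>

    enum QueState { TRX_QUE_RUNNING, TRX_QUE_LOCK_WAIT };

    struct Trx {                        // toy stand-in for trx_t
        std::mutex mtx;                 // stand-in for trx->mutex
        void*      wait_lock = nullptr; // stand-in for trx->lock.wait_lock
        QueState   que_state = TRX_QUE_RUNNING;
    };

    int main() {
        Trx               trx;
        int               dummy_lock;
        std::atomic<bool> stop{false};

        // Writer: always updates both fields under trx.mtx, the way
        // lock_cancel_waiting_and_release and friends do.
        std::thread writer([&] {
            for (int i = 0; i < 100000; ++i) {
                std::lock_guard<std::mutex> g(trx.mtx);
                trx.wait_lock = &dummy_lock;
                trx.que_state = TRX_QUE_LOCK_WAIT;
                trx.wait_lock = nullptr;
                trx.que_state = TRX_QUE_RUNNING;
            }
            stop = true;
        });

        // Reader: the I_S code path reads que_state first, wait_lock
        // later. Comment out the lock_guard to get the 5.6 behaviour,
        // where the assert below can fire.
        while (!stop) {
            std::lock_guard<std::mutex> g(trx.mtx); // the 8.0-style fix
            bool waiting = (trx.que_state == TRX_QUE_LOCK_WAIT);
            if (trx.wait_lock != nullptr) {
                assert(waiting); // analogue of ut_a(requested_lock_row != NULL)
            }
        }

        writer.join();
        return 0;
    }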
[13 Nov 2019 12:48] MySQL Verification Team
Hi Mr. ding,

Please re-read and answer the questions that I asked in my previous comment.
[13 Nov 2019 12:49] MySQL Verification Team
We simply have to make sure that you are using the latest release, and if you are, we must be able to repeat the behaviour.
[14 Dec 2019 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".