Bug #74775 buf_block_align relies on random timeouts, volatile rather than barriers
Submitted: 11 Nov 2014 4:42 Modified: 13 Dec 2018 3:07
Reporter: Stewart Smith Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.7.7 OS:Linux
Assigned to: Bogdan Kecman CPU Architecture:ARM

[11 Nov 2014 4:42] Stewart Smith
Description:
While testing MySQL 5.7.5 on POWER8, I can reliably get the following crash while running:
./mysql-test-run.pl rpl.rpl_checksum_cache --repeat=10

#6  0x0000000010230928 in ut_dbg_assertion_failed (expr=0x10d25580 "counter < 10", file=0x10d25058 "/home/stewart/mysql-5.7.5-m15/storage/innobase/buf/buf0buf.cc", line=3582) at /home/stewart/mysql-5.7.5-m15/storage/innobase/ut/ut0dbg.cc:68
#7  0x00000000109d6400 in buf_block_align (ptr=0x3fff84ad4d34 "\200") at /home/stewart/mysql-5.7.5-m15/storage/innobase/buf/buf0buf.cc:3582
#8  0x00000000109c47dc in btr_search_guess_on_hash (index=0x3fff54005250, info=0x3fff540536a8, tuple=0x3fff54050f70, mode=2, latch_mode=1, cursor=0x3fff54050c38, has_search_latch=0, mtr=0x3fff840abf70) at /home/stewart/mysql-5.7.5-m15/storage/innobase/btr/btr0sea.cc:933
#9  0x00000000109bd6b4 in btr_cur_search_to_nth_level (index=0x3fff54005250, level=0, tuple=0x3fff54050f70, mode=2, latch_mode=1, cursor=0x3fff54050c38, has_search_latch=0, file=0x10d16a50 "/home/stewart/mysql-5.7.5-m15/storage/innobase/row/row0sel.cc", line=4769, mtr=0x3fff840abf70) at /home/stewart/mysql-5.7.5-m15/storage/innobase/btr/btr0cur.cc:913
#10 0x000000001022da5c in btr_pcur_open_with_no_init_func (index=<optimized out>, index@entry=0x3fff54005250, tuple=<optimized out>, tuple@entry=0x3fff54050f70, mode=<optimized out>, mode@entry=2, cursor=cursor@entry=0x3fff54050c38, has_search_latch=has_search_latch@entry=0, line=<optimized out>, line@entry=4769, mtr=<optimized out>, mtr@entry=0x3fff840abf70, file=0x10d16a50 "/home/stewart/mysql-5.7.5-m15/storage/innobase/row/row0sel.cc", latch_mode=1) at /home/stewart/mysql-5.7.5-m15/storage/innobase/include/btr0pcur.ic:510
#11 0x00000000109132c8 in row_search_mvcc (buf=0x3fff54024260 "\376\067\001", mode=2, prebuilt=0x3fff54050bc0, match_mode=1, direction=<optimized out>) at /home/stewart/mysql-5.7.5-m15/storage/innobase/row/row0sel.cc:4769
#12 0x00000000108269a4 in ha_innobase::index_read (this=0x3fff5404fe10, buf=0x3fff54024260 "\376\067\001", key_ptr=0x3fff54050358 "8\001", key_len=<optimized out>, find_flag=<optimized out>) at /home/stewart/mysql-5.7.5-m15/storage/innobase/handler/ha_innodb.cc:7733
#13 0x000000001081cb18 in ha_innobase::rnd_pos (this=0x3fff5404fe10, buf=0x3fff54024260 "\376\067\001", pos=0x3fff54050358 "8\001") at /home/stewart/mysql-5.7.5-m15/storage/innobase/handler/ha_innodb.cc:8310
#14 0x00000000102b6a60 in handler::ha_rnd_pos (this=<optimized out>, buf=<optimized out>, pos=<optimized out>) at /home/stewart/mysql-5.7.5-m15/sql/handler.cc:2681
#15 0x00000000104d01fc in rr_from_tempfile (info=0x3fff5c00e0b0) at /home/stewart/mysql-5.7.5-m15/sql/records.cc:530
#16 0x00000000105460dc in sub_select (join=0x3fff5c00d7a0, qep_tab=0x3fff5c00e060, end_of_records=<optimized out>) at /home/stewart/mysql-5.7.5-m15/sql/sql_executor.cc:1252
#17 0x000000001053c36c in do_select (join=0x3fff5c00d7a0) at /home/stewart/mysql-5.7.5-m15/sql/sql_executor.cc:922
#18 JOIN::exec (this=0x3fff5c00d7a0) at /home/stewart/mysql-5.7.5-m15/sql/sql_executor.cc:190
#19 0x00000000105a9638 in mysql_select (thd=thd@entry=0x3fff5c000b80, fields=..., select_options=<optimized out>, result=result@entry=0x3fff5c006908, select_lex=select_lex@entry=0x3fff5c005270) at /home/stewart/mysql-5.7.5-m15/sql/sql_select.cc:1055
#20 0x00000000105a97cc in handle_select (thd=0x3fff5c000b80, result=0x3fff5c006908, setup_tables_done_option=0) at /home/stewart/mysql-5.7.5-m15/sql/sql_select.cc:97
#21 0x000000001056eee0 in execute_sqlcom_select (thd=thd@entry=0x3fff5c000b80, all_tables=0x3fff5c0060e8) at /home/stewart/mysql-5.7.5-m15/sql/sql_parse.cc:5028
#22 0x0000000010575824 in mysql_execute_command (thd=thd@entry=0x3fff5c000b80) at /home/stewart/mysql-5.7.5-m15/sql/sql_parse.cc:2488
#23 0x0000000010579a78 in mysql_parse (thd=0x3fff5c000b80, parser_state=<optimized out>) at /home/stewart/mysql-5.7.5-m15/sql/sql_parse.cc:5427
#24 0x000000001057aa64 in dispatch_command (command=COM_QUERY, thd=0x3fff5c000b80, packet=0x3fff5c009771 "SELECT * FROM test.t2 ORDER BY `a`,`data` INTO OUTFILE '/home/stewart/mysql-5.7.5-m15/build/mysql-test/var/mysqld.2/data//diff_table-slave-test-t2'", packet_length=147) at /home/stewart/mysql-5.7.5-m15/sql/sql_parse.cc:1250
#25 0x0000000010648f24 in handle_connection (arg=<optimized out>) at /home/stewart/mysql-5.7.5-m15/sql/conn_handler/connection_handler_per_thread.cc:298
#26 0x0000000010aea4d4 in pfs_spawn_thread (arg=<optimized out>) at /home/stewart/mysql-5.7.5-m15/storage/perfschema/pfs.cc:2137
#27 0x000000806315c328 in .start_thread () from /lib64/libpthread.so.0
#28 0x000000806304dc70 in .__clone () from /lib64/libc.so.6

The bit of code that gets hit is (buf0buf.cc, buf_block_align())

        ulint   counter = 0;
retry:  
        buf_pool_chunk_map_t*   chunk_map = buf_chunk_map_ref;

        if (ptr < reinterpret_cast<byte*>(srv_buf_pool_chunk_unit)) {
                it = chunk_map->upper_bound(0);
        } else {
                it = chunk_map->upper_bound(
                        ptr - srv_buf_pool_chunk_unit);
        }

        if (it == chunk_map->end()) {
                /* The block should always be found. */
                ++counter;
                ut_a(counter < 10);
                os_thread_sleep(100000); /* 0.1 sec */
                goto retry;
        }

I haven't root caused the exact cause of this crash, but there is a bug here around buffer pool resize.

A bunch of the status relies on volatile variables to denote status (and, indeed, to wait until buffer pool resize is done):

        while (buf_pool_resizing) {
                /* buf_pool_chunk_map is being modified */
                os_thread_sleep(100000); /* 0.1 sec */
        }

but volatile does *not* mean there is a memory barrier, it just means load/store instructions are generated. This may end up being all okay on Intel, but will have problems on POWER (likely also ARM).

The ptr < reinterpret_cast<byte*>(srv_buf_pool_chunk_unit) comparison looks strange too... comparing a pointer to less than 8MB (in MTR) is likely to never really be correct... unless there's something really subtle that i'm missing.

How to repeat:
./mysql-test-run.pl rpl.rpl_checksum_cache --repeat=10

Suggested fix:
1) If using global variables as synchronization points, use atomics, not volatile.

There are likely other places that are also buggy.

2) don't do a loop of 10 times just checking again, use something proper.

3) fix comparison of ptr to buffer polo size.
[13 Nov 2014 8:14] Yasufumi Kinoshita
You should explain based in exact. seems confused.

rpl.rpl_checksum_cache doesn't resize buffer pool.
buf_pool_resizing and buf_chunk_map_ref seem to be read correctly at least for this case.

The problem seems to be came from internal of chunk_map inconsistency.

I think this is a duplicate of bug#72809 which will be fixed at 5.7.6.
The fix added os_wmb for the head of os_thread_create_func() to be a general fix.
The new thread should read the parent thread writes correctly after the fix.

------------

Currently, buf_pool_resizing is intended always false at buf_block_align().
Because AHI is disabled during resizing buffer pool.
No problem for now.
[17 Nov 2014 6:25] Stewart Smith
main.preload test also fails with this crash.
[17 Nov 2014 6:55] Stewart Smith
and also causes innodb_explain_json_non_select_all and main.innodb_explain_json_non_select_none to fail.
[10 Apr 2015 7:53] Stewart Smith
So while the tests no longer fail and I don't think I'm seeing any problems directly related to this at the moment, this kind of code pattern screams broken code for weaker memory model machines and *will* come back to bite.

So it still needs to be fixed.

"grep -r volatile mysql/" is a great way to find bugs, as each occurrence is, in fact, a bug.
[23 Nov 2015 5:02] Alexey Kopytov
Bug #79378 is possibly related.
[26 Feb 2016 19:54] Marko Mäkelä
It seems that this could be a duplicate of
Bug#22228629 ASSERTION COUNTER < 10 IN BUF0BUF.CC LINE 3939 BUF_BLOCK_ALIGN()
which was only filed internally.

For what it is worth, the counter has been removed in two bug fixes:

Bug#22179317 AGAIN, SIGNAL 11 IN INNODB.INNODB_BUFFER_POOL_RESIZE_DEBUG
removed it from non-debug code, and

Bug#22709463 LATCHING ORDER VIOLATION IN BUF_BLOCK_ALIGN() DEBUG CALLS
removed the retry loop altogether. That fix should be included in MySQL 5.7.12. It will rename buf_block_align() to buf_block_from_ahi().

It would be interesting to see if you can repeat some problem after those bug fixes. Maybe the adaptive hash index simply gets corrupted? To make more use of adaptive hash index lookups for testing purposes, you could disable the optimistic cursor restoration code path in btr_pcur_restore_position().
[28 Feb 2016 15:55] Mark Callaghan
Parent bug to remove clever uses of os_thread_sleep -> https://bugs.mysql.com/bug.php?id=68588
[8 Aug 2018 12:35] Sinisa Milivojevic
Hi Stewart,

This bug was  filed back in 2014.

I have inspected fully the code that deals with blocks alignment in the buffer pool. That code has been changed beyond recognition, so I have to ask you whether you still experience the same problem ???

Thanks in advance .....
[9 Aug 2018 1:34] Stewart Smith
I don't *think* so, but I haven't tried for a while. Daniel is checking and can report on if there's anything currently the issue.
[13 Dec 2018 3:07] Bogdan Kecman
8.0.13 can't reproduce this any more

$ ./mysql-test-run.pl rpl.rpl_checksum_cache --repeat=10
Logging: ./mysql-test-run.pl  rpl.rpl_checksum_cache --repeat=10
MySQL Version 8.0.13
Checking supported features...
Using 'all' suites
Collecting tests...
 - adding combinations for rpl
Checking leftover processes...
Removing old var directory...
Creating var directory '/usr/share/mysql-test/var'...
Installing system database...
Using parallel: 1

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13019

[  3%] rpl.rpl_checksum_cache 'mix'             [ pass ]  638229
[  6%] rpl.rpl_checksum_cache 'mix'             [ pass ]  652377
[ 10%] rpl.rpl_checksum_cache 'mix'             [ pass ]  633237
[ 13%] rpl.rpl_checksum_cache 'mix'             [ pass ]  658663
[ 16%] rpl.rpl_checksum_cache 'mix'             [ pass ]  628226
[ 20%] rpl.rpl_checksum_cache 'mix'             [ pass ]  523359
[ 23%] rpl.rpl_checksum_cache 'mix'             [ pass ]  595514
[ 26%] rpl.rpl_checksum_cache 'mix'             [ pass ]  666427
[ 30%] rpl.rpl_checksum_cache 'mix'             [ pass ]  546627
[ 33%] rpl.rpl_checksum_cache 'mix'             [ pass ]  536807
[ 36%] rpl.rpl_checksum_cache 'row'             [ pass ]  469629
[ 40%] rpl.rpl_checksum_cache 'row'             [ pass ]  487164
[ 43%] rpl.rpl_checksum_cache 'row'             [ pass ]  513952
[ 46%] rpl.rpl_checksum_cache 'row'             [ pass ]  472401
[ 50%] rpl.rpl_checksum_cache 'row'             [ pass ]  511782
[ 53%] rpl.rpl_checksum_cache 'row'             [ pass ]  471359
[ 56%] rpl.rpl_checksum_cache 'row'             [ pass ]  439180
[ 60%] rpl.rpl_checksum_cache 'row'             [ pass ]  385448
[ 63%] rpl.rpl_checksum_cache 'row'             [ pass ]  379068
[ 66%] rpl.rpl_checksum_cache 'row'             [ pass ]  418425
[ 70%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  400672
[ 73%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  432468
[ 76%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  393953
[ 80%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  430561
[ 83%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  406154
[ 86%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  416141
[ 90%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  370486
[ 93%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  355843
[ 96%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  331957
[100%] rpl.rpl_checksum_cache 'stmt'            [ pass ]  361970
--------------------------------------------------------------------------
The servers were restarted 2 times
Spent 14528.079 of 15759 seconds executing testcases

Completed: All 30 tests were successful.