Bug #74775 | buf_block_align relies on random timeouts, volatile rather than barriers | ||
---|---|---|---|
Submitted: | 11 Nov 2014 4:42 | Modified: | 13 Dec 2018 3:07 |
Reporter: | Stewart Smith | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.7.7 | OS: | Linux |
Assigned to: | MySQL Verification Team | CPU Architecture: | ARM |
[11 Nov 2014 4:42]
Stewart Smith
[13 Nov 2014 8:14]
Yasufumi Kinoshita
Please explain precisely; the report seems confused. rpl.rpl_checksum_cache doesn't resize the buffer pool. buf_pool_resizing and buf_chunk_map_ref seem to be read correctly, at least in this case. The problem seems to come from an inconsistency inside chunk_map. I think this is a duplicate of bug#72809, which will be fixed in 5.7.6. As a general fix, it added os_wmb at the head of os_thread_create_func(), so that the new thread correctly reads the parent thread's writes.
------------
Currently, buf_pool_resizing is intended to always be false in buf_block_align(), because AHI is disabled while the buffer pool is being resized. No problem for now.
[17 Nov 2014 6:25]
Stewart Smith
main.preload test also fails with this crash.
[17 Nov 2014 6:55]
Stewart Smith
This also causes innodb_explain_json_non_select_all and main.innodb_explain_json_non_select_none to fail.
[10 Apr 2015 7:53]
Stewart Smith
So while the tests no longer fail and I don't think I'm seeing any problems directly related to this at the moment, this kind of code pattern screams broken code for weaker memory model machines and *will* come back to bite. So it still needs to be fixed. "grep -r volatile mysql/" is a great way to find bugs, as each occurrence is, in fact, a bug.
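To illustrate the pattern being criticized, here is a minimal sketch of publishing data between threads with C++11 acquire/release atomics instead of a `volatile` flag plus a retry or sleep. It is not InnoDB code: `Shared`, `publish`, `consume`, and `run_once` are hypothetical names invented for this example, and the real buffer pool structures are more involved.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for shared state; not InnoDB's actual structures.
struct Shared {
    int payload = 0;                 // plain data published by the writer
    std::atomic<bool> ready{false};  // takes the place of a `volatile bool`
};

// Writer: fill in the data, then publish with release ordering so the
// payload write cannot be reordered after the flag store.
void publish(Shared& s, int value) {
    s.payload = value;
    s.ready.store(true, std::memory_order_release);
}

// Reader: spin until the flag is observed with acquire ordering. Once it
// is, the writer's earlier payload write is guaranteed to be visible.
int consume(Shared& s) {
    while (!s.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return s.payload;
}

// Runs one writer and one reader and returns what the reader saw.
int run_once(int value) {
    Shared s;
    int seen = 0;
    std::thread reader([&] { seen = consume(s); });
    std::thread writer([&] { publish(s, value); });
    writer.join();
    reader.join();  // join() makes the reader's write to `seen` visible here
    assert(s.ready.load(std::memory_order_relaxed));
    return seen;
}
```

With a plain `volatile bool`, the compiler will not cache the flag, but neither the compiler nor a weakly ordered CPU such as ARM or POWER is required to keep the payload write ordered before the flag write, which is the gap the release/acquire pair closes.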
[23 Nov 2015 5:02]
Alexey Kopytov
Bug #79378 is possibly related.
[26 Feb 2016 19:54]
Marko Mäkelä
It seems that this could be a duplicate of Bug#22228629 ASSERTION COUNTER < 10 IN BUF0BUF.CC LINE 3939 BUF_BLOCK_ALIGN() which was only filed internally. For what it is worth, the counter has been removed in two bug fixes: Bug#22179317 AGAIN, SIGNAL 11 IN INNODB.INNODB_BUFFER_POOL_RESIZE_DEBUG removed it from non-debug code, and Bug#22709463 LATCHING ORDER VIOLATION IN BUF_BLOCK_ALIGN() DEBUG CALLS removed the retry loop altogether. That fix should be included in MySQL 5.7.12. It will rename buf_block_align() to buf_block_from_ahi(). It would be interesting to see if you can repeat some problem after those bug fixes. Maybe the adaptive hash index simply gets corrupted? To make more use of adaptive hash index lookups for testing purposes, you could disable the optimistic cursor restoration code path in btr_pcur_restore_position().
[28 Feb 2016 15:55]
Mark Callaghan
Parent bug to remove clever uses of os_thread_sleep -> https://bugs.mysql.com/bug.php?id=68588
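As a general illustration of what replacing a sleep-and-recheck loop can look like, the sketch below blocks on a condition variable until another thread signals completion. This is an assumption about the kind of change meant, not a description of what the linked bug proposes; `Gate`, `signal_done`, `wait_done`, and `run_gate` are hypothetical names.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot gate; not an InnoDB primitive.
struct Gate {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;  // protected by m
};

// Signaller: set the flag under the mutex, then wake any waiters.
void signal_done(Gate& g) {
    {
        std::lock_guard<std::mutex> lock(g.m);
        g.done = true;
    }
    g.cv.notify_all();
}

// Waiter: block until the flag is set. The predicate form of wait()
// handles spurious wakeups and the case where the signal came first.
void wait_done(Gate& g) {
    std::unique_lock<std::mutex> lock(g.m);
    g.cv.wait(lock, [&] { return g.done; });
}

// Starts a worker that signals the gate, waits for it, and reports the flag.
bool run_gate() {
    Gate g;
    std::thread worker([&] { signal_done(g); });
    wait_done(g);
    worker.join();
    return g.done;
}
```

The waiter wakes as soon as the condition holds rather than after an arbitrary timeout, and the mutex provides the memory ordering that a bare sleep does not.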
[8 Aug 2018 12:35]
MySQL Verification Team
Hi Stewart, This bug was filed back in 2014. I have fully inspected the code that deals with block alignment in the buffer pool. That code has been changed beyond recognition, so I have to ask you whether you still experience the same problem? Thanks in advance.
[9 Aug 2018 1:34]
Stewart Smith
I don't *think* so, but I haven't tried for a while. Daniel is checking and can report on whether there is currently an issue.
[13 Dec 2018 3:07]
MySQL Verification Team
8.0.13 can't reproduce this any more

$ ./mysql-test-run.pl rpl.rpl_checksum_cache --repeat=10
Logging: ./mysql-test-run.pl rpl.rpl_checksum_cache --repeat=10
MySQL Version 8.0.13
Checking supported features...
Using 'all' suites
Collecting tests...
 - adding combinations for rpl
Checking leftover processes...
Removing old var directory...
Creating var directory '/usr/share/mysql-test/var'...
Installing system database...
Using parallel: 1
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13019
[ 3%] rpl.rpl_checksum_cache 'mix' [ pass ] 638229
[ 6%] rpl.rpl_checksum_cache 'mix' [ pass ] 652377
[ 10%] rpl.rpl_checksum_cache 'mix' [ pass ] 633237
[ 13%] rpl.rpl_checksum_cache 'mix' [ pass ] 658663
[ 16%] rpl.rpl_checksum_cache 'mix' [ pass ] 628226
[ 20%] rpl.rpl_checksum_cache 'mix' [ pass ] 523359
[ 23%] rpl.rpl_checksum_cache 'mix' [ pass ] 595514
[ 26%] rpl.rpl_checksum_cache 'mix' [ pass ] 666427
[ 30%] rpl.rpl_checksum_cache 'mix' [ pass ] 546627
[ 33%] rpl.rpl_checksum_cache 'mix' [ pass ] 536807
[ 36%] rpl.rpl_checksum_cache 'row' [ pass ] 469629
[ 40%] rpl.rpl_checksum_cache 'row' [ pass ] 487164
[ 43%] rpl.rpl_checksum_cache 'row' [ pass ] 513952
[ 46%] rpl.rpl_checksum_cache 'row' [ pass ] 472401
[ 50%] rpl.rpl_checksum_cache 'row' [ pass ] 511782
[ 53%] rpl.rpl_checksum_cache 'row' [ pass ] 471359
[ 56%] rpl.rpl_checksum_cache 'row' [ pass ] 439180
[ 60%] rpl.rpl_checksum_cache 'row' [ pass ] 385448
[ 63%] rpl.rpl_checksum_cache 'row' [ pass ] 379068
[ 66%] rpl.rpl_checksum_cache 'row' [ pass ] 418425
[ 70%] rpl.rpl_checksum_cache 'stmt' [ pass ] 400672
[ 73%] rpl.rpl_checksum_cache 'stmt' [ pass ] 432468
[ 76%] rpl.rpl_checksum_cache 'stmt' [ pass ] 393953
[ 80%] rpl.rpl_checksum_cache 'stmt' [ pass ] 430561
[ 83%] rpl.rpl_checksum_cache 'stmt' [ pass ] 406154
[ 86%] rpl.rpl_checksum_cache 'stmt' [ pass ] 416141
[ 90%] rpl.rpl_checksum_cache 'stmt' [ pass ] 370486
[ 93%] rpl.rpl_checksum_cache 'stmt' [ pass ] 355843
[ 96%] rpl.rpl_checksum_cache 'stmt' [ pass ] 331957
[100%] rpl.rpl_checksum_cache 'stmt' [ pass ] 361970
--------------------------------------------------------------------------
The servers were restarted 2 times
Spent 14528.079 of 15759 seconds executing testcases
Completed: All 30 tests were successful.