Bug #101154 err == DB_SUCCESS in in file btr0btr.cc line 2167 for internal temporary table
Submitted: 13 Oct 2020 13:42 Modified: 14 Oct 2020 13:47
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.7 OS:Any
Assigned to: CPU Architecture:Any

[13 Oct 2020 13:42] Sveta Smirnova
Description:
When MySQL creates InnoDB temporary table to store data, required to resolve a query and function btr_cur_pessimistic_insert fails, for example, due to out of space error, InnoDB crashes server with this assertion.

See also https://jira.percona.com/browse/PS-7318?focusedCommentId=268045&page=com.atlassian.jira.pl... and perfect explanation by Marcelo Altmann on why this happens.

How to repeat:
1. Apply PS-7318_test.diff to the latest Percona Server source tree
2. Join two archive parts, then gunzip. You will have a file, named PS-7318.test
3. Pull PS-7318.test into mysql-test/t directory
4. Run ./mtr --retry=0 --nowarnings --record --force --suite=main PS-7318
5. Observe the crash

Note, the test just imitates out of space error in btr_cur_pessimistic_insert, but not actually fills up InnoDB temporary space. In order to imitate it additional activity needed. For example, put DBUG_SYNC_WAIT(someID) into btr_cur_pessimistic_insert before calling fsp_reserve_free_extents and at the same time raise some another ID. In the test case create one more connection thread which will wait signal from btr_cur_pessimistic_insert and do something to fill up temporary tablespace. See https://dev.mysql.com/doc/internals/en/debug-sync-facility.html for details.

MySQL 8.0 has same code in btr_insert_on_non_leaf_level_func, but my test does not crash it.

Suggested fix:
When error in btr_cur_pessimistic_insert happens while InnoDB fills up temporary table it is not a catasrophic failure. It would be safe to just return an error to the client and continue operating.
[13 Oct 2020 13:44] Sveta Smirnova
Sorry, copied instructions from Percona bug report.

Read point 1 as: "Apply bug101154.diff to MySQL source code". Current commit I used "commit e18e2390b3f84a97f9d6cf53aff7d51b736faa4e"
[13 Oct 2020 13:45] Sveta Smirnova
Diff, which needs to be applied to MySQL source code. Tested with last commit "commit e18e2390b3f84a97f9d6cf53aff7d51b736faa4e"

Attachment: bug101154.diff (text/x-patch), 2.70 KiB.

[13 Oct 2020 13:48] Sveta Smirnova
Test case, part

Attachment: PS-7318.test.gz.part-aa (application/octet-stream, text), 2.50 MiB.

[13 Oct 2020 13:57] Sveta Smirnova
Test case, part

Attachment: PS-7318.test.gz.part-ab (application/octet-stream, text), 2.50 MiB.

[13 Oct 2020 14:01] Sveta Smirnova
Test case, part

Attachment: PS-7318.test.gz.part-ac (application/octet-stream, text), 2.50 MiB.

[13 Oct 2020 14:02] Sveta Smirnova
Test case, part

Attachment: PS-7318.test.gz.part-ad (application/octet-stream, text), 2.50 MiB.

[13 Oct 2020 14:03] Sveta Smirnova
Test case, part

Attachment: PS-7318.test.gz.part-ae (application/octet-stream, text), 2.50 MiB.

[13 Oct 2020 14:06] Sveta Smirnova
Test case, part

Attachment: PS-7318.test.gz.part-af (application/octet-stream, text), 2.50 MiB.

[13 Oct 2020 14:07] Sveta Smirnova
Test case, final part

Attachment: PS-7318.test.gz.part-ag (application/octet-stream, text), 1.89 MiB.

[13 Oct 2020 14:31] MySQL Verification Team
Hi Mrs. Smirnova,

Thank you for your bug report.

However, we do require a test case for our server, MySQL 8.0, latest release.

Please, also, let us know why should we patch our server , at all ???

Many thanks in advance.
[13 Oct 2020 15:49] Sveta Smirnova
Hi!

First, this test is for MySQL Server and is repeatable with MySQL Server. If you are not comfortable with the name, you can rename it to MySQL-101154.test and it will still perfectly crash MySQL Server.

Second, because the test case explores DBUG functionality, perfectly described in the official manual at https://dev.mysql.com/doc/internals/en/test-synchronization.html. This functionality is also used in the MySQL own tests, such as trigger_debug.
[14 Oct 2020 12:40] MySQL Verification Team
Hello Mrs. Smirnova,

We still have several questions.

First of all, you wrote:

"
MySQL 8.0 has same code in btr_insert_on_non_leaf_level_func, but my test does not crash it.
"

This implied that your test does not crash MySQL 8.0. Does it mean that it crashes ONLY MySQL 5.7.32 ???

Second, you wrote:

"
1. Apply PS-7318_test.diff to the latest Percona Server source tree
"

Does it mean that the patch must be applied to MySQL 5.7.31 in order for crash to be repeated ??? And if yes, why ??? Will your test case crash unpatched MySQL 5.7.31 ??? Have you noticed this behaviour in the earlier releases of MySQL 5.7 ???

Please, do note that if unpatched MySQL 5.7 is not crashed, then we can not verify this report.

Regarding the rest of your report, you have answered all of our questions ........ So, only the above ones remain ......

Many thanks in advance.
[14 Oct 2020 13:19] Sveta Smirnova
Dear Bugs Verification Team,

> "
> MySQL 8.0 has same code in btr_insert_on_non_leaf_level_func, but my
> test does not crash it.
> "
>
> This implied that your test does not crash MySQL 8.0. Does it mean that
> it crashes ONLY MySQL 5.7.32 ???

Yes.

> Second, you wrote:
>
> "
> 1. Apply PS-7318_test.diff to the latest Percona Server source tree
> "
>
> Does it mean that the patch must be applied to MySQL 5.7.31 in order for
crash to be repeated ??? 

Yes.

> And if yes, why ??? 

Because my test explores DBUG functionality which requires inserting custom DBUG_EXECUTE_IF in the code. Practically, it inserts an error in the btr_cur_pessimistic_insert which then is NOT checked in the function btr_insert_on_non_leaf_level_func as it should.

> Will your test case crash unpatched MySQL 5.7.31 ???

My test would not, but I if someone uses small value for innodb_temp_data_file_path, for example innodb_temp_data_file_path=ibtmp1:12M:50M they can easily hit this crash.

> Have you noticed this behaviour in the
earlier releases of MySQL 5.7 ???

I believe this code was introduced long time ago, before special tablespace for InnoDB was introduced. See a comment which my colleague Marcelo did for Percona bug (I provided the link earlier):

----<q>----
One part that caught my attention is https://github.com/percona/percona-server/blob/Percona-Server-5.7.25-28/storage/innobase/b... :

3540         if (!(flags & BTR_NO_UNDO_LOG_FLAG)
3541             || dict_table_is_intrinsic(index->table)) {
3542
3543                 ut_a(cursor->tree_height != ULINT_UNDEFINED);
3544
3545                 /* First reserve enough free space for the file segments
3546                 of the index tree, so that the insert will not fail because
3547                 of lack of space */
3548
3549                 ulint   n_extents = cursor->tree_height / 16 + 3;
3550
3551                 success = fsp_reserve_free_extents(&n_reserved, index->space,
3552                                                    n_extents, FSP_NORMAL, mtr);
3553                 if (!success) {
3554                         return(DB_OUT_OF_FILE_SPACE);
3555                 }
3556         } 

temporary table will return true for dict_table_is_intrinsic. Function fsp_reserve_free_extents will return true in case it can reserve space on the required tablespace or false otherwise. Here we are talking about the  ibtmp tablespace. By following the code path, this won't trigger any log message if returned from btr_insert_on_non_leaf_level_func->btr_cur_pessimistic_insert->fsp_reserve_free_extents because as soon as the control gets back to btr_insert_on_non_leaf_level_func the variable err won't be DB_SUCCESS and the code will assert.

By looking into their config file I can see they have:

innodb_temp_data_file_path          = ../tmp/ibtmp1:12M:autoextend:max:100G 

This limits their server-wise InnoDB Temporary tables (Implicit and explicit) to 100G.
This plus the fact of the massive query they are running and they confirmed that often it never completes makes me believe they are running out of space on the temporary tablespace.
----</q>----

>  Please, do note that if unpatched MySQL 5.7 is not crashed, then we can
not verify this report.

You can verify it even without any testing. Just read the code.

For versions when InnoDB did not have innodb_temp_data_file_path it worked fine, because btr_insert_on_non_leaf_level_func worked only with persistent InnoDB tables and if insert into them was due to the space failure this is the issue which DBA should solve manually.

However, if btr_insert_on_non_leaf_level_func fails for the internal temporary table, there is not harm to simply abort the query and clear already used space. This tables should've been destroyed after query finishes anyway. No need to crash the server.

And this code also exists in MySQL 8.0
[14 Oct 2020 13:47] MySQL Verification Team
Hi Mrs. Smirnova,

We have applied your patch and managed to repeat the problem.

Your code analysis is also correct.

Verified as reported.
[10 Feb 5:24] Satya Bodapati
I have a testcase that can crash debug server without any code changes.

--source include/have_innodb.inc

--let $restart_parameters = "restart: --innodb_temp_data_file_path=ibtmp1:12M:autoextend:max:1G --big-tables=1"
--source include/restart_mysqld.inc

SELECT @@innodb_temp_data_file_path;

drop function if exists func1;
delimiter |;
create function func1(x int) returns int deterministic
        begin
                declare z1, z2 int;
                set z1 = x;
                set z2 = z1 + 2;
                return z2;
        end|
delimiter ;|

create table t1 (a int, b varchar(20));
insert into t1 values(1, 'a'), (2, 'b'), (3, 'c'), (3, 'c'), (4, 'c');

SET GLOBAL innodb_limit_optimistic_insert_debug=4;
--let $i=1
while ($i <= 15) {
INSERT INTO t1 SELECT * FROM t1;
--inc $i
}

SELECT COUNT(*) FROM t1;

SET GLOBAL innodb_limit_optimistic_insert_debug=2;
select * from t1 order by func1(a);
[10 Feb 5:26] Satya Bodapati
ideally, we can crash the release build too, albeit we need more data to insert.

So any 5.7 instance with ibtmp1 limiting to some max value can be easily crashed
[10 Feb 12:59] MySQL Verification Team
Thank you for the feedback.

This is still 5.7-only bug.

However, if you could provide us with a test case that crashes a server, that could influence the priority of this report.
[11 Feb 4:21] Satya Bodapati
Right, this is 5.7 only bug. This bug is applicable for other scenarios as well

1. Any non extending tablespaces (ie. with max limit like ibdata, ibtmp1)
2. For regular tablespaces when the disk is about to be full scenarios

Coming to the testcase, isn't the above testcase that crashes debug build enough?
I just used innodb_limit_optimistic_insert_debug to reduce the amount of inserts.
[11 Feb 13:08] MySQL Verification Team
Satya,

Thank you for the feedback.