Bug #115983 Implicit internal temp table reports ’The table '/tmp/#sql_XXX'‘ is full;
Submitted: 3 Sep 2024 2:27 Modified: 3 Sep 2024 8:06
Reporter: Fan Lyu Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.7.44 OS:Any
Assigned to: CPU Architecture:x86

[3 Sep 2024 2:27] Fan Lyu
Description:

Hi, by a complicated subquery, we sometimes met such problems: 
The Table ‘/tmp/#XXX’ is full
error code is ER_RECORD_FILE_FULL

query like:
SELECT Table_A.colA,colB,colC... FROM TableA
INNER JOIN TABLE_B on xx condtion,
LEFT JOIN TABLE_C on yy condtion,
(...still 10 left joins)
where TABLEA.colA in (2,3,4) and TABLE_B.someCol = YYY and TABLE_C.someCol=ZZZ
GROUP BY TABLE_A.someCol

explain its execution plan like:(the XXXs are joined tables, I think most important is extra in TABLE_A)

select_type	table	type	possible_keys	key	key_len	rows	filtered	Extra
								
SIMPLE	Table_A	range	PRIMARY	PRIMARY	4	5	10	Using where; Using temporary; Using filesort
SIMPLE	XXX	eq_ref	PRIMARY,XXX	PRIMARY	4	1	24.59	Using where
SIMPLE	XXX	eq_ref	PRIMARY,XXX	PRIMARY	6	1	100	Using index
SIMPLE	XXX	ref	XXX	XXX	4	1	10	Using where
SIMPLE	XXX	eq_ref		XXX	8	1	10	Using where
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	19	Using where
SIMPLE	XXX	ref	XXX	XXX	8	1	100	Using where
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)
SIMPLE	XXX	ref	XXX	XXX	4	4	100	Using where
SIMPLE	XXX	eq_ref	XXX	PRIMARY	4	1	100	(Null)
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)
SIMPLE	XXX	eq_ref	XXX	XXX	8	1	100	(Null)

Seems to be implicit temp table generated by SQL. EXPLAIN output includes 'using temporary'

Also, we ran same SQL on a percona 5.7.39 by percona community many times, but did not happen. EXPLAIN output is same.
By percona, I saw such fixes, seems to be related to this error. This bugfix changes return condition of HA_ERR_RECORD_FILE_FULL in heap engine write.
https://github.com/percona/percona-server/commit/9c6fec530f09ab22267fa8eed67aaf23a92407d9
https://perconadev.atlassian.net/browse/PS-1469

What we have tried but not relevant:
1. There is a very similar problem Bug#31117893 fixed in 8.0.27  . Our query also have GROUP BY. But by 8.0, this bug only occurs when internal mem engine=TempTable, not happen when it is MEMORY. But by 5.7, only MEMORY is available for internal implicit temp table, so I wonder whether our issue by 5.7 is similar to known Bug#31117893

2. Disk has 250G remaining, and when client returns this error, there no INNODB error in errorlog, so I wonder it is not converted to  INNODB disk table, directly returns error in sql module. We think this is not caused by insufficient disk volume.

3. both IBDATA1 and IBTMP1 has NO MAX in innodb_temp_data_file_path and  innodb_data_file_path.(Though ibtmp is 300+GB, there is no limitation on 64-bit OS and its file size configuration).So we think this is also not caused by limitation of 'innodb_temp_data_file_path' and OS.
innodb_temp_data_file_path=ibtmp1:200M:autoextend 

variable comparison: 
by community version, tmp_table_size=128MB, max_heap_table_size=64MB
by perocna, both is 16M 

we are wondering, whether there is some implicit internal temp table, which is created by optimizer/executor automatically, but forget to call  'create_ondisk_from_heap' , when it reaches the limitation of heap engine , it is not converted to a disk table properly. 

I am aware that 5.7.44 is the final version of 5.7, but could you please still help us to identify, whether this is a known problem or it is comparible with some fixed issues in 8.X/9.0? sincere thanks!

How to repeat:
running long SQL parallelly, but does't happen very frequently.
Also, running a single SQL without multi-thread will noy reproduce.

Suggested fix:
No Idea. Since InnoDB does't have any reports, we are wondering

1.Memory(Heap) Engine's mechanism

2.When Memory(Heap) returns HA_ERR_RECORD_FILE_FULL, SQL module does not properly deal with 'create_ondisk_from_heap'.
[3 Sep 2024 7:29] MySQL Verification Team
Hello Fan Lyu,

Thank you for taking the time to report a problem. 
Please note that per Oracle's Lifetime Support policy, as of October 25, 2023, MySQL 5.7 is covered under Oracle Sustaining Support. Also, we don't fix bugs in old versions, don't backport bug fixes, so you need to check with latest MySQL Server 8.0/8.4/9.0 version. Thus, users are encouraged to upgrade to MySQL 8.0/8.4/9.0. - More details at https://www.mysql.com/support/eol-notice.html

I didn't check with your table(need subset of data to check at my end) but most likely your issue is fixed after Bug #99100. Any chance you can check if 8.0.39 fixed your issue? Thank you. 

If you are able to reproduce the bug with one of the latest versions 8.0/8.4/9.0.1, please change the version on this bug report to the version you tested and change the status back to "Open".  Again, thank you for your continued support of MySQL.

regards,
Umesh
[3 Sep 2024 8:02] Fan Lyu
Thank you for your apply.
This exactly does not happen in latest 8.0.X version.

But in production, stability is most important, a migration across 5.7 to 8.0 is risky.
I am also not asking for a bugfix or further maintainance for 5.7. We can modify max_heap_table_size to avoid this problem by ourselves.

We are just curious about the root cause :) 
In the mentioned bugfix in 8.0.27, TemptableAggregateIterator::move_table_to_disk is added in function TemptableAggregateIterator<Profiler>::Init.

But in 5.7, there is no function with same name.
Then I searched the surrounding codes, I found  "update_tmptable_sum_func" is only called by aggregator related functions.

I am wondering, if 
static enum_nested_loop_state end_update is comparable to TemptableAggregateIterator<Profiler>::Init ? if yes, then 5.7 is lacking move_to_disk logic?
[3 Sep 2024 8:06] Fan Lyu
the mention end_update function is in sql_executor.cc of 5.7 :)