MySQL Bugs: #115983: Implicit internal temp table reports ’The table '/tmp/#sql

Bug #115983	Implicit internal temp table reports ’The table '/tmp/#sql_XXX'‘ is full;
Submitted:	3 Sep 2024 2:27	Modified:	3 Sep 2024 8:06
Reporter:	Fan Lyu	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	5.7.44	OS:	Any
Assigned to:		CPU Architecture:	x86

Description:

Hi, by a complicated subquery, we sometimes met such problems:
The Table ‘/tmp/#XXX’ is full
error code is ER_RECORD_FILE_FULL

query like:
SELECT Table_A.colA,colB,colC... FROM TableA
INNER JOIN TABLE_B on xx condtion,
LEFT JOIN TABLE_C on yy condtion,
（...still 10 left joins）
where TABLEA.colA in (2，3，4) and TABLE_B.someCol = YYY and TABLE_C.someCol=ZZZ
GROUP BY TABLE_A.someCol

explain its execution plan like:(the XXXs are joined tables, I think most important is extra in TABLE_A)

select_type table type possible_keys key key_len rows filtered Extra

SIMPLE Table_A range PRIMARY PRIMARY 4 5 10 Using where; Using temporary; Using filesort
SIMPLE XXX eq_ref PRIMARY,XXX PRIMARY 4 1 24.59 Using where
SIMPLE XXX eq_ref PRIMARY,XXX PRIMARY 6 1 100 Using index
SIMPLE XXX ref XXX XXX 4 1 10 Using where
SIMPLE XXX eq_ref XXX 8 1 10 Using where
SIMPLE XXX eq_ref XXX XXX 8 1 19 Using where
SIMPLE XXX ref XXX XXX 8 1 100 Using where
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)
SIMPLE XXX ref XXX XXX 4 4 100 Using where
SIMPLE XXX eq_ref XXX PRIMARY 4 1 100 (Null)
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)
SIMPLE XXX eq_ref XXX XXX 8 1 100 (Null)

Seems to be implicit temp table generated by SQL. EXPLAIN output includes 'using temporary'

Also, we ran same SQL on a percona 5.7.39 by percona community many times, but did not happen. EXPLAIN output is same.
By percona, I saw such fixes, seems to be related to this error. This bugfix changes return condition of HA_ERR_RECORD_FILE_FULL in heap engine write.
https://github.com/percona/percona-server/commit/9c6fec530f09ab22267fa8eed67aaf23a92407d9
https://perconadev.atlassian.net/browse/PS-1469

What we have tried but not relevant:
1. There is a very similar problem Bug#31117893 fixed in 8.0.27 . Our query also have GROUP BY. But by 8.0, this bug only occurs when internal mem engine=TempTable, not happen when it is MEMORY. But by 5.7, only MEMORY is available for internal implicit temp table, so I wonder whether our issue by 5.7 is similar to known Bug#31117893

2. Disk has 250G remaining, and when client returns this error, there no INNODB error in errorlog, so I wonder it is not converted to INNODB disk table, directly returns error in sql module. We think this is not caused by insufficient disk volume.

3. both IBDATA1 and IBTMP1 has NO MAX in innodb_temp_data_file_path and innodb_data_file_path.(Though ibtmp is 300+GB, there is no limitation on 64-bit OS and its file size configuration).So we think this is also not caused by limitation of 'innodb_temp_data_file_path' and OS.
innodb_temp_data_file_path=ibtmp1:200M:autoextend

variable comparison:
by community version, tmp_table_size=128MB, max_heap_table_size=64MB
by perocna, both is 16M

we are wondering, whether there is some implicit internal temp table, which is created by optimizer/executor automatically， but forget to call 'create_ondisk_from_heap' , when it reaches the limitation of heap engine , it is not converted to a disk table properly.

I am aware that 5.7.44 is the final version of 5.7, but could you please still help us to identify, whether this is a known problem or it is comparible with some fixed issues in 8.X/9.0？ sincere thanks!

How to repeat:
running long SQL parallelly, but does't happen very frequently.
Also, running a single SQL without multi-thread will noy reproduce.

Suggested fix:
No Idea. Since InnoDB does't have any reports, we are wondering

1.Memory(Heap) Engine's mechanism

2.When Memory(Heap) returns HA_ERR_RECORD_FILE_FULL, SQL module does not properly deal with 'create_ondisk_from_heap'.

Hello Fan Lyu,

Thank you for taking the time to report a problem. 
Please note that per Oracle's Lifetime Support policy, as of October 25, 2023, MySQL 5.7 is covered under Oracle Sustaining Support. Also, we don't fix bugs in old versions, don't backport bug fixes, so you need to check with latest MySQL Server 8.0/8.4/9.0 version. Thus, users are encouraged to upgrade to MySQL 8.0/8.4/9.0. - More details at https://www.mysql.com/support/eol-notice.html

I didn't check with your table(need subset of data to check at my end) but most likely your issue is fixed after Bug #99100. Any chance you can check if 8.0.39 fixed your issue? Thank you. 

If you are able to reproduce the bug with one of the latest versions 8.0/8.4/9.0.1, please change the version on this bug report to the version you tested and change the status back to "Open".  Again, thank you for your continued support of MySQL.

regards,
Umesh

Thank you for your apply.
This exactly does not happen in latest 8.0.X version.

But in production, stability is most important, a migration across 5.7 to 8.0 is risky.
I am also not asking for a bugfix or further maintainance for 5.7. We can modify max_heap_table_size to avoid this problem by ourselves.

We are just curious about the root cause :) 
In the mentioned bugfix in 8.0.27, TemptableAggregateIterator::move_table_to_disk is added in function TemptableAggregateIterator<Profiler>::Init.

But in 5.7, there is no function with same name.
Then I searched the surrounding codes, I found  "update_tmptable_sum_func" is only called by aggregator related functions.

I am wondering, if 
static enum_nested_loop_state end_update is comparable to TemptableAggregateIterator<Profiler>::Init ? if yes, then 5.7 is lacking move_to_disk logic?

the mention end_update function is in sql_executor.cc of 5.7 :)