MySQL Bugs: #11824: internal /tmp/*.{MYD,MYI} files remain, causing subsequent queries to fail

Bug #11824	internal /tmp/*.{MYD,MYI} files remain, causing subsequent queries to fail
Submitted:	8 Jul 2005 19:46	Modified:	13 Jul 2006 4:13
Reporter:	Nicholas Leippe	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server	Severity:	S1 (Critical)
Version:	4.1.11,4.1.12/BK source	OS:	Linux (Gentoo Linux)
Assigned to:	Ingo Strüwing	CPU Architecture:	Any

Description:
At seemingly random points in time, tmp table files will not be properly removed, causing many (or all?) subsequent queries to fail.

Manually deleting the files restores normal operation w/o even restarting mysqld.

There are always a pair of files like:

/tmp/#sql_668d_0.MYD
/tmp/#sql_668d_0.MYI

In at least one case, it broke again with the exact same filenames as were previously deleted.

This has occurred on all of:

1) x86 Fedora Core 1 (kernel 2.4.21-2115.nptlsmp, quad p3 xeon)
  mysql 4.1.11-max and 4.1.12-max official binaries 
2) em64t Gentoo (kernel 2.4.6-11-gentoo-r6, dual p4 xeon em64t w/ht turned on) 
  mysql 4.1.12-max built from sources using the compile-pentium-max script
  (we compiled to get nptl)
  gcc --version:
  gcc (GCC) 3.4.3 20041125 (Gentoo Linux 3.4.3-r1, ssp-3.4.3-0, pie-8.7.7)

No entries are generated in the error log when this occurs.

How to repeat:
I have not been able to isolate a specific query that triggers this, thus I cannot reproduce this at will.
But, it has occurred often enough in the last month to become of grave concern to us.

Hi,

Any special things like multi tmpdir setup, CREATE TABLE with a SELECT statement?
Is this on a Slave/Master setup?
Can you use the general or binary log to find the query using the timestamp?

Thanks,

Geert

Could you also elaborate on "causing subsequent queries to fail" ?
How do they fail ?

We can now reproduce this behavior at will.  I have uploaded a file "repro.tgz" which contains three files:

tables.sql -- dump file creates two tables:
- contacts_copy
- descriptions_copy

querytest(badkeyfileerror).sql - sql file w/query that generates a "bad keyfile" error

querytest(outofmemory).sql - sql file w/query that generates an "out of memory" error

The number of entries (unique) in the ORDER BY clause changes the behavior.
When there are 8737 or fewer entries, the query successfully completes.
When there are more than 8737 but fewer than 9119 entries it produces an "OUT OF MEMORY" error, and no temp files are left laying around.
When there are more than 9119 it produces an "INCORRECT KEYFILE FOR TABLE" error.

Additionally, changing the number of columns selected (from the original query) or tables joined also affects which of the above two errors are produced.

Repro:
1) create the tables
2) run the query from the "querytest(badkeyfilerror).sql" file -- it should break
    and leave orphaned tmp table files in /tmp/
3) remove orphaned tmp table files from /tmp/
4) run the query from the "querytest(outofmemory).sql" file -- it should break
   but w/o leaving orphaned tmp table files in /tmp/
5) remove the last dozen ORDER BY entries and rerun the query -- it should
    now succeed

I realize this may be dependent on buffer settings--but am not sure which ones, so I've uploaded vars_and_stats.tgz which contains the result from SHOW VARIABLES and SHOW STATUS.

I was able to repeat with your last instructions, however in the step
5 I needed to remove not a dozen of entries but around 3000 lines:

miguel@hegel:~/dbs/4.1> bin/mysql -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 4.1.15-debug-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create database ty;
Query OK, 1 row affected (0.02 sec)

mysql> use ty
Database changed
mysql> source /home/miguel/dbs/tables.sql

mysql> show tables;
+-------------------+
| Tables_in_ty      |
+-------------------+
| contacts_copy     |
| descriptions_copy |
+-------------------+
2 rows in set (0.00 sec)

mysql> source /home/miguel/dbs/badquery.sql                 
ERROR 1034 (HY000): Incorrect key file for table ''; try to repair it

miguel@hegel:~> ls /tmp/#sql*
/tmp/#sql_60fb_0.MYD  /tmp/#sql_60fb_0.MYI

mysql> source /home/miguel/dbs/outmem.sql
ERROR 1037 (HY001): Out of memory; restart server and try again (needed 524252 bytes)

editing the outmem.sql removing lines:

33 rows in set (0.61 sec)

I renamed the files provided with the names above.
My OS is Suse 9.3.

We've run into this problem too with version 14.12 Distrib 5.0.18, for Win32 (ia32).  I increased various buffer limits to try to work around the problem, to no avail.  As a temporary work around, I have a batch file that runs once an hour deleting any extraneous files that are lying around.  This is obviously not acceptable for a production environment but it's all we can do for now.

badquery.sql tries to create a temporary table with 28772 columns. This is not really supported. When opening a table it is checked for an index file header with a maximum of 64KB header size. Every column takes 7 bytes in the header.

I'll look and see if I can either make that much columns work, or give a better error message (it may however not be possible to make new error messages in 4.1 and 5.0). And I'll try to remove the temporary files anyway.

So, to complete the order by clause, MySQL creates one column in the temp filesort table for each order by part?
Is there no more efficient way to sort?

In the meantime, I already reworked our application to create a temporary table with these sort key values and an ordered, indexed id.  We join it and sort on it's id field, and this problem goes away entirely.  Could MySQL perhaps do the same internally?

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7580

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8300

Pushed to mysql-5.0-engines.

Pushed to mysql-5.1-engines.

Very complex select statements can create temporary tables
  that are too big to be represented as a MyISAM table.
  
  This was not checked at table creation time, but only at
  open time. The result was an attempt to delete the 
  "impossible" table.
  
  But if the server is built --with-raid, MyISAM tries to 
  open the table before deleting the files. It needs to find 
  out if the table uses the raid support and how many raid 
  chunks there are. This is done with an open "for repair",
  which will almost always succeed.
  
  But in this case we have an "impossible" table. The open
  failed. Hence the files were not deleted. Also the error
  message was a bit unspecific.
  
  I turned an open error in this situation into the assumption 
  of having no raid support on the table. Thus the normal data 
  file is tried to be deleted. This may however leave existing 
  raid chunks behind.
  
  I also added a check in mi_create() to prevent the creation
  of an "impossible" table. A more decriptive error message is
  given in this case.

Pushed to 5.1.12 and 5.0.24 and 4.1.21.

Noted in 4.1.21, 5.0.25 (not 5.0.24), 5.1.12 changelogs.

For very complex SELECT statements could create temporary tables that
were too big, but for which the temporary files did not get removed,
causing subsequent queries to fail.

The same behavior still occurs in the version after apply this patch.
# But I cannot find procedures to reproduce it.

Is there any related bug about this?

Not that I know of.

Can you please be more specific, which version you applied this patch on?

And what do you mean with the procedures?