Description:
Certain subselects use a very high (and more than required) amount of diskspace.
In my example the original table has a varchar(255) field, and 1 million rows, each containing a string between 10 and 20 char only.
select count(column) from (select column from table) as x;
the inner table would be about 15 MB in size, and you would expect that the temp table created by mysql for the subquery data, would be the same size.
The temporary table will grow up to 255 MB.
I have looked at the actual data in the .MYD file in my temp table directory. I found that mysql stored each row as 255 bytes, filled with spaces.
Varchar do not normaly store trailing space, and as far as i can see, this happens only in temp tables
create table t2 select * from tables; # works fine
More example can be found in bug #21849
Since Galt seems to experience the same issue, this apears not limited to my setup. (Galt did also experience this with the hidden group by column)
How to repeat:
# ensure tamp tables go to disk; this will set the value to 16KB
set max_heap_table_size = 0;
#please use the snp table as in bug #21849
# or below filled with 1 million rows, name between 10 and 20 char
# use rand()
create table snp (name varchar(255));
#get a user lock on a 2nd mysql client
do get_lock('a',1);
# on the original mysql client
select count(*) from (select name from snp ) as x
UNION ALL select get_lock('a4',7200) ;
The union is only needed, to keep mysql waiting. (2 hours for the user lock).
This allows having a look at the data file.
using less, or similar tools, you will see that each row is filled up with spaces to take 255 char.
Suggested fix:
temporary tables, should not store trailing spaces for varchar fields.
I have not tested what will haappen with char, text or blob fields, or if functions are used to return strings.