Bug #3402 Extremely bad order by performance on 'text' fields
Submitted: 6 Apr 2004 16:17 Modified: 8 Apr 2004 8:07
Reporter: Peter van Dijk Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.0.18 OS:Any (any)
Assigned to: Dean Ellis CPU Architecture:Any

[6 Apr 2004 16:17] Peter van Dijk
Description:
Ordering large numbers of rows based on a 'text' field will use up huge amounts of disk space and processing time to order irrespective of the amount of data being ordered.
For example,
If you create a table with 2 columns, a varchar(8) and a text, then insert 500,000 rows of ('12345678','12345678'),
doing a select query on the table and ordered on the varchar column will take under 5 seconds, but ordering on the text column will take several minutes (and, in the process, use up heaps of disk space)

While, in a proper database, i would never use a text field for only 8 characters, I've been having problems doing similar operations on text fields with about 1000 or so characters. I dont expect blazing fast speed from an order by clause in these cases, but using 2gig+ of temporary space to order 20mb of data is a little extreme.

How to repeat:
# set up the table:
create table test (id int not null default 0, testvarchar varchar(8) not null default '', testtext text not null default '');

insert into test values(1,'12345678','12345678');
# repeat 499,999 times or so, my favourite way is insert into test select * from test

# now query it:

select * from test order by testvarchar limit 100000,10;
# 10 rows in set (2.01 sec)

select * from test order by testtext limit 100000,10;
# 10 rows in set (1 min 21.39 sec)

Suggested fix:
Not sure. The ordering routines just dont seem to cope with this sort of thing at all.
[8 Apr 2004 8:07] Dean Ellis
mysqld does not know in advance that your TEXT data is actually quite short, so it uses the value of your max_sort_length server parameter for sorting TEXT columns.  If you know that your data is significantly shorter than this (the default is 1024 bytes, per TEXT column being sorted, per row) then you can set this to a smaller value.