MySQL Bugs: #56153: DELETE + JOIN is extremely slow if one table has >2 GB of binary data

Bug #56153	DELETE + JOIN is extremely slow if one table has >2 GB of binary data
Submitted:	20 Aug 2010 17:34	Modified:	23 Oct 2010 9:56
Reporter:	Craig Holmquist	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Server	Severity:	S5 (Performance)
Version:	5.0.87	OS:	Windows (XP SP3 32-bit)
Assigned to:		CPU Architecture:	Any

Description:
The attached script creates two tables, one with a single primary key INT, the other with a primary key INT and a LONGBLOB.  After inserting data into both tables, it runs the following operations:

- A SELECT that joins the two tables
- A DELETE on the second table only
- A DELETE that joins the two tables

The SELECT and the first delete are fast; the second DELETE is extremely slow.  I'm seeing the first two statements execute nearly instantaneously, while the third takes 50s or longer.  The SQL is such that none of these statements actually match any rows.

It seems like performance degrades severely once the table crosses 2 GB of data:

1.5 GB - 0.8 s
Just under 2 GB - 1.0 s
Just over 2 GB - 94 s

In contrast, the initial insertion seems to scale linearly with the amount of data.

The file system used in this test was NTFS.  The CPU was a Core 2 Duo E6850.  The tables are MyISAM tables.

How to repeat:
Use the attached python script.

Suggested fix:
Ideally, a DELETE statement that doesn't match any rows should have similar performance characteristics to an equivalent SELECT, regardless of the table size.

Test script

Attachment: slowdelete.py (text/plain), 1.54 KiB.

Please, send your my.ini file content and EXPLAIN results for the SELECT query in your environment.

INI file

Attachment: my.ini (application/octet-stream, text), 206 bytes.

EXPLAIN results for SELECT in script

Attachment: select-explain.csv (application/vnd.ms-excel, text), 270 bytes.

It seems like the value in the script I uploaded (450000) may not trigger the bug consistently.  I tested again this morning (on the same system with the same conditions) but I had to move that value up to 750000 to trigger the bug.

The following explain results for SELECT looks nice: 

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t2	index		PRIMARY	4		5000	Using index
1	SIMPLE	t1	eq_ref	PRIMARY	PRIMARY	4	testslowdelete.t2.mainid	1	Using where; Using index; Not exists

Yes, entire primary key for table t2 is scanned, but "Using index" means that table's data (and this is MyISAM table) are NOT read from disk. With DELETE situation is, likely, different, and entire data are read (from disk). To check this assumption, please, send the results of EXPLAIN for the same SELECT, but with SELECT t2.* instead of just one indexed column.

EXPLAIN results for SELECT with t2.*

Attachment: select-explain-t2all.csv (application/vnd.ms-excel, text), 249 bytes.

We see the following in EXPLAIN results:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t2	ALL					5000	
1	SIMPLE	t1	eq_ref	PRIMARY	PRIMARY	4	testslowdelete.t2.mainid	1	Using where; Using index; Not exists

This is how DELETE is executed: all rows, entirely, are read from the t2 (big) table in this case.

We can make an optimizer feature request to read only PK/indexed column(s) from the table you delete rows from if you do not really use any other columns from it in cases like this. Do you agree?

Are you sure that's the problem?  If that were the case I would expect the performance to get progressively worse as the row size got larger, not to suddenly get much worse (1 s -> 90 s) once the row size reached a certain threshold.

However, I would say that this is definitely something that should be optimized; there's no reason to read in an entire 750K row if you're only looking at a few bytes.

Are there keys on the table?

Consider, when you delete a row you could do it restrict or cascade.

Also it needs to get checked if keys got violated and so on?

Could this be the problem?

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".