Bug #56153 | DELETE + JOIN is extremely slow if one table has >2 GB of binary data | |
---|---|---|---
Submitted: | 20 Aug 2010 17:34 | Modified: | 23 Oct 2010 9:56
Reporter: | Craig Holmquist | Email Updates: |
Status: | No Feedback | Impact on me: |
Category: | MySQL Server | Severity: | S5 (Performance)
Version: | 5.0.87 | OS: | Windows (XP SP3 32-bit)
Assigned to: | | CPU Architecture: | Any
[20 Aug 2010 17:34]
Craig Holmquist
[20 Aug 2010 17:35]
Craig Holmquist
Test script
Attachment: slowdelete.py (text/plain), 1.54 KiB.
[24 Aug 2010 7:49]
Valeriy Kravchuk
Please send the content of your my.ini file and the EXPLAIN results for the SELECT query in your environment.
[24 Aug 2010 13:22]
Craig Holmquist
INI file
Attachment: my.ini (application/octet-stream, text), 206 bytes.
[24 Aug 2010 13:23]
Craig Holmquist
EXPLAIN results for SELECT in script
Attachment: select-explain.csv (application/vnd.ms-excel, text), 270 bytes.
[24 Aug 2010 13:24]
Craig Holmquist
It seems like the value in the script I uploaded (450000) may not trigger the bug consistently. I tested again this morning (on the same system with the same conditions) but I had to move that value up to 750000 to trigger the bug.
[25 Aug 2010 3:51]
Valeriy Kravchuk
The following EXPLAIN results for the SELECT look fine:

id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
---|---|---|---|---|---|---|---|---|---
1 | SIMPLE | t2 | index | | PRIMARY | 4 | | 5000 | Using index
1 | SIMPLE | t1 | eq_ref | PRIMARY | PRIMARY | 4 | testslowdelete.t2.mainid | 1 | Using where; Using index; Not exists

Yes, the entire primary key of table t2 is scanned, but "Using index" means that the table's data (and this is a MyISAM table) are NOT read from disk. With DELETE the situation is likely different, and the entire rows are read from disk. To check this assumption, please send the results of EXPLAIN for the same SELECT, but with SELECT t2.* instead of just one indexed column.
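[Editor's note: the distinction Valeriy draws here between an index-only ("Using index") read and a full-row read can be illustrated outside MySQL. The sketch below uses SQLite's stdlib `sqlite3` module as a stand-in, with a guessed schema based on the table and column names (`t1`, `t2`, `mainid`) visible in the EXPLAIN output; it is not the reporter's script. A query that touches only indexed columns is answered from the index alone (SQLite reports "COVERING INDEX"), while `SELECT *` must fetch the whole row, BLOB included.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Guessed schema, loosely modeled on the names in the EXPLAIN output.
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE t2 (id INTEGER PRIMARY KEY, mainid INTEGER, data BLOB)")
conn.execute("CREATE INDEX idx_mainid ON t2 (mainid)")

def plan(sql):
    # EXPLAIN QUERY PLAN is SQLite's rough analogue of MySQL's EXPLAIN;
    # the last column of each row is the human-readable plan detail.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Only the indexed column is needed: the index alone answers the query.
print(plan("SELECT mainid FROM t2 WHERE mainid = 1"))
# Every column is needed: the engine must read the full row, BLOB included.
print(plan("SELECT * FROM t2 WHERE mainid = 1"))
```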
[25 Aug 2010 12:50]
Craig Holmquist
EXPLAIN results for SELECT with t2.*
Attachment: select-explain-t2all.csv (application/vnd.ms-excel, text), 249 bytes.
[25 Aug 2010 13:27]
Valeriy Kravchuk
We see the following in the EXPLAIN results:

id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
---|---|---|---|---|---|---|---|---|---
1 | SIMPLE | t2 | ALL | | | | | 5000 |
1 | SIMPLE | t1 | eq_ref | PRIMARY | PRIMARY | 4 | testslowdelete.t2.mainid | 1 | Using where; Using index; Not exists

This is how the DELETE is executed: in this case all rows are read, in their entirety, from the t2 (big) table. We can file an optimizer feature request to read only the PK/indexed column(s) from the table you delete rows from when no other columns from it are actually used, as in cases like this. Do you agree?
[25 Aug 2010 13:54]
Craig Holmquist
Are you sure that's the problem? If that were the case I would expect the performance to get progressively worse as the row size got larger, not to suddenly get much worse (1 s -> 90 s) once the row size reached a certain threshold. However, I would say that this is definitely something that should be optimized; there's no reason to read in an entire 750K row if you're only looking at a few bytes.
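[Editor's note: one common workaround for a slow multi-table DELETE of this shape is to split it in two: first collect the orphaned primary keys via a join that touches only indexed columns, then delete by PK, so the wide BLOB rows are never read just to find matches. The sketch below is an editorial illustration with SQLite standing in for MySQL and a guessed `t1`/`t2`/`mainid` schema; it is not from the bug report.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Guessed schema: t2.mainid logically references t1.id; data is the big BLOB.
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE t2 (id INTEGER PRIMARY KEY, mainid INTEGER, data BLOB)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(i,) for i in range(1, 4)])
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?, ?)",
    [(1, 1, b"x"), (2, 99, b"y"), (3, 100, b"z")],  # rows 2 and 3 are orphans
)

# Step 1: find orphaned PKs; the join needs only id/mainid, never the BLOB.
orphans = [row[0] for row in conn.execute(
    "SELECT t2.id FROM t2 LEFT JOIN t1 ON t2.mainid = t1.id WHERE t1.id IS NULL"
)]

# Step 2: delete by primary key, one narrow lookup per doomed row.
conn.executemany("DELETE FROM t2 WHERE id = ?", [(i,) for i in orphans])

remaining = [r[0] for r in conn.execute("SELECT id FROM t2 ORDER BY id")]
print(remaining)  # → [1]
```

Whether this sidesteps the reported slowdown on MySQL 5.0 with >2 GB MyISAM tables was not tested in the thread; it simply avoids asking the DELETE itself to scan the wide table.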
[23 Sep 2010 9:56]
Susanne Ebrecht
Are there foreign keys on the table? Consider that when you delete a row, the referential action could be RESTRICT or CASCADE, and it also needs to be checked whether any key constraints would be violated, and so on. Could this be the problem?
[23 Oct 2010 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".