Bug #45428 slow queries should be fast
Submitted: 10 Jun 2009 11:25 Modified: 27 Dec 2012 19:28
Reporter: d di (Basic Quality Contributor) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S4 (Feature request)
Version:5.0.67 OS:Linux
Assigned to: CPU Architecture:Any
Tags: qc

[10 Jun 2009 11:25] d di
Description:
On a table with this definition (anonymized):

CREATE TABLE `cm_un` (
   `1_ui32`   int unsigned NOT NULL auto_increment,
   `2_dt`     datetime NOT NULL,
   `3_ui16`   smallint unsigned default NULL,
   `4_si16`   smallint default NULL,
   `5_ui16`   smallint unsigned default NULL,
   `6_vc255`  varchar(255) collate utf8_general_ci default NULL,
   `7_mt`     mediumtext collate utf8_general_ci,
   `8_vc255`  varchar(255) collate utf8_general_ci default NULL,
   `9_ui16`   smallint unsigned NOT NULL default '0',
  `10_ui8`    tinyint unsigned NOT NULL default '0',
  `11_lt`     longtext collate utf8_general_ci NOT NULL,
  `12_ui8`    tinyint unsigned NOT NULL default '0',
  `13_t`      text collate utf8_danish_ci,
  PRIMARY KEY  (`1_ui32`)
) ENGINE=MyISAM

containing 1 GiB of data in 16.000 rows, where 99.5% of the data is in the 13_t LOB column,

Running this query:

SELECT
  `1_ui32`,
  `2_dt`,
FROM
 `cm_un`
WHERE
 `12_ui8` = '0'

from a cold cache, and with all 16.000 rows having field values in the 12_ui8 column ranging from 1-5, the query returns no rows as it should, but:

The execution time is a whopping 10 seconds.  This is on a 100 MiB/s SAN, meaning that the entire 1 GiB table is spooled to RAM by mysqld, as opposed to just the main row data with columns 1 through 12 (50 MiB) and the index (165 KiB).  The table has been optimized with OPTIMIZE TABLE before running the query.

Under optimal conditions, only the primary key index (containing 1_ui32) and all of the row data (containing 2_dt and 12_ui8) should be brought into the working set, there is no need to put the BLOB columns (already stored separately in the MYD file) into RAM.

The table was designed based on the notion that MySQL Server would execute the above query in an optimal fashion.  Is there a good reason why this is not the case?

Or is this just a bug of some sort ;-)

How to repeat:
1) Create table (see definition above).
2) Use a filler script to fill the table with random test data, 16.000 rows with approximately 1 GiB of data in the 13_t column and nothing of significance in other columns.
3) Run the SELECT query (see above), with a clause that selects no rows.
4) Measure response time with cold caches (or use other debugging means to see what's pulled from disk).

Suggested fix:
(Not sure where the problem is.)
[10 Jun 2009 11:33] d di
By the way,
 - block boundaries have been aligned all the way from the SAN through VMware to the partition used by the filesystem where the MYD files reside, on 2 MiB boundaries.
 - the underlying VMFS filesystem uses 8 MiB blocks, and the SAN underlying that uses 64 KiB chunks for the RAID10 arrays.  I'm not sure how the caching mechanism in the VMware kernel works, but it probably does some read-ahead.
[10 Jun 2009 11:45] d di
Oops.  It is the 11_lt column that contains the bulk of the data, not the 13_t column.  Nonetheless...
[10 Jun 2009 12:06] Valeriy Kravchuk
Thank you for the problem report. Please, send the results of

explain SELECT
  `1_ui32`,
  `2_dt`,
FROM
 `cm_un`
WHERE
 `12_ui8` = '0';
[10 Jun 2009 12:12] d di
Here is the requested EXPLAIN output.

id  select_type  table  type  possible_keys  key     key_len  ref     rows   Extra
----------------------------------------------------------------------------------
1   SIMPLE       cm_u   ALL   (NULL)         (NULL)  (NULL)   (NULL)  15985  Using where

(Also, the distribution is 99.7% of data in 11_lt, not 99.5% as earlier stated.)
[10 Jun 2009 13:48] Valeriy Kravchuk
Please, add index on  `12_ui8`  column you are using in WHERE and check with EXPLAIN if it will be used. Your query is forced to read entire table now.
[10 Jun 2009 14:09] d di
I should probably mention at this point that I'm looking for an answer for the specific question I posed in the original report.

The actual queries running against the table are many.  The queries pull out a subsection of all of the columns, not just the two columns mentioned in the example (naturally - otherwise there would be no need to *have* the columns at all, were they not being selected).

The SQL query I submitted was purposefully cut down to be as minimal as possible and does not exist in the real system.  It's purpose is to demonstrate a bug.  It is, in other words, completely without merit to try to optimize that particular query, by indexing all of the columns that it uses (namely the mentioned 12_ui8).

We need to know what the underlying reason for the problem is; why BLOB pages are spooled to RAM by the server when it does not need to do so to decode the fields that processing the query necessitates.

On a side note, if you run the given CREATE TABLE against your own 5.0.67 server, you can easily get the output of any EXPLAIN that you want to see.  Is it perhaps a bit wasteful and overly communicative to write a request in this issue every time you want to see a new explain, just to have me do it?
[10 Jun 2009 14:50] Valeriy Kravchuk
I try to understand what your real problem is. You can not expect for any query accessing all rows in the table to be fast. So, in general, you should create indexes on columns used in your real slow queries. The fewer rows you read from the tables, the faster your queries will be. If you read them ALL, queries will become slow eventually.

Anyway, based your comment now I see that your problem is: 

"why BLOB pages are spooled
to RAM by the server when it does not need to do so to decode the fields that processing
the query necessitates."
 
In this case you have MyISAM table. When access path ALL is used, entire .MYD file is read, row by row, including BLOB columns content. This is by design. So, again, you need to add proper indexes.
[10 Jun 2009 15:35] d di
> In this case you have MyISAM table. When access path ALL is used,
> entire .MYD file is read, row by row, including BLOB columns content.

Ah, there's the problem.  Thanks for helping me see it.

> This is by design.

This particular optimization is not part of the MyISAM design, fair enough.  I'll change the issue into a feature request.

(Uhm, but that sort of begs the question: if this optimization is not part of the design, why on earth bother putting BLOB in a separate place in the MYD file to begin with?)
[10 Jun 2009 19:39] Valeriy Kravchuk
> (Uhm, but that sort of begs the question: if this optimization is not part of the design,
> why on earth bother putting BLOB in a separate place in the MYD file to begin with?)

Why do you think that BLOB is in some separate place in the .MYD file? OK, BLOB data are stored separately when record is in memory already, but on disk BLOB is stored in record, just as VARCHAR, for example. Each record has variable length and thus server can not just skip to next record by position after reading only those few columns you selected.
[11 Jun 2009 13:11] d di
> Why do you think that BLOB is in
> some separate place in the .MYD file?

Good question.

There is already VARCHAR and VARBINARY for text strings and binary strings.  Why would you want special large (LOB) versions of those data types, if you're not going to treat them in any special way suitable for large objects anyway?

I assumed that they were going to be stored out of the way of normal operation, and streamed only when necessary.  Just seemed obvious, I guess..
[6 Aug 2009 11:04] Valeriy Kravchuk
Please, check ARCHIVE storage engine (http://dev.mysql.com/doc/refman/5.1/en/archive-storage-engine.html). It has the feature of skipping BLOB columns while reading if they are not needed.
[6 Aug 2009 12:01] d di
The ARCHIVE engine is wholly inappropriate for what we're using MySQL for, but thanks for the suggestion.
[27 Dec 2012 19:28] MySQL Verification Team
As mentioned, MyISAM can only read *entire* rows at a time, whether you requested one column or entire row.

Some info:
http://dev.mysql.com/worklog/task/?id=1655

http://dev.mysql.com/doc/refman/5.5/en/optimizing-queries-myisam.html
"It is normally not useful to split a table into different tables....."