Bug #75753 All MyISAM partitions opened, despite optimizer's partition pruning
Submitted: 3 Feb 2015 17:15 Modified: 6 Feb 2015 12:35
Reporter: Geoff Montee Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.1+ OS:Any
Assigned to: Jon Stephens CPU Architecture:Any

[3 Feb 2015 17:15] Geoff Montee
Description:
In certain cases, the optimizer can eliminate partitions from a query plan, based on conditions in the WHERE clause. This is called partition pruning:

http://dev.mysql.com/doc/refman/5.5/en/partitioning-pruning.html

However, even when partition pruning is in use, the MYI and MYD files of all MyISAM partitions in a given table are opened, even if some of them won't be read.

This causes problems for users who have tables with a lot of MyISAM partitions. A simply query targeting one partition of a table with 100 partitions can easily create thousands of file descriptors if there are a few concurrent users. 

How to repeat:
I'll use one of the example tables from the partition pruning documentation:

CREATE TABLE t2 (
    fname VARCHAR(50) NOT NULL,
    lname VARCHAR(50) NOT NULL,
    region_code TINYINT UNSIGNED NOT NULL,
    dob DATE NOT NULL
) ENGINE=MyISAM
PARTITION BY RANGE( YEAR(dob) ) (
    PARTITION d0 VALUES LESS THAN (1970) ENGINE=MyISAM,
    PARTITION d1 VALUES LESS THAN (1975) ENGINE=MyISAM,
    PARTITION d2 VALUES LESS THAN (1980) ENGINE=MyISAM,
    PARTITION d3 VALUES LESS THAN (1985) ENGINE=MyISAM,
    PARTITION d4 VALUES LESS THAN (1990) ENGINE=MyISAM,
    PARTITION d5 VALUES LESS THAN (2000) ENGINE=MyISAM,
    PARTITION d6 VALUES LESS THAN (2005) ENGINE=MyISAM,
    PARTITION d7 VALUES LESS THAN MAXVALUE ENGINE=MyISAM
);

If we insert a row:

INSERT INTO t2 VALUES ('John', 'Doe', 1, '1982-06-23');

Now let's SELECT from the table. We know partition pruning will be used in this particular query:

mysql> EXPLAIN PARTITIONS SELECT * FROM t2 WHERE dob = '1982-06-23';
+----+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
| id | select_type | table | partitions | type   | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | t2    | d3         | system | NULL          | NULL | NULL    | NULL |    1 |       |
+----+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

The optimizer says the results will only be in partition "d3", but let's see what files MySQL actually opens when the query is executed.

Shell 1:

mysql> FLUSH TABLES;
Query OK, 0 rows affected (0.00 sec)

Shell 2:

[gmontee@localhost ~]$ sudo strace -p `pidof mysqld` -ff -o `pidof mysqld`.out -e trace=open
Process 4295 attached with 17 threads

Shell 1:

mysql> SELECT * FROM t2 WHERE dob = '1982-06-23';
+-------+-------+-------------+------------+
| fname | lname | region_code | dob        |
+-------+-------+-------------+------------+
| John  | Doe   |           1 | 1982-06-23 |
+-------+-------+-------------+------------+
1 row in set (0.00 sec)

Shell 2:

^CProcess 4295 detached
Process 4297 detached
Process 4298 detached
Process 4299 detached
Process 4300 detached
Process 4301 detached
Process 4302 detached
Process 4303 detached
Process 4304 detached
Process 4305 detached
Process 4306 detached
Process 4308 detached
Process 4309 detached
Process 4310 detached
Process 4311 detached
Process 4316 detached
Process 4319 detached
[gmontee@localhost ~]$ cat 4295.out.4319
open("./tmp/t2.frm", O_RDONLY)          = 16
open("./tmp/t2.par", O_RDONLY)          = 17
open("./tmp/t2.par", O_RDONLY)          = 16
open("/var/lib/mysql/tmp/t2#P#d0.MYI", O_RDWR) = 16
open("./tmp/t2#P#d0.MYD", O_RDWR)       = 17
open("/var/lib/mysql/tmp/t2#P#d1.MYI", O_RDWR) = 18
open("./tmp/t2#P#d1.MYD", O_RDWR)       = 19
open("/var/lib/mysql/tmp/t2#P#d2.MYI", O_RDWR) = 20
open("./tmp/t2#P#d2.MYD", O_RDWR)       = 21
open("/var/lib/mysql/tmp/t2#P#d3.MYI", O_RDWR) = 22
open("./tmp/t2#P#d3.MYD", O_RDWR)       = 23
open("/var/lib/mysql/tmp/t2#P#d4.MYI", O_RDWR) = 24
open("./tmp/t2#P#d4.MYD", O_RDWR)       = 25
open("/var/lib/mysql/tmp/t2#P#d5.MYI", O_RDWR) = 26
open("./tmp/t2#P#d5.MYD", O_RDWR)       = 27
open("/var/lib/mysql/tmp/t2#P#d6.MYI", O_RDWR) = 28
open("./tmp/t2#P#d6.MYD", O_RDWR)       = 29
open("/var/lib/mysql/tmp/t2#P#d7.MYI", O_RDWR) = 30
open("./tmp/t2#P#d7.MYD", O_RDWR)       = 31

Suggested fix:
If partition pruning is in use, it is a waste to open partitions that will not be checked.
[4 Feb 2015 10:32] MySQL Verification Team
Hello Geoff Montee,

Thank you for the bug report.
Imho this is current and original design, the open/close of partitions works the same way as open/close of tables (since partitions can be seen as sub-tables). So all partitions are opened when the table is opened and the table stays open until it is flushed from the table cache. 

Also, explained nicely by Mattias Jonsson in Bug #64498

"This is a design limitation, MyISAM have one file handle open for the .MYI file shared by all instances of one table, and one file handle open for each table instance for the .MYD file. And the ha_partition generic partitioning engine always open all partitions when opening a table/handler".. more details in #64498

Imho only thing which is missing is explicit details on this in relevant pages of manual.

Marking this as doc bug for now.

Thanks,
Umesh
[6 Feb 2015 12:35] Jon Stephens
This is a duplicate of BUG#64498.