Bug #49861 RANGE COLUMNS does not work as it should
Submitted: 21 Dec 2009 22:01 Modified: 29 Dec 2009 13:35
Reporter: Cyril SCETBON Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:5.5 OS:Linux (debian etch)
Assigned to: CPU Architecture:Any
Tags: partitioning, range columns
Triage: Needs Triage: D3 (Medium)

[21 Dec 2009 22:01] Cyril SCETBON
Description:
At http://dev.mysql.com/doc/refman/5.5/en/partitioning-range.html, it's said that when using more than one column, only the first one is taking into account. 

- But if it's true what the following order should produce ?

CREATE TABLE t1_col (
valeur TINYINT UNSIGNED NOT NULL,
quand DATE NOT NULL
)
PARTITION BY RANGE COLUMNS(valeur,quand) (
    PARTITION p0 VALUES LESS THAN (10,'2006-10-02'),
    PARTITION p1 VALUES LESS THAN (10,'2008-04-12'),
    PARTITION p2 VALUES LESS THAN (100,MAXVALUE),
    PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);

- why records with valeur=100 is stored in partition p2 ?? it should be stored in p3, as 100 is not less than 100, no ?

How to repeat:
see above

Suggested fix:
none
[22 Dec 2009 6:34] Sveta Smirnova
Thank you for the report.

Verified as described:

CREATE TABLE t1_col (
valeur TINYINT UNSIGNED NOT NULL,
quand DATE NOT NULL
)
PARTITION BY RANGE COLUMNS(valeur,quand) (
PARTITION p0 VALUES LESS THAN (10,'2006-10-02'),
PARTITION p1 VALUES LESS THAN (10,'2008-04-12'),
PARTITION p2 VALUES LESS THAN (100,MAXVALUE),
PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);
insert into t1_col values(100, '2007-10-02'), (100, '2005-10-02'), (100, '2009-10-02');
explain partitions select * from t1_col;
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t1_col  p0,p1,p2,p3     ALL     NULL    NULL    NULL    NULL    3
explain partitions select * from t1_col where valeur=100 and quand = '2007-10-02';
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t1_col  p2      ALL     NULL    NULL    NULL    NULL    3       Using where
explain partitions select * from t1_col where valeur=100 and quand = '2005-10-02';
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t1_col  p2      ALL     NULL    NULL    NULL    NULL    3       Using where
explain partitions select * from t1_col where valeur=100 and quand = '2009-10-02';
id      select_type     table   partitions      type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  t1_col  p2      ALL     NULL    NULL    NULL    NULL    3       Using where
[22 Dec 2009 9:12] Mikael Ronström
The documentation is incorrect, the partitioning
decision is based on all fields in the partition
definition for partition based on columns. It will
first check the first field, if equal check the
second field.
[22 Dec 2009 9:43] Jon Stephens
This is not a bug, as far as a can tell.

The documentation says: 

1. *Placement of rows into partitions* is determined by only the first column listed in COLUMNS(), as demonstrated here:

jon@grindval:~/bin/mysql-trunk/bin> ./mysql -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.1-m2 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE DATABASE p;
Query OK, 1 row affected (0.04 sec)

mysql> USE p;
Database changed

mysql> CREATE TABLE rc1  (                                                                                                                                                                  ->   c1 INT,
    ->   c2 INT
    -> )
    -> PARTITION BY RANGE COLUMNS (c1, c2) (
    ->   PARTITION p0 VALUES LESS THAN (5, 5),
    ->   PARTITION p1 VALUES LESS THAN (10, 10),
    ->   PARTITION p2 VALUES LESS THAN (15, 15),
    ->   PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
    -> );
Query OK, 0 rows affected (0.08 sec)

mysql> INSERT INTO rc1 VALUES (3, 4), (3, 5), (3, 6), (3, 10), (3, 12);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM rc1;
+------+------+
| c1   | c2   |
+------+------+
|    3 |    4 |
|    3 |    5 |
|    3 |    6 |
|    3 |   10 |
|    3 |   12 |
+------+------+
5 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS
     >   FROM INFORMATION_SCHEMA.PARTITIONS
     >   WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc1';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          5 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.02 sec)

--> All rows are placed in p0, regardless of the value of c2, because c1 = 3 < 5 for every row that was inserted. This is exactly as stated in the documentation:

http://dev.mysql.com/doc/refman/5.5/en/partitioning-range.html -> "Placement of rows into partitions is determined by the first column
listed in the list supplied in COLUMNS(column_list) and the first
value in each VALUES LESS THAN  clause."

2. *Partition pruning* is determined by all columns listed in COLUMNS(), as shown in the submitter's own example, also as stated in the documentation:

http://dev.mysql.com/doc/refman/5.5/en/mysql-nutshell.html -> "For tables defined using these partitioning types, partition pruning can now optimize queries with WHERE conditions that use multiple comparisons between (different) column values and constants, such as a = 10 AND b > 5  or a < "2005-11-25" AND b = 10 AND c = 50."
[22 Dec 2009 9:51] Cyril SCETBON
yeah but in your sample the values are the same for both columns !
if you try with different value it does not seem to be working. How do you explain the fact that records with valeur=100 are stored in partition p2 in my sample if it's really based on the first column ?
[22 Dec 2009 10:13] Jon Stephens
mysql> CREATE TABLE rc2  (
    ->   c1 INT,
    ->   c2 INT
    -> )
    -> PARTITION BY RANGE COLUMNS (c1, c2) (
    ->   PARTITION p0 VALUES LESS THAN (5, 10),
    ->   PARTITION p1 VALUES LESS THAN (10, 20),
    ->   PARTITION p2 VALUES LESS THAN (15, 30),
    ->   PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
    -> );
Query OK, 0 rows affected (0.09 sec)

mysql>
mysql> INSERT INTO rc2 VALUES (3, 4), (3, 5), (3, 6), (3, 10), (3, 12);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    3 |    4 |
|    3 |    5 |
|    3 |    6 |
|    3 |   10 |
|    3 |   12 |
+------+------+
5 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          5 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

mysql> INSERT INTO rc2 VALUES (3, 14), (3, 15), (3, 16), (3, 20), (3, 22);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    3 |    4 |
|    3 |    5 |
|    3 |    6 |
|    3 |   10 |
|    3 |   12 |
|    3 |   14 |
|    3 |   15 |
|    3 |   16 |
|    3 |   20 |
|    3 |   22 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |         10 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)
[22 Dec 2009 10:19] Jon Stephens
My point is that the 2nd column value is not affecting placement.

Rather it appears, you've uncovered a bug where, in some cases, LESS THAN is being evaluated as '<=' instead of '<'.

I'm setting this back to Server:Partitioning/Verified and removing myself as assignee.
[22 Dec 2009 10:22] Jon Stephens
Observe:

mysql> UPDATE rc2 SET c1 = c1 + 1;
Query OK, 10 rows affected (0.00 sec)
Rows matched: 10  Changed: 10  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    4 |    4 |
|    4 |    5 |
|    4 |    6 |
|    4 |   10 |
|    4 |   12 |
|    4 |   14 |
|    4 |   15 |
|    4 |   16 |
|    4 |   20 |
|    4 |   22 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |         10 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

mysql> UPDATE rc2 SET c1 = c1 + 1;
Query OK, 10 rows affected (0.00 sec)
Rows matched: 10  Changed: 10  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    5 |    4 |
|    5 |    5 |
|    5 |    6 |
|    5 |   10 |
|    5 |   12 |
|    5 |   14 |
|    5 |   15 |
|    5 |   16 |
|    5 |   20 |
|    5 |   22 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          3 |
| p1             |          7 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.03 sec)

mysql> UPDATE rc2 SET c1 = c1 + 1;
Query OK, 10 rows affected (0.00 sec)
Rows matched: 10  Changed: 10  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    6 |   10 |
|    6 |   12 |
|    6 |   14 |
|    6 |   15 |
|    6 |   16 |
|    6 |   20 |
|    6 |   22 |
|    6 |    4 |
|    6 |    5 |
|    6 |    6 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          0 |
| p1             |         10 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)
[22 Dec 2009 10:23] Cyril SCETBON
CREATE TABLE `rc3` (
  `c1` int(11) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL
) 
PARTITION BY RANGE  COLUMNS(c1,c2)
(
 PARTITION p0 VALUES LESS THAN (100,10),
 PARTITION p1 VALUES LESS THAN (150,20),
 PARTITION p2 VALUES LESS THAN (200,30),
 PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);

INSERT INTO rc3 VALUES (100, 4);

SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = 'rc3';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          1 | <---- you see that ?
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+

As Mickael said, storing is not based only on the first column or it's not working correctly. And as Sveta noticed it does not work as it should in my first sample (records with value=100, again I know...)
[22 Dec 2009 10:35] Jon Stephens
mysql> TRUNCATE TABLE rc1;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO rc1 VALUES (3, 4), (3, 5), (3, 6), (3, 10), (3, 12);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * FROM rc1;
+------+------+
| c1   | c2   |
+------+------+
|    3 |    4 |
|    3 |    5 |
|    3 |    6 |
|    3 |   10 |
|    3 |   12 |
+------+------+
5 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          0 |
| p1             |         10 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.01 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc1';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          5 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

mysql> UPDATE rc1 SET c1 = c1 + 1;
Query OK, 5 rows affected (0.00 sec)
Rows matched: 5  Changed: 5  Warnings: 0

mysql> select * FROM rc1;
+------+------+
| c1   | c2   |
+------+------+
|    4 |    4 |
|    4 |    5 |
|    4 |    6 |
|    4 |   10 |
|    4 |   12 |
+------+------+
5 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc1';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          5 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

mysql> UPDATE rc1 SET c1 = c1 + 1;
Query OK, 5 rows affected (0.00 sec)
Rows matched: 5  Changed: 5  Warnings: 0

mysql> select * FROM rc1;
+------+------+
| c1   | c2   |
+------+------+
|    5 |    4 |
|    5 |    5 |
|    5 |    6 |
|    5 |   10 |
|    5 |   12 |
+------+------+
5 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc1';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          1 |
| p1             |          4 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

mysql> UPDATE rc1 SET c1 = c1 + 1;
Query OK, 5 rows affected (0.00 sec)
Rows matched: 5  Changed: 5  Warnings: 0

mysql> select * FROM rc1;
+------+------+
| c1   | c2   |
+------+------+
|    6 |    5 |
|    6 |    6 |
|    6 |   10 |
|    6 |   12 |
|    6 |    4 |
+------+------+
5 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc1';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          0 |
| p1             |          5 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)
[22 Dec 2009 10:45] Cyril SCETBON
you said :

-> 

mysql> UPDATE rc2 SET c1 = c1 + 1;
Query OK, 10 rows affected (0.00 sec)
Rows matched: 10  Changed: 10  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    4 |    4 |
|    4 |    5 |
|    4 |    6 |
|    4 |   10 |
|    4 |   12 |
|    4 |   14 |
|    4 |   15 |
|    4 |   16 |
|    4 |   20 |
|    4 |   22 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE
TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |         10 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

I say -> I agree

you said :

->

mysql> UPDATE rc2 SET c1 = c1 + 1;
Query OK, 10 rows affected (0.00 sec)
Rows matched: 10  Changed: 10  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    5 |    4 |
|    5 |    5 |
|    5 |    6 |
|    5 |   10 |
|    5 |   12 |
|    5 |   14 |
|    5 |   15 |
|    5 |   16 |
|    5 |   20 |
|    5 |   22 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE
TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          3 |
| p1             |          7 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.03 sec)

I say : what's happening here ? if you say that only the first column is used to decide where to store data.

you said :

->

mysql> UPDATE rc2 SET c1 = c1 + 1;
Query OK, 10 rows affected (0.00 sec)
Rows matched: 10  Changed: 10  Warnings: 0

mysql> select * FROM rc2;
+------+------+
| c1   | c2   |
+------+------+
|    6 |   10 |
|    6 |   12 |
|    6 |   14 |
|    6 |   15 |
|    6 |   16 |
|    6 |   20 |
|    6 |   22 |
|    6 |    4 |
|    6 |    5 |
|    6 |    6 |
+------+------+
10 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE
TABLE_SCHEMA = 'p' AND TABLE_NAME = 'rc2';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          0 |
| p1             |         10 |
| p2             |          0 |
| p3             |          0 |
+----------------+------------+
4 rows in set (0.00 sec)

Why having 6 instead of 5 change the way data is stored if only the first column is used and c1 in (5,6) should be stored in p1 ??

I just want to understand the way it works.

thx
[22 Dec 2009 11:40] Jon Stephens
Cyril,

Okay, part of what happened was (a) we had a disconnect between Development and Docs because not all parties have the same native language, (b) there was another disconnect because some folks were not precise in their descriptions, (c) I am sometimes a bit clueless, and (d) there is an actual (but subtle) software bug.

I'll have to think of a better way to write this in the documentation, but the rule might be expressed something like this,

"If the value for the first column in 2 partition definitions is the same, then the value for the second column helps determine the placement of the row" (and if the first and second column values in the definition are the same, then the value of the third column... and so on...)".
[22 Dec 2009 11:49] Jon Stephens
So if it's okay, I'll handle the documentation problem as BUG#49875, and leave this bug report for the "'<' vs '<=' problem", which (IIRC) can be summed up as follows:

When using RANGE COLUMNS partitioning, when the second or later column is used to determine which partition a row should be placed in, LESS THAN gets interpreted as  'less than or equal to'.

Example courtesy Giuseppe Maxia:

--------------
DROP TABLE IF EXISTS t1
--------------
Query OK, 0 rows affected

--------------
CREATE TABLE t1 (
          a int(11) DEFAULT NULL,
          b int(11) DEFAULT NULL
) ENGINE=MyISAM 
PARTITION BY RANGE  COLUMNS(a,b)
(PARTITION p1 VALUES LESS THAN (10,20) ,
 PARTITION p2 VALUES LESS THAN (10,30) ,
 PARTITION p3 VALUES LESS THAN (100,MAXVALUE), 
 PARTITION p4 VALUES LESS THAN (MAXVALUE,MAXVALUE))
--------------

Query OK, 0 rows affected

--------------
INSERT INTO t1 VALUES 
( 1, 10), (10, 10), (10, 20),
(10, 30), (10, 50), (20, 30)
--------------

Query OK, 6 rows affected
Records: 6  Duplicates: 0  Warnings: 0

--------------
select 
    partition_name part,
    partition_expression expr, 
    partition_description descr, 
    table_rows 
FROM 
    INFORMATION_SCHEMA.partitions 
WHERE 
    TABLE_SCHEMA = schema() 
    AND TABLE_NAME='t1'
--------------

+------+------+-------------------+------------+
| part | expr | descr             | table_rows |
+------+------+-------------------+------------+
| p1   | a,b  | 10,20             |          2 |
| p2   | a,b  | 10,30             |          1 |
| p3   | a,b  | 100,MAXVALUE      |          3 |
| p4   | a,b  | MAXVALUE,MAXVALUE |          0 |
+------+------+-------------------+------------+
4 rows in set

--------------
SELECT a,b,
CASE 
    WHEN a < 10 
        THEN 'p1'
    WHEN a <= 10 AND b < 20
        THEN 'p1'
    WHEN a <= 10 and b < 30
        THEN 'p2'
    WHEN a <= 100 and b < 0xfffff
        THEN 'p3'
    WHEN a <= 0xfffff
        THEN 'p4'
ELSE
    '-'
END AS p
FROM t1
--------------

+------+------+------+
| a    | b    | p    |
+------+------+------+
|    1 |   10 | p1   |
|   10 |   10 | p1   |
|   10 |   20 | p2   |
|   10 |   30 | p3   |
|   10 |   50 | p3   |
|   20 |   30 | p3   |
+------+------+------+
6 rows in set
[22 Dec 2009 13:25] Cyril SCETBON
thank you jon, I like this explanation !
[22 Dec 2009 16:11] Jon Stephens
Okay, I've been convinced that interpreting LESS THAN as '<=', although it is not intuitive, is indeed the correct behaviour.

So I'm closing this as Not A Bug, and I'll use BUG#49875 to handle fixing the documentation.
[23 Dec 2009 8:19] Cyril SCETBON
you're right, it is not really intuitive ...
[23 Dec 2009 8:20] Giuseppe Maxia
I am writing an article to explain the non-intuitive part. Stay tuned
[23 Dec 2009 9:09] Cyril SCETBON
Thanks, I'm already connected to Planet MySQL and I'll enjoy reading your article
[28 Dec 2009 9:43] Cyril SCETBON
Giuseppe,

I've read your article (http://dev.mysql.com/tech-resources/articles/mysql_55_partitioning.html), but I don't know if you're wrong or if it's a bug but it's not working as you say in the case where every first values are differents. To remember you say that these definition :

CREATE TABLE p1 (
  a INT,
  b INT,
  c INT
)
PARTITION BY RANGE COLUMNS (a,b)
(
  PARTITION p01 VALUES LESS THAN (10,20),
  PARTITION p02 VALUES LESS THAN (20,30),
  PARTITION p03 VALUES LESS THAN (30,40),
  PARTITION p04 VALUES LESS THAN (40,MAXVALUE),
  PARTITION p05 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);

can be substituted by :

CREATE TABLE p1_single (
  a INT,
  b INT,
  c INT
)
PARTITION BY RANGE COLUMNS (a)
(
  PARTITION p01 VALUES LESS THAN (10),
  PARTITION p02 VALUES LESS THAN (20),
  PARTITION p03 VALUES LESS THAN (30),
  PARTITION p04 VALUES LESS THAN (40),
  PARTITION p05 VALUES LESS THAN (MAXVALUE)
);

as values 10,20,30,40,MAXVALUE are differents. However if you insert values 10,20,30 in the two tables, the rows are going to different partitions :

INSERT INTO p1 VALUES(10,1,1),(20,1,1),(30,1,1);
INSERT INTO p1_single VALUES(10,1,1),(20,1,1),(30,1,1);
SELECT TABLE_NAME,PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME like 'p1%';
+------------+----------------+------------+
| TABLE_NAME | PARTITION_NAME | TABLE_ROWS |
+------------+----------------+------------+
| p1         | p01            |          1 |
| p1         | p02            |          1 |
| p1         | p03            |          1 |
| p1         | p04            |          0 |
| p1         | p05            |          0 |
| p1_single  | p01            |          0 |
| p1_single  | p02            |          1 |
| p1_single  | p03            |          1 |
| p1_single  | p04            |          1 |
| p1_single  | p05            |          0 |
+------------+----------------+------------+
10 rows in set (0.00 sec)

The operator LESS THAN seems to not be interpreted the same way in the two tables :(
[28 Dec 2009 21:05] Cyril SCETBON
waiting for more information
[29 Dec 2009 9:51] Susanne Ebrecht
All,

this is pretty clear:

LESS THAN means LESS THAN and not LESS EQUAL THAN.

We should not break mathematical rules at all.
[29 Dec 2009 12:53] Jon Stephens
I repeat, this is not a bug, and Cyril should not have reopened this bug report it simply because he did not understand or agree with Giuseppe's article. We don't accept bugs filed against DevZone articles.

Although Giuseppe consulted with Mikael and me while developing the DevZone article, I should also point out that the article is not definitive documentation. The MySQL Manual is intended to fulfil this purpose.

Giuseppe, Mikael, and I have discussed this at some length (and Cyril has also been party to some of this discussion).

Giuseppe, Mikael, and I have agreed that VALUES LESS THAN, in the case of multi-column partitioning (i.e. RANGE COLUMNS partitioning), is sometimes taken to mean 'less than or equal to' or -- more correctly -- 'not greater than'. Admittedly this is not a completely correct literal interpretation of the phrase 'less than'. However, with regard to the multi-column partitioning implemented in MySQL 5.5, it *is* suitable to handle VALUES LESS THAN as it is in fact handled, for the following reasons:

1. The interpretation as '<' or '<=' is in some ways a conditional one. It depends on the value lists used to define individual multi-column partitions making up a table partitioned by RANGE COLUMNS, and on the values making up a given tuple to be inserted into such a table.

2. The VALUES LESS THAN syntax is consistent with what is already implemented for defining tables partitioned by RANGE. The function of the clause as used in defining RANGE partitions is analogous to its function in defining RANGE COLUMNS partitions. However, while the functions are analogous, they are also distinct and there is no danger of overlap. Hence there is no need to provide a separate syntax for use in partition definition clauses in PARTITION BY RANGE COLUMNS.

3. A syntax such as VALUES LESS THAN OR EQUAL TO or even VALUES LESS THAN OR LESS THAN OR EQUAL TO might be more 'correct' but would be (IMO unnecessarily) cumbersome. A simpler alternative might be VALUES NOT GREATER THAN. But see (1): in order to be perfectly literally correct, the syntax would need to change according to the set of values to be inserted and its relations to the different sets of values defining individual RANGE COLUMNS partitions, which is simply not workable in any realistic sense (and perhaps not possible at all). In addition, the use of the OR and/or NOT keywords would very likely introduce problems with the parser (to say nothing of problems for humans required to maintain it).

Therefore (if I may repeat myself):

1. This is not a bug.

2. If a reader of Giuseppe's DevZone article has an issue with the article, he needs to take this up directly with Giuseppe, not by re-opening a bug report for a 'bug' which is in fact not a bug.

3. The Manual is intended as the source for authoritative information concerning MySQL. However, currently, the Manual's description of placement of rows into partitions of tables partitioned by RANGE COLUMNS is not correct. This constitutes a documentation bug, which has in fact been filed as BUG#49875.

4. BUG#49875 is assigned to me (and rightly so, since the errors in the documentation which it addresses are in fact errors that I made). The fixing of this bug is currently in progress, and is likely to be completed much more quickly if I'm not obligated to keep re-closing *this* bug report and re-explaining why it's not a bug. So please do not re-open the current bug again, because I will henceforth simply re-close it as !BUG without further explanation.

Thanks!
[29 Dec 2009 13:08] Giuseppe Maxia
I confirm: this is not a bug.
In the first release of my article, I made a mistake, which is going to be amended soon.
It is not true that a table with partitions defined by different first values is equivalent to a table partitioned on a single column. 
When you define a partition on more than one column, you are always evaluating a tuple, not single values.
Thus, the reasoning is not "10 LESS THAN 10", but "(10, 1) LESS THAN (10,10)".
And this tuple (10,1) is definitely LESS THAN (10,10).
[29 Dec 2009 13:35] Cyril SCETBON
Thanks Giuseppe, this is no more confused if tuples are compared and not values.
And sorry Jon I was really wondering if it was a BUG or not. However it will take you a little time to read this answer, oops :)
[29 Dec 2009 20:18] Jon Stephens
However, the more I work with this in attempting to write it up for the documentation, the more I'm confronted with the fact that

(a) VALUES LESS THAN is not strictly accurate

and

(b) Using VALUES LESS THAN is highly counter-intuitive.

These characteristics -- especially taken together -- tend to obscure the functionality and make it more difficult for users to employ it correctly.

I think that confusing users unnecessarily is a Very Bad Thing(TM).

Almost as bad as confusing MySQL Documentation, Community, and Support personnel. :)

As a starting point, I proposed a couple of possible alternatives, either which I think fits the behaviour of RANGE COLUMNS ranges much better than VALUES LESS THAN does:

1. VALUES NO GREATER THAN, e.g.:

CREATE TABLE t1 (a INT, b INT)
  PARTITION BY RANGE COLUMNS(a,b) (
    PARTITION p0 VALUES NO GREATER THAN (5,10),
    PARTITION p0 VALUES NO GREATER THAN (10,20),
    PARTITION p0 VALUES NO GREATER THAN (MAXVALUE,MAXVALUE)
);

(Using NO rather than NOT should avoid any issues with parser, I hope.)

2. RANGE BOUNDED BY, e.g.:

CREATE TABLE t2 (c INT, d INT)
  PARTITION BY RANGE COLUMNS(a,b) (
    PARTITION p0 RANGE BOUNDED BY (5,10),
    PARTITION p0 RANGE BOUNDED BY (10,20),
    PARTITION p0 RANGE BOUNDED BY (MAXVALUE,MAXVALUE)
);

I've sent a mail to Giuseppe, Susanne, and others with this proposal.
[2 Jan 2010 21:45] Jon Stephens
BUG#49875 has now been fixed, changes should appear in online docs shortly.