Bug #7734 ENUM behaves inconsistently to itself, and unlike enums generally
Submitted: 7 Jan 2005 17:25 Modified: 7 Jan 2005 20:58
Reporter: Margaret MacDonald Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:all (tested with 4.0.20a-nt) OS:Microsoft Windows (W2K)
Assigned to: CPU Architecture:Any
Triage: D4 (Minor)

[7 Jan 2005 17:25] Margaret MacDonald
Description:
I've listed this under 'doc' but it's really the language that should be fixed, and as 'serious' because it virtually disables a potentially very powerful language feature.

The ENUM(eration) construct exists in other languages, where it represents a named subset of integer.  Apart from the restricted namespace, the tokens behave exactly like symbolic integer constants.

In MySQL, ENUMs behave like integers only with respect to ORDER BY, and like strings elsewhere.  This is counterintuitive, and the documentation doesn't at all make this clear.

How to repeat:
1. declare a table that includes a column defined as ENUM ('mumble','fratz','foo');

2. create a number of records that cover all three values.

3. issue SELECT enumcol FROM tablename ORDER BY enumcol ;
The result will be ordered 'mumble','fratz','foo'

4. issue SELECT enumcol FROM tablename WHERE enumcol < 'fratz'
The dataset will contain records where enumcol = 'foo', not 'mumble'.

Suggested fix:
That the interpreter be fixed such that it evaluates ENUMs in every context, not just in ORDER BY.

I'd also suggest that a special bit be allocated, much like the length value for a VARCHAR, to mark the 'not in this namespace' value, so that the tokens themselves can evaluate from zero.  Economically this could be the high-order bit, such that ENUMs can have up to 63 rather than 64 values.
[7 Jan 2005 20:54] Alexander Keremidarski
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

"The ENUM(eration) construct exists in other languages, where it represents a
named subset of integer."

This is exacly what ENUM() is in MySQL. In my humble opinion the same defintion can be derived from the MySQL Manual:
http://dev.mysql.com/doc/mysql/en/ENUM.html
11.4.4 The ENUM Type

<quote>

An ENUM is a string object with a value chosen from a list of allowed values that are enumerated explicitly in the column specification at table creation time.
...
Each enumeration value has an index:
    * Values from the list of allowable elements in the column specification are numbered beginning with 1.
....
ENUM values are sorted according to the order in which the enumeration members were listed in the column specification. 

<end of quote>

"In MySQL, ENUMs behave like integers only with respect to ORDER BY, and like
strings elsewhere."

Yes and as you can see above this is well documented behaviour.

As for ENUM ('mumble','fratz','foo') your logic is quite wrong. You say:

"SELECT enumcol FROM tablename WHERE enumcol < 'fratz'
The dataset will contain records where enumcol = 'foo', not 'mumble'."

This is most natural result for the human reader of the query. The string comparison rules dictate that the string 'foo' is the only one which satisfies the condition enumcol < 'fratz'

It is as consistant as possible because the result in this case does *not* depend on the column type. 

The DBA is then allowed to alter between ENUM and CHAR|VARCHAR without need to care about the client applicatons.

This conforms to the very important principle of Relational model that there is a high level of abstraction between the way data is stored into the database at the server side and the way the data is represented to the client side. 

The transition between these two layers should be as transaprent as possible and this is exactly what MySQL allows with the current ENUM behaviour.

Also the DBA has full control over the sorting behaviour. It is enough to enumerate the values in alphabetical order and the ENUM becomes equivalent to CHAR|VARCHAR. This is clearly explained in the manual. Even if the strings are not enumareted in alhabetical order it is still possible to sort them in the alphabetical order.
At the same time if it is still possible to write a WHERE clause which matches the values according to their ENUM_index and not their String value. This requires knowing the exact definition of the column and the "index values" which of course negates the flexibility, but if we demand that 'mumble' < 'fratz' this is already too specific s there is no conflict.

Using your example:

mysql> SELECT * FROM en ORDER BY en;
+--------+
| en     |
+--------+
| mumble |
| fratz  |
| foo    |
+--------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM en ORDER BY CAST(en AS CHAR);
+--------+
| en     |
+--------+
| foo    |
| fratz  |
| mumble |
+--------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM en WHERE en < 2;
+--------+
| en     |
+--------+
| mumble |
+--------+
1 row in set (0.00 sec)

As you can see the abstraction the current ENUM behaviour provides is even higher. It can be used both ways.
1) As string
2) As "named subset of integer"

Your proposition restricts the usage to the second case only. Why restricting a flexible feature?
[7 Jan 2005 20:58] Alexander Keremidarski
Just to add two more examples to make the relation between the ENUM values and their enumeration value more clear:

mysql> SELECT CAST(en AS UNSIGNED) AS i, en FROM en ORDER BY en;
+------+--------+
| i    | en     |
+------+--------+
|    1 | mumble |
|    2 | fratz  |
|    3 | foo    |
+------+--------+
3 rows in set (0.00 sec)

mysql> SELECT CAST(en AS UNSIGNED) AS i, en FROM en ORDER BY CAST(en AS CHAR);
+------+--------+
| i    | en     |
+------+--------+
|    3 | foo    |
|    2 | fratz  |
|    1 | mumble |
+------+--------+
3 rows in set (0.00 sec)