Bug #44833 A float literal is not whitespace-terminated
Submitted: 12 May 2009 20:51 Modified: 14 May 2009 18:01
Reporter: Baron Schwartz (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.0.75, 5.0.77, 5.0.82 OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: qc

[12 May 2009 20:51] Baron Schwartz
Description:
select * from t where a >= 1.0order by a;

Does not cause an error.  I believe it should, because there should be a whitespace before ORDER BY.

How to repeat:
mysql> create table t(a int);
mysql> insert into t(a) values(1), (2);
mysql> select * from t where a >= 1.0order by a;
+------+
| a    |
+------+
|    1 | 
|    2 | 
+------+

mysql> explain extended 
    -> select * from t where a >= 1.0order by a;
mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`t`.`a` AS `a` from `test`.`t` where (`test`.`t`.`a` >= 1.0) order by `test`.`t`.`a`
[13 May 2009 4:23] Valeriy Kravchuk
I think this is expected and intended behavior of MySQL lexer/parser when recognizing numeric literals. Look:

mysql> select 1.0order;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1

So, "order" was recognized as a keyword after numeric literal, but it is wrong to have nothing after "order". While in the following cases:

mysql> select 1.0o;
+-----+
| o   |
+-----+
| 1.0 | 
+-----+
1 row in set (0.00 sec)

mysql> select 1.0orde;
+------+
| orde |
+------+
|  1.0 | 
+------+
1 row in set (0.00 sec)

there is no problem. Next word after numeric literal is not a keyword, so it is used as column alias.

I am still checking if this is documented clearly anywhere.
[13 May 2009 12:14] Baron Schwartz
What does the SQL standard say?  (I would have checked it, but from past experience I anticipate that it will be a lot of work.)

One of the things I had in mind was that unless this is expected, documented and tested, a future change to the server could "fix" it silently and then queries would fail after an upgrade.
[13 May 2009 12:17] Baron Schwartz
By the way, even if it's not a bug, it's not consistently handled:

mysql> select 1.0from dual;
+-----+
| 1.0 |
+-----+
| 1.0 | 
+-----+
1 row in set (0.00 sec)

mysql> select 1from dual;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'dual' at line 1

mysql> select 1e0from dual;
+-----+
| 1e0 |
+-----+
|   1 | 
+-----+
1 row in set (0.00 sec)

mysql> select 1e1from dual;
+-----+
| 1e1 |
+-----+
|  10 | 
+-----+
1 row in set (0.00 sec)
[13 May 2009 12:51] Sveta Smirnova
Thank you for the feedback.

But why not consistently?

1from is correct name for identifier which can be used without quotes while 1.0from, 1e0from and 1e1from are not.
[13 May 2009 16:26] Baron Schwartz
1e0from should be a valid identifier that can be used without quotes, as far as I can see.  I think the parser is too greedy here.

The manual at http://dev.mysql.com/doc/refman/5.0/en/identifiers.html says

Identifiers may begin with a digit but unless quoted may not consist solely of digits. It is recommended that you do not use names of the form Me or MeN, where M and N are integers. For example, avoid using 1e or 2e2 as identifiers, because an expression such as 1e+3  is ambiguous. Depending on context, it might be interpreted as the expression 1e + 3 or as the number 1e+3.
[13 May 2009 16:54] Valeriy Kravchuk
I think that manual page definitely needs some fixes and clarifications. It says that both 1e and 2e2 are valid identifiers that should be avoided. But the reality is different:

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.82-debug | 
+--------------+
1 row in set (0.00 sec)

mysql> select 1 as 1e from dual;
+----+
| 1e |
+----+
|  1 | 
+----+
1 row in set (0.00 sec)

mysql> select 1 as 2e2 from dual;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '2e2 from dual' at line 1

So, either parser or manual should be fixed.
[14 May 2009 18:00] Paul DuBois
I asked Marc Alff, one of our developers who's worked with the parser, about this. His response:

"
Description:
select * from t where a >= 1.0order by a;

Does not cause an error.  I believe it should, because there should be a
whitespace before
ORDER BY.
"

This is legal and should not cause an error.
The server behaves properly here:
- "1" alone can not be an identifier, because it's followed by a "."
- therefore, the lexer parses "1" as the beginning of a number, and ends
up with "1.0" as one token.
The next token will be "order", a keyword.

There is nothing special to document here either.

What was reported is not a bug in the server, and is not a documentation
bug.

Now, later in this report, the syntaxes 1e0from and 1e1from seem to
cause a problem.

That might be considered a lexer bug since these could be a valid
identifier, but I would have to check what the official spec says for
this case.

In any case, changing the lexer code for such things is risky and can
break existing applications, not a good thing.

So,
I suggest that the doc changes:

"
It is recommended that you do not use names /of the form/ Me or MeN,
where M and N are integers
"

to

It is recommended that you do not use names /that begin with/ Me or MeN,
where M and N are integers
[14 May 2009 18:01] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

I made the change suggested by Marc.
[6 Mar 2013 15:30] Hartmut Holzgraefe
For the records: PostgreSQL behaves almost exactly the same, with just one exception:

  select 1from table;

works there, returning a single column with name and value "1" for each row in "table" while MySQL takes "1from" as an identifier and throws a parse error ...