Bug #29816 Syntactically wrong query fails with misleading error message
Submitted: 16 Jul 2007 11:46 Modified: 24 Oct 2007 19:13
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Parser Severity:S3 (Non-critical)
Version:5.0.41-debug-log, 5.0, 5.1 BK OS:Any
Assigned to: Davi Arnaut CPU Architecture:Any

[16 Jul 2007 11:46] Philip Stoev
Description:
If no default database is selected and a query contains a syntax error, it will fail with a different message depending on the presence of a function call before the syntax error. Instead of a "syntax error", a "no database selected" message will be returned.

In my humble opinion this should not be so -- a syntactically invalid query is always invalid regardless of the presence of a default database for a connection and regardless of the presence of any functions it mentions.

How to repeat:
shell> mysql

mysql> SELECT bozo(), 1 + ,;
ERROR 1046 (3D000): No database selected

shell> mysql test
mysql> SELECT bozo(), 1 + ,;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1

Suggested fix:
It appears that the parser needs a database in order to lookup the function mentioned and determine how many arguments it has before it can continue parsing. Is it possible to change the parser so that the rules FUNC_ARG0, FUNC_ARG1 , etc. are retired and an arbitary number of arguments are allowed for each function?

From what I undertstand from the manual, it appears that an identifier immediately followed by a opening parenthesis is unambiguously parsed as a function name, therefore in this case no confusion can occur with any table or field names.
[16 Jul 2007 12:03] Sveta Smirnova
Thank you for the report.

Verified as described.

Version 4.1 is not affected.

Especcially strange if use full qualified name with database which not exists:

$mysql50
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 939
Server version: 5.0.46-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SELECT bozo(), 1 + ,;
ERROR 1046 (3D000): No database selected
mysql> SELECT test3333.bozo(), 1 + ,;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
mysql> use test3333;
ERROR 1049 (42000): Unknown database 'test3333'
[11 Aug 2007 12:43] Konstantin Osipov
Thank you for your bug report.
This is a duplicate of Bug#28318 CREATE FUNCTION (UDF) requires a schema
[16 Oct 2007 21:36] Marc ALFF
Hi Philip

It is correct that:
  mysql> SELECT bozo(), 1 + ,;
  ERROR 1046 (3D000): No database selected
is a confusing error message.

Conceptually, a parser typically performs:
- lexical parsing
- syntactic parsing
- semantic checks

In an academic environment, one might implement a parser by executing each
passes independently.
Note however, that the parser embedded inside the mysql server is primarily
designed to produce an executable tree for the runtime, and this cause code
for several phases to be executed at the same time.

For example, the parser does not process all the lexical parsing before
starting with syntactic parsing. Instead, the lexer (sql_lex) is called as
needed to feed the syntactical parser (sql_yacc) with tokens.

Another example is that the parser does perform some semantic checks,
like in this case resolving names to functions calls, as soon as a syntax
fragment is detected. This cause some semantic analysis to be executed
before the entire syntax has been accepted.

As a result, when there are multiple errors with a query, like:
- a syntax error in one fragment
- a semantic error in another fragment
the parser might return either the first error or the second.
This is not considered a bug, since both errors are valid.
The choice of which error to report is ultimately implementation dependent.

To achieve what is described in this bug report, the implementation of the
parser would have to:
- perform the entire syntax analysis first, and build a syntax tree
- only after a complete syntax tree is generated, proceed with semantic
  analysis.

While this approach is valid, it is driven by a design constraint which is
external to the server implementation (to perform query rewrites on a syntax
tree).

With the current design (which aims for performance, not code modularity),
there is no way to represent an arbitrary function call to something unknown
(this would require an Item_func_generic), so the parser has no way to
produce a "syntax" tree only, to detect all syntax errors in a separate
pass. Generating an Item_func_udf or an Item_func_sp in the example given
would be wrong, since "select inexistent()" uses neither.

To summarize, there are two aspects with this report:
- a confusing message can be given in some cases, this has been fixed.
- the constraint of separating the different phases of parsing is not
  a design requirement for the server, and refactoring the parser to meet
  this requirement will not be done.

About FUNC_ARG0, FUNC_ARG1 and similar, please look at the files
- sql/item_create.h
- sql/item_create.cc
in 5.1 and above.
You will find there that functions with a variable number of arguments
can be easily represented there.

Thanks for your interest in MySQL.

Regards,
-- Marc
[17 Oct 2007 5:01] Marc ALFF
Fixed by the patch for bug#28318
[19 Oct 2007 18:52] Bugs System
Pushed into 5.1.23-beta
[19 Oct 2007 18:54] Bugs System
Pushed into 5.0.52
[24 Oct 2007 19:13] Paul DuBois
Noted in 5.0.52, 5.1.23 changelogs.

The parser confused user-defined function (UDF) and stored function
creation for CREATE FUNCTION and required that there be a default
database when creating UDFs, although there is no such requirement.