Bug #18925 Dependant, subquery on information_schema.COLUMNS returns wrong result
Submitted: 9 Apr 2006 19:21 Modified: 7 Aug 2006 8:00
Reporter: Roland Bouman Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.21-BK, 5.0.18, 5.1.7 OS:Linux (Linux, win xp pro)
Assigned to: Igor Babaev CPU Architecture:Any

[9 Apr 2006 19:21] Roland Bouman
Description:
A SELECT expression with a subquery that is bound to the outer query over multiple columns returns different results when a constant is used instead of one of the dependant columns. 

It can be shown that in the original query, that dependant column maintains a value that is exactly equal to the substituted constant. 

So, it was expected that there would be no difference between the results returned by these queries.

It is my hunch that this really has to do with the information schema, because when I copy the data from the information schema to another database, both queries do return the same result.

How to repeat:
original query. BTW, it took > 30 secs on my machine - much too long. 
The subquery is dependant upon the values of t.table_schema and t.table_name:

select      t.table_name
,           c1.column_name
from        information_schema.tables  t
inner join  information_schema.columns c1
on          t.table_schema = c1.table_schema
and         t.table_name   = c1.table_name
where       t.table_schema = 'information_schema'
and         c1.ordinal_position = (
                select  coalesce(min(c2.ordinal_position),1)
                from    information_schema.columns c2
                where   c2.table_schema = t.table_schema
                and     c2.table_name   = t.table_name
                and     c2.column_name like '%SCHEMA%'
            )
;

+---------------------------------------+--------------------+
| table_name                            | column_name        |
+---------------------------------------+--------------------+
| CHARACTER_SETS                        | CHARACTER_SET_NAME |
| COLLATIONS                            | COLLATION_NAME     |
| COLLATION_CHARACTER_SET_APPLICABILITY | COLLATION_NAME     |
| COLUMNS                               | TABLE_SCHEMA       |
| COLUMN_PRIVILEGES                     | TABLE_SCHEMA       |
| ENGINES                               | ENGINE             |
| EVENTS                                | EVENT_SCHEMA       |
| FILES                                 | TABLE_SCHEMA       |
| KEY_COLUMN_USAGE                      | CONSTRAINT_SCHEMA  |
| PARTITIONS                            | TABLE_SCHEMA       |
| PLUGINS                               | PLUGIN_NAME        |
| PROCESSLIST                           | ID                 |
| ROUTINES                              | ROUTINE_SCHEMA     |
| SCHEMATA                              | SCHEMA_NAME        |
| SCHEMA_PRIVILEGES                     | TABLE_SCHEMA       |
| STATISTICS                            | TABLE_SCHEMA       |
| TABLES                                | TABLE_SCHEMA       |
| TABLE_CONSTRAINTS                     | CONSTRAINT_SCHEMA  |
| TABLE_PRIVILEGES                      | TABLE_SCHEMA       |
| TRIGGERS                              | TRIGGER_SCHEMA     |
| USER_PRIVILEGES                       | GRANTEE            |
| VIEWS                                 | TABLE_SCHEMA       |
+---------------------------------------+--------------------+

The intention of this original query is to find the first column where name is like '%schema%' for each table in the information_schema, and if that does not exist, the first column in the table. You can see that this query returns the right result.

Now, I try to get it faster.

Because in the outer query requires that t.table_schema equals the constant 'information_schema', it should be possible to substitute that constant in the subquery too without altering the result:

select      t.table_name
,           c1.column_name
from        information_schema.tables  t
inner join  information_schema.columns c1
on          t.table_schema = c1.table_schema
and         t.table_name   = c1.table_name
where       t.table_schema = 'information_schema'
and         c1.ordinal_position = (
                select  coalesce(min(c2.ordinal_position),1)
                from    information_schema.columns c2
                where   c2.table_schema = 'information_schema'
                and     c2.table_name   = t.table_name
                and     c2.column_name like '%SCHEMA%'
            )
;

+---------------------------------------+--------------------+
| table_name                            | column_name        |
+---------------------------------------+--------------------+
| CHARACTER_SETS                        | CHARACTER_SET_NAME |
| COLLATIONS                            | COLLATION_NAME     |
| COLLATION_CHARACTER_SET_APPLICABILITY | COLLATION_NAME     |
| COLUMNS                               | TABLE_CATALOG      |
| COLUMN_PRIVILEGES                     | GRANTEE            |
| ENGINES                               | ENGINE             |
| EVENTS                                | EVENT_CATALOG      |
| FILES                                 | FILE_ID            |
| KEY_COLUMN_USAGE                      | CONSTRAINT_CATALOG |
| PARTITIONS                            | TABLE_CATALOG      |
| PLUGINS                               | PLUGIN_NAME        |
| PROCESSLIST                           | ID                 |
| ROUTINES                              | SPECIFIC_NAME      |
| SCHEMATA                              | CATALOG_NAME       |
| SCHEMA_PRIVILEGES                     | GRANTEE            |
| STATISTICS                            | TABLE_CATALOG      |
| TABLES                                | TABLE_CATALOG      |
| TABLE_CONSTRAINTS                     | CONSTRAINT_CATALOG |
| TABLE_PRIVILEGES                      | GRANTEE            |
| TRIGGERS                              | TRIGGER_CATALOG    |
| USER_PRIVILEGES                       | GRANTEE            |
| VIEWS                                 | TABLE_CATALOG      |
+---------------------------------------+--------------------+

this executes in less than a second, but now the first column for each table is returned - Most probably the subquery does not return any records, after which the coalesce defaults to returning 1.

Suggested fix:
Please fix the subquery.
[10 Apr 2006 14:17] Valeriy Kravchuk
Verified just as described on 5.0.21-BK (ChangeSet@1.2143, 2006-04-10 08:29:24+02:00) on Linux.
[21 Jul 2006 23:30] Igor Babaev
This bug manifests itself on a very simple query:

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.23-debug |
+--------------+
1 row in set (0.00 sec)

mysql> select table_name from information_schema.tables
    ->   where table_name=(select max(table_name)
    ->                       from information_schema.tables);
Empty set (0.01 sec)
[22 Jul 2006 17:29] Igor Babaev
It has turned out that wrong results in the reported query and the sumple query I brought up in my previous comment are caused by different problems in the code. More exactly the fix for my simple query does not fix the reported
query and vice versa.
That's why I report the problem with my simple query as a separate bug #21231:
"A wrong result for a simple query with a subquery on information schema".
[22 Jul 2006 19:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/9454
[24 Jul 2006 5:40] Sergei Glukhov
ok to push
[2 Aug 2006 18:56] Evgeny Potemkin
Execution of some correlated subqueries may set the value
of null_row to 1 for tables used in the subquery.
If the the subquery is on information schema it causes
rejection of any row for the following executions of
the subquery in the case when an optimization filtering
by some condition is applied.

Fixed in 5.0.25, 5.1.12
[7 Aug 2006 8:00] Jon Stephens
Documented bugfix in 5.0.25 and 5.1.12 changelogs.
[27 Jun 2008 15:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/48656

2657 Konstantin Osipov	2008-06-27
      Move the part of information_schema.test that takes 90% of the time
      to information_schema-big.test.
      This test case accounted for information_schema.test timeout on
      powermacg5, which has a very slow file system.