Bug #35312 REF_CNSTRS.CONSTRAINT_NAME to STATISTIC.INDEX_NAME can be broken
Submitted: 15 Mar 2008 17:53 Modified: 19 Mar 2008 0:20
Reporter: Adrian Colomitchi Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Information schema Severity:S2 (Serious)
Version:5.1.23-rc-community, 5.0, 5.1, 6.0 BK OS:Any
Assigned to: CPU Architecture:Any

[15 Mar 2008 17:53] Adrian Colomitchi
Description:
If, before creating a Foreign Key, an INDEX is created on the referencing (i.e. non unique) column(s), the expected "REFERENTIAL_CONSTRAINTS.CONSTRAINT_NAME=STATISTIC.INDEX_NAME" relationship is broken.
As a consequence, *by only using information_schema tables*, it is impossible to recover the (master_columns, detail_columns) relationship.

How to repeat:
To reproduce (sql dump will be attached, but here's a description):
1. create table A, with 2 columns (say, A.`serial_no` and A.`version`);
2. create an UNIQUE INDEX (say, `A_SERIAL`) on the two columns
3. create table B, with at least two columns, matching the type of the 2 columns at step 1 (say, B.`a_serial_no` and B.`a_version`)
4. create an INDEX on the two columns of B ( CREATE INDEX `to_a` ON B(`a_serial_no`, `a_version`) )
5. create a Foreign Key between B(`a_serial_no`, `a_version`) and A(`serial_no`, `version`). To explore the issue in full, in creating the FK, use a name (say `fk_to_a`) that's different than the one used for the index created at step 4 (i.e. `to_a`)

Notes:
- on step 4, the `to_a` index does not figure in the information_schema.TABLE_CONSTRAINTS: the reaction is "as expected" since the `to_a` index is not unique
- (*the failure*) on step 5, the `to_a` index still does not contribute with any information in the information_schema.TABLE_CONSTRAINTS, *even it is now used in/by constraint*

-- *** useful scripts. Part 1: table creations **
CREATE DATABASE IF NOT EXISTS test;
USE test;

-- Definition of table `a`
DROP TABLE IF EXISTS `a`;
CREATE TABLE `a` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `serial_no` varchar(16) NOT NULL,
  `version` varchar(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `a_serial` (`serial_no`,`version`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Definition of table `b`
DROP TABLE IF EXISTS `b`;
CREATE TABLE `b` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `a_serial_no` varchar(16) NOT NULL,
  `a_version` varchar(45) NOT NULL,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id`),
-- comment the following line and compare the info in TABLE_CONSTRAINTS
  KEY `to_a` (`a_serial_no`,`a_version`),
  CONSTRAINT `fk_to_a` FOREIGN KEY (`a_serial_no`, `a_version`) REFERENCES `a` (`serial_no`, `version`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- *** useful scripts. Part 2: getting columns in a FK relationship **
drop view if exists `constraint_columns`;
create view constraint_columns as
select
  S.TABLE_SCHEMA,
  S.TABLE_NAME,
  TC.CONSTRAINT_NAME,
  S.COLUMN_NAME,
  S.SEQ_IN_INDEX
FROM
  information_schema.STATISTICS S,
  information_schema.TABLE_CONSTRAINTS TC
WHERE
  S.TABLE_SCHEMA=TC.TABLE_SCHEMA and
  S.TABLE_NAME=TC.TABLE_NAME and
-- the next condition will fail to detect the columns in the `to_a` index
  S.INDEX_NAME=TC.CONSTRAINT_NAME and
  S.INDEX_SCHEMA=TC.CONSTRAINT_SCHEMA
ORDER BY
  S.TABLE_SCHEMA ASC,
  S.TABLE_NAME ASC,
  S.INDEX_NAME ASC,
  S.SEQ_IN_INDEX ASC
;

Suggested fix:

However, the issue is more profound. To illustrate, perform the scenario above, *skipping over step 4*. The result will be the creation of a (non-unique) index (in STATISTICS) named `fk_to_a`, as well as declaring the index info into TABLE_CONSTRAINTS and REFERENTIAL_CONSTRAINTS.
In the original scenario, 

1. "false solution" (i.e. hope you won't go there) - if the FK detects an already existing "detail" index that can be used, rename the "detail" index to the name of the foreign key. Why this is wrong: if I create an index under one name, I expect to be able to drop the index using the same name.

2. *the proper fix* would imply adding a new column (suggested CONSTRAINT_KEY_NAME) to the REFERENTIAL_CONSTRAINTS, to contain the name of the INDEX to be used in determining the "detail" columns. 

3. the only workaround that I can imagine to the issue is to:
- issue a "SHOW CREATE TABLE..." statement...
- ... parse the returned statement to recover the column correspondence
which:

a. its opposite with the declared intentions (search for "The SELECT ... FROM INFORMATION_SCHEMA statement is intended" in http://dev.mysql.com/doc/refman/5.1/en/information-schema.html)

b. requires the implementation of a parser in a client program (hell! It should reach this far).
[17 Mar 2008 16:25] Sveta Smirnova
Thank you for the report.

Verified as described.

Maybe better problem is showing if take in account next output:

With KEY `to_a` (`a_serial_no`,`a_version`):

select  INDEX_NAME, INDEX_SCHEMA from information_schema.STATISTICS where TABLE_SCHEMA='test' and TABLE_NAME = 'b';
INDEX_NAME      INDEX_SCHEMA
PRIMARY test
to_a    test
to_a    test

Without KEY `to_a` (`a_serial_no`,`a_version`):

select  INDEX_NAME, INDEX_SCHEMA from information_schema.STATISTICS where TABLE_SCHEMA='test' and TABLE_NAME = 'b';
INDEX_NAME      INDEX_SCHEMA
PRIMARY test
fk_to_a test
fk_to_a test
[17 Mar 2008 17:35] Valeriy Kravchuk
I think this is not a bug. You explicitely set name for the INDEX, and explicitely set different name for the CONSTRAINT. Why do you expect them to bethe same eventually?
[18 Mar 2008 13:13] Adrian Colomitchi
Hi Valeriy,

I might be wrong in what I'll be saying below: if this is the case, I apologize in advance.

> You explicitely set name for the INDEX, and 
> explicitely set different name for the CONSTRAINT. 
> Why do you expect them to bethe same eventually?
By stating that I explicitly created the FK and INDEX with a different name, you state on a false track.
The problem is *not* that the names mismatch. The problem is that if someone creates the index then the FK with different names, you have no ways of detecting *what columns in the detail table corresponds to which columns in the master table*!!

Suppose that the database is not created by me, I am only requested to recover the metadata of the database (including the FK definition). 
Suppose that I need to get this metadata not for the purposes of being read by human eyes (or for the purposes of backup/restore the database) but further processing of this information (think database diagramming, or automatic code generation, or whatever purposes for which an SQL script is not enough).

To understand better, try to do the following exercise:
------- exercise start -------
a. create the FK without any index. Wrote a set of SQL select statement (against the tables in the INFO_SCHEMA) to come with a structure of the result in the form (FK_name, MASTER_TABLE_NAME, MASTER_COLUMN_NAME, DETAIL_TABLE_NAME, DETAIL_COLUMN_NAME) - i.e. list all the FK-es together with the unique.column<->fk_column correspondence.

b. now, before creating the FK, create an index on the detail table on the same columns that will be involved in the FK. Repeat the same procedure you derived at point a. in this case.
---- exercise end ----

I bet that, in the scenario b. you will fail. *This* failure is the bug.

The scripts that I posted in the initial comment of the issue are only to diagnose and demonstrate the bug.
My conclusion is that the problem arises from the underlaying assumption that "for a foreign key and its associated index, the TABLE_CONSTRAINT.CONSTRAINT_NAME is always the same as the STATISTICS.INDEX_NAME" that fails to be true when the index is created in advance under another name. 

The correct fix, in my opinion, would be to add another column to the TABLE_CONSTRAINTS (say CONSTRAINT_KEY_NAME) to say "the FK with the CONSTRAINT_NAME is using the CONSTRAINT_KEY_NAME index".

Best regards,

Adrian
[18 Mar 2008 20:18] Valeriy Kravchuk
Read http://dev.mysql.com/doc/refman/5.0/en/key-column-usage-table.html. Then just try something like:

mysql> select * from INFORMATION_SCHEMA.KEY_COLUMN_USAGE where table_schema='tes
t' and table_name='b'\G
*************************** 1. row ***************************
           CONSTRAINT_CATALOG: NULL
            CONSTRAINT_SCHEMA: test
              CONSTRAINT_NAME: PRIMARY
                TABLE_CATALOG: NULL
                 TABLE_SCHEMA: test
                   TABLE_NAME: b
                  COLUMN_NAME: id
             ORDINAL_POSITION: 1
POSITION_IN_UNIQUE_CONSTRAINT: NULL
      REFERENCED_TABLE_SCHEMA: NULL
        REFERENCED_TABLE_NAME: NULL
       REFERENCED_COLUMN_NAME: NULL
*************************** 2. row ***************************
           CONSTRAINT_CATALOG: NULL
            CONSTRAINT_SCHEMA: test
              CONSTRAINT_NAME: fk_to_a
                TABLE_CATALOG: NULL
                 TABLE_SCHEMA: test
                   TABLE_NAME: b
                  COLUMN_NAME: a_serial_no
             ORDINAL_POSITION: 1
POSITION_IN_UNIQUE_CONSTRAINT: 1
      REFERENCED_TABLE_SCHEMA: test
        REFERENCED_TABLE_NAME: a
       REFERENCED_COLUMN_NAME: serial_no
*************************** 3. row ***************************
           CONSTRAINT_CATALOG: NULL
            CONSTRAINT_SCHEMA: test
              CONSTRAINT_NAME: fk_to_a
                TABLE_CATALOG: NULL
                 TABLE_SCHEMA: test
                   TABLE_NAME: b
                  COLUMN_NAME: a_version
             ORDINAL_POSITION: 2
POSITION_IN_UNIQUE_CONSTRAINT: 2
      REFERENCED_TABLE_SCHEMA: test
        REFERENCED_TABLE_NAME: a
       REFERENCED_COLUMN_NAME: version
3 rows in set (0.01 sec)

Look at rows 2 and 3 in the results, then figure out how to get the exacts results you need. All you need is already there, including the manual...
[19 Mar 2008 0:20] Adrian Colomitchi
Thank, Valeriy

My bad, I apologize.
Thank you for the time spent in putting me on the right track.

Adrian