Bug #21755 Node recovery fails when table is dropped and id reused for index
Submitted: 21 Aug 2006 13:43
Modified: 15 May 2007 6:21
Reporter: Kristian Nielsen
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1
OS: Linux
Assigned to: Jonas Oreland
CPU Architecture: Any

[21 Aug 2006 13:43] Kristian Nielsen
Description:
[Found this while reading the NDB source]

Here is a scenario where node recovery may fail:

1. Node 2 stops/fails.
2. While node 2 is down, some table is dropped, and an index is created re-using the same table id.
3. Node 2 is restarted. During node recovery, the index is not recovered, causing node 2 to fail during restart.

How to repeat:
Using a compiled tree:

$ cd mysql-test

$ perl mysql-test-run.pl --start-and-exit ndb_basic
Servers started, exiting

$ ../client/mysql --socket=var/tmp/master.sock  -uroot test
mysql> create table a (a int primary key, b int) engine=ndbcluster;
mysql> create table b (a int primary key) engine=ndbcluster;
mysql> exit

$ ../storage/ndb/tools/ndb_show_tables 
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      a
7     UserTable            Online   Yes     test         def      b
1     IndexTrigger         Online   -                             NDB$INDEX_8_CUSTOM
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
8     OrderedIndex         Online   No      sys          def      PRIMARY
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

NDBT_ProgramExit: 0 - OK

$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 stop'
$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 status'
Connected to Management Server at: localhost:9310
Node 2: not connected

$ ../client/mysql --socket=var/tmp/master.sock  -uroot test
mysql> drop table b;
mysql> create unique index a_i on a(b);
mysql> exit

$ ../storage/ndb/tools/ndb_show_tables 
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
1     IndexTrigger         Online   -                             NDB$INDEX_7_CUSTOM
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      a
7     OrderedIndex         Online   No      sys          def      a_i
3     HashIndexTrigger     Online   -                             NDB$INDEX_8_UPDATE
2     HashIndexTrigger     Online   -                             NDB$INDEX_8_INSERT
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
8     UniqueHashIndex      Online   Yes     sys          def      a_i$unique
4     HashIndexTrigger     Online   -                             NDB$INDEX_8_DELETE
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

NDBT_ProgramExit: 0 - OK

$ (cd var/ndbcluster-9310/ && ../../../storage/ndb/src/kernel/ndbd --no-defaults --core --character-sets-dir=../../../sql/share/charsets)
$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 status'
Connected to Management Server at: localhost:9310
Node 2: not connected

$ cat var/ndbcluster-9310/ndb_2_error.log 
Current byte-offset of file-pointer is: 568                       

Time: Monday 21 August 2006 - 15:27:38
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 10752) 0x0000000a
Program: ../../../storage/ndb/src/kernel/ndbd
Pid: 9568
Trace: /usr/local/mysql/mysql-5.1-tst/mysql-test/var/ndbcluster-9310/ndb_2_trace.log.1
Version: Version 5.1.12 (beta)
***EOM***

Suggested fix:
The problem is this code, in Dbdict::checkSchemaStatus():

    if(!::checkSchemaStatus(oldEntry->m_tableType, c_restartRecord.m_pass))
      continue;
    
    if(!::checkSchemaStatus(newEntry->m_tableType, c_restartRecord.m_pass))
      continue;

This checks whether the table type in question is handled in the current recovery pass. However, if the table type on the master is handled in a different pass than the table type in the local copy of the schema file, the two checks never succeed in the same pass, so recovery for that table id never takes place.

A differing table type should probably be handled similarly to table ids where the types match but the version differs.
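
To make the failure mode concrete, here is a minimal, hypothetical model of the pass filter. It is a sketch and not the actual NDB code: the names handledInPass, processedByQuotedFilter and passesProcessing are invented for illustration, and the assignment of tables to pass 0 and indexes to pass 1 is an assumption, not taken from Dbdict.

```cpp
// Hypothetical, simplified model of the pass filter; not the actual NDB code.
enum TableType { UserTable, OrderedIndex, UniqueHashIndex };

// Assumption for illustration only: tables are restored in pass 0,
// indexes in pass 1. The real pass layout in Dbdict may differ.
constexpr bool handledInPass(TableType t, int pass) {
  return (t == UserTable) ? (pass == 0) : (pass == 1);
}

// Mirrors the two quoted checks: the id is skipped ("continue") unless BOTH
// the local (old) type and the master's (new) type belong to the current pass.
constexpr bool processedByQuotedFilter(TableType oldType, TableType newType,
                                       int pass) {
  return handledInPass(oldType, pass) && handledInPass(newType, pass);
}

// Number of passes (of the two modelled here) in which the id is processed.
constexpr int passesProcessing(TableType oldType, TableType newType) {
  return (processedByQuotedFilter(oldType, newType, 0) ? 1 : 0) +
         (processedByQuotedFilter(oldType, newType, 1) ? 1 : 0);
}

// Same type on both sides: the id is handled in exactly one pass.
static_assert(passesProcessing(UserTable, UserTable) == 1,
              "unchanged table is handled once");
// Id 7 in this report: table `b` in the local schema file, ordered index
// `a_i` on the master. No pass accepts both types, so the id is never handled.
static_assert(passesProcessing(UserTable, OrderedIndex) == 0,
              "reused id is skipped in every pass");
```

Under these assumptions, an id whose type changed across pass boundaries is skipped in every pass, which matches the behaviour described above for the reused id.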
[17 Apr 2007 14:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/24670

ChangeSet@1.2466, 2007-04-17 16:43:47+02:00, jonas@perch.ndb.mysql.com +2 -0
  ndb - bug#21755
    redo checkSchemaState to handle bug (plus prepare to handle other DD related bugs in the area)
[24 Apr 2007 6:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25225

ChangeSet@1.2605, 2007-04-24 08:22:30+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - bug#21755
    table/index (es) need special treatment...
[26 Apr 2007 11:36] Bugs System
Pushed into 5.1.18-beta
[28 Apr 2007 19:35] Bugs System
Pushed into 5.1.18-beta
[30 Apr 2007 8:12] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Discussed bug with Jonas on IRC, documented fix in 5.1.18 changelog.
[14 May 2007 9:46] Jonas Oreland
Note: pushed to 51-telco as well.
Not pushed to telco-61, as the patch is rather intrusive.

not present in 4.1/5.0

---

I reset it to Documenting (without knowing whether it's really needed)...

/Jonas
[15 May 2007 6:21] Jon Stephens
Given the above, I'm assuming that this fix won't be appearing in any release other than 5.1.18 any time soon. Let me know if this isn't the case.