| Bug #21755 | Node recovery fails when table is dropped and id reused for index | ||
|---|---|---|---|
| Submitted: | 21 Aug 2006 13:43 | Modified: | 15 May 2007 6:21 |
| Reporter: | Kristian Nielsen | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
| Version: | 5.1 | OS: | Linux (Linux) |
| Assigned to: | Jonas Oreland | CPU Architecture: | Any |
[17 Apr 2007 14:43]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/24670

ChangeSet@1.2466, 2007-04-17 16:43:47+02:00, jonas@perch.ndb.mysql.com +2 -0
  ndb - bug#21755
    redo checkSchemaState to handle bug
    (plus prepare to handle other DD related bugs in the area)
[24 Apr 2007 6:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/25225

ChangeSet@1.2605, 2007-04-24 08:22:30+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - bug#21755
    table/index (es) need special treatment...
[26 Apr 2007 11:36]
Bugs System
Pushed into 5.1.18-beta
[28 Apr 2007 19:35]
Bugs System
Pushed into 5.1.18-beta
[30 Apr 2007 8:12]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.
If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at
http://dev.mysql.com/doc/en/installing-source.html
Discussed bug with Jonas on IRC, documented fix in 5.1.18 changelog.
[14 May 2007 9:46]
Jonas Oreland
note, pushed to 51-telco as well.
not to telco-61, as patch is rather intrusive
not present in 4.1/5.0
---
I reset it to documenting (without knowing if it's really needed)...
/Jonas
[15 May 2007 6:21]
Jon Stephens
Given the above, I'm assuming that this fix won't be appearing in any other releases than 5.1.18 anytime soon - let me know if this isn't the case.

Description:
[Found this while reading the NDB source]

Here is a scenario where node recovery may fail:

1. Node 2 stops/fails.
2. While node 2 is down, some table is dropped, and an index is created re-using the same table id.
3. Node 2 is restarted.

During node recovery, the index is not recovered, causing node 2 to fail during restart.

How to repeat:
Using a compiled tree:

$ cd mysql-test
$ perl mysql-test-run.pl --start-and-exit ndb_basic
Servers started, exiting
$ ../client/mysql --socket=var/tmp/master.sock -uroot test
mysql> create table a (a int primary key, b int) engine=ndbcluster;
mysql> create table b (a int primary key) engine=ndbcluster;
mysql> exit
$ ../storage/ndb/tools/ndb_show_tables
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      a
7     UserTable            Online   Yes     test         def      b
1     IndexTrigger         Online   -                             NDB$INDEX_8_CUSTOM
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
8     OrderedIndex         Online   No      sys          def      PRIMARY
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

NDBT_ProgramExit: 0 - OK

$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 stop'
$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 status'
Connected to Management Server at: localhost:9310
Node 2: not connected
$ ../client/mysql --socket=var/tmp/master.sock -uroot test
mysql> drop table b;
mysql> create unique index a_i on a(b);
mysql> exit
$ ../storage/ndb/tools/ndb_show_tables
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
1     IndexTrigger         Online   -                             NDB$INDEX_7_CUSTOM
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      a
7     OrderedIndex         Online   No      sys          def      a_i
3     HashIndexTrigger     Online   -                             NDB$INDEX_8_UPDATE
2     HashIndexTrigger     Online   -                             NDB$INDEX_8_INSERT
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
8     UniqueHashIndex      Online   Yes     sys          def      a_i$unique
4     HashIndexTrigger     Online   -                             NDB$INDEX_8_DELETE
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

NDBT_ProgramExit: 0 - OK

$ (cd var/ndbcluster-9310/ && ../../../storage/ndb/src/kernel/ndbd --no-defaults --core --character-sets-dir=../../../sql/share/charsets)
$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 status'
Connected to Management Server at: localhost:9310
Node 2: not connected
$ cat var/ndbcluster-9310/ndb_2_error.log
Current byte-offset of file-pointer is: 568

Time: Monday 21 August 2006 - 15:27:38
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 10752) 0x0000000a
Program: ../../../storage/ndb/src/kernel/ndbd
Pid: 9568
Trace: /usr/local/mysql/mysql-5.1-tst/mysql-test/var/ndbcluster-9310/ndb_2_trace.log.1
Version: Version 5.1.12 (beta)
***EOM***

Suggested fix:
The problem is this code, in Dbdict::checkSchemaStatus():

    if(!::checkSchemaStatus(oldEntry->m_tableType, c_restartRecord.m_pass))
      continue;
    if(!::checkSchemaStatus(newEntry->m_tableType, c_restartRecord.m_pass))
      continue;

This checks if the table type in question is appropriate for the current recovery pass. However, if the table type on master is handled by a different pass than the table type in the local copy of the schema file, recovery for that table id will never take place.

Different table type should probably be dealt with similar to table ids where the types match, but the version differs.
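
To make the failure mode in the suggested fix concrete, here is a minimal, self-contained C++ sketch of the pass check. It is not the NDB code: `SchemaEntry`, `typeBelongsToPass`, `runPasses`, the two-pass split (tables in pass 0, indexes in pass 1) and the alternative check are all assumptions made for illustration. Only the shape of the two-sided check is taken from the snippet quoted above, and the alternative is not the patch that was committed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for schema file entries (not the NDB types).
enum class TableType { UserTable, OrderedIndex, UniqueHashIndex };

struct SchemaEntry {
  TableType m_tableType;
  unsigned  m_version;
};

// Assumed pass assignment for illustration only: tables in pass 0, indexes in pass 1.
bool typeBelongsToPass(TableType type, int pass) {
  const bool isIndex = (type != TableType::UserTable);
  return pass == (isIndex ? 1 : 0);
}

// Same shape as the quoted check: an id is skipped unless BOTH the local (old)
// and the master (new) table type belong to the current pass.
bool handledByBothCheck(const SchemaEntry& oldE, const SchemaEntry& newE, int pass) {
  if (!typeBelongsToPass(oldE.m_tableType, pass)) return false;
  if (!typeBelongsToPass(newE.m_tableType, pass)) return false;
  return true;
}

// One possible alternative (an assumption, not the committed fix): let the
// master's type alone decide in which pass the id is handled.
bool handledByNewTypeCheck(const SchemaEntry&, const SchemaEntry& newE, int pass) {
  return typeBelongsToPass(newE.m_tableType, pass);
}

using Check = bool (*)(const SchemaEntry&, const SchemaEntry&, int);

// Runs both passes over all table ids and returns, per id, how many passes handled it.
std::vector<int> runPasses(const std::vector<SchemaEntry>& oldFile,
                           const std::vector<SchemaEntry>& newFile,
                           Check check) {
  assert(oldFile.size() == newFile.size());
  std::vector<int> handled(oldFile.size(), 0);
  for (int pass = 0; pass < 2; ++pass) {
    for (std::size_t id = 0; id < oldFile.size(); ++id) {
      if (check(oldFile[id], newFile[id], pass)) ++handled[id];
    }
  }
  return handled;
}
```

With an id that is a `UserTable` in the local schema file and a `UniqueHashIndex` on master, `runPasses` with `handledByBothCheck` reports zero passes for that id, which corresponds to the index never being recovered. With `handledByNewTypeCheck` the same id is handled exactly once.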