Bug #21756 Node recovery missing version check for ALTER_TABLE_COMMITTED table
Submitted: 21 Aug 2006 14:03 Modified: 2 Nov 2006 8:54
Reporter: Kristian Nielsen
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 4.1,5.0,5.1 OS: Linux (Linux)
Assigned to: Jonas Oreland CPU Architecture: Any

[21 Aug 2006 14:03] Kristian Nielsen
Description:
The node recovery algorithm is missing a version check for tables in the ALTER_TABLE_COMMITTED state (as opposed to the TABLE_ADD_COMMITTED state which has the version check).

This can cause inconsistent schema across nodes after node recovery:

1. Table A is in the ALTER_TABLE_COMMITTED state (it has been altered once).
2. Node 2 shuts down/fails.
3. While node 2 is down, table A is altered again.
4. When node 2 recovers, it uses the old definition for A, while node 1 uses the new, correct definition.

How to repeat:
In a compiled tree:

$ perl mysql-test-run.pl --start-and-exit ndb_basic
Servers started, exiting

$ ../client/mysql --socket=var/tmp/master.sock  -uroot test
mysql> create table a(a int primary key) engine=ndb;
mysql> alter table a rename to b;
mysql> exit

$ ../storage/ndb/tools/ndb_show_tables 
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      b
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 stop'
Connected to Management Server at: localhost:9310
Node 2: Node shutdown initiated
Node 2: Node shutdown completed.
Node 2 has shutdown.
$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 status'
Connected to Management Server at: localhost:9310
Node 2: not connected

$ ../client/mysql --socket=var/tmp/master.sock  -uroot test
mysql> alter table b rename to c;
mysql> exit

$ ../storage/ndb/tools/ndb_show_tables 
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      c
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

$ (cd var/ndbcluster-9310/ && ../../../storage/ndb/src/kernel/ndbd --no-defaults --core --character-sets-dir=../../../sql/share/charsets)
$ ../storage/ndb/src/mgmclient/ndb_mgm -e '2 status'
Connected to Management Server at: localhost:9310
Node 2: started (Version 5.1.12)

$ ../storage/ndb/src/mgmclient/ndb_mgm -e '1 stop'
Connected to Management Server at: localhost:9310
Node 1: Node shutdown initiated
Node 1: Node shutdown completed.
Node 1 has shutdown.

$ ../storage/ndb/src/mgmclient/ndb_mgm -e '1 status'
Connected to Management Server at: localhost:9310
Node 1: not connected

$ ../storage/ndb/tools/ndb_show_tables 
id    type                 state    logging database     schema   name
3     UserTable            Online   Yes     cluster      def      NDB$BLOB_2_3
6     OrderedIndex         Online   No      sys          def      PRIMARY
1     SystemTable          Online   Yes     sys          def      NDB$EVENTS_0
5     UserTable            Online   Yes     test         def      b
2     UserTable            Online   Yes     cluster      def      schema
4     UserTable            Online   Yes     cluster      def      apply_status
0     SystemTable          Online   Yes     sys          def      SYSTAB_0
0     IndexTrigger         Online   -                             NDB$INDEX_6_CUSTOM

[After recovering node 2 and stopping node 1, we see that node 2 has the wrong definition: the table still has the old name 'b', not the correct new name 'c'.]

Suggested fix:
In the function Dbdict::checkSchemaStatus(), the case for state TABLE_ADD_COMMITTED has this version check:

	if(newEntry->m_tableVersion == oldEntry->m_tableVersion)

A similar check is missing for state ALTER_TABLE_COMMITTED, and should be added.
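
To make the intent concrete, here is a minimal standalone sketch of the decision, not the actual Dbdict code. Only the two state names and the m_tableVersion field come from the text above; the SchemaEntry struct, the TableState enum, the canReuseLocalDefinition() helper and the version numbers 1 and 2 are hypothetical. It assumes that newEntry is the entry the restarting node receives from the running cluster, that oldEntry is its own stored entry, and that equal versions mean the stored definition is still current.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for the schema file entry and its states.
// Only the state names and m_tableVersion are taken from the bug report.
enum class TableState { TABLE_ADD_COMMITTED, ALTER_TABLE_COMMITTED };

struct SchemaEntry {
    TableState    m_tableState;
    std::uint32_t m_tableVersion;
};

// Hypothetical helper: returns true when the restarting node may keep its
// stored table definition. Both committed states get the same version
// comparison; a mismatch means the stored definition is stale and has to be
// replaced by the one the running cluster holds (assumption, see lead-in).
constexpr bool canReuseLocalDefinition(const SchemaEntry& newEntry,
                                       const SchemaEntry& oldEntry) {
    return (newEntry.m_tableState == TableState::TABLE_ADD_COMMITTED ||
            newEntry.m_tableState == TableState::ALTER_TABLE_COMMITTED) &&
           newEntry.m_tableVersion == oldEntry.m_tableVersion;
}

// Scenario from this report, with illustrative version numbers: node 2 stored
// version 1 (table 'b'), and the table was altered to version 2 (table 'c')
// while node 2 was down. The stored definition must not be reused.
static_assert(!canReuseLocalDefinition(
                  SchemaEntry{TableState::ALTER_TABLE_COMMITTED, 2},
                  SchemaEntry{TableState::ALTER_TABLE_COMMITTED, 1}),
              "stale definition must be rejected");

// No alter happened while the node was down: versions match, reuse is fine.
static_assert(canReuseLocalDefinition(
                  SchemaEntry{TableState::ALTER_TABLE_COMMITTED, 2},
                  SchemaEntry{TableState::ALTER_TABLE_COMMITTED, 2}),
              "unchanged definition may be reused");
```

Without the comparison in the ALTER_TABLE_COMMITTED case, the first situation would be treated like the second, which matches the stale name 'b' seen on node 2 above.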
[15 Sep 2006 9:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/12001

ChangeSet@1.2545, 2006-09-15 11:34:06+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - bug#21756
    Fix for alter table when node is down...that could cause pain and misery
[10 Oct 2006 18:40] Jonas Oreland
pushed into 5.1.12
[1 Nov 2006 14:27] Jonas Oreland
pushed into 4.1.22
[1 Nov 2006 14:43] Jonas Oreland
pushed into 5.0.29
[2 Nov 2006 8:54] Jon Stephens
Thank you for your bug report. A fix for this issue has been committed to the source repository for that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented fix for 4.1.22/5.0.29/5.1.12.
[4 Nov 2006 3:23] Jon Stephens
*Fix for 5.0 documented in 5.0.30 Release Notes.*