MySQL Bugs: #8753: Invalid schema object version after dropping index

Bug #8753	Invalid schema object version after dropping index
Submitted:	23 Feb 2005 22:47	Modified:	11 Mar 2005 13:51
Reporter:	Dean Ellis	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	4.1.11	OS:
Assigned to:	Martin Skold	CPU Architecture:	Any

Description:
Dropping an index from one node causes invalid schema object errors on other nodes, especially if they are accessing the table at the time.

Possibly related to bug 6762, however in that bug retrying the query is a workaround; that does not work in this case.

How to repeat:
node1:

CREATE TABLE t1 ( a INT PRIMARY KEY, b VARCHAR(10), c VARCHAR(10), INDEX (b) ) ENGINE=NDB;
INSERT INTO t1 VALUES (1,'one','one'), (2,'two','two'), (3,'three','three');
CREATE INDEX c ON t1(c); 
ALTER TABLE t1 DROP INDEX c;

On node2, during the CREATE/ALTER statements above:

SELECT * FROM t1 WHERE b = 'two';
SELECT * FROM t1 WHERE b = 'two';
SELECT * FROM t1 WHERE b = 'two';
(etc)

Node 2 will begin reporting:

ERROR 1296 (HY000): Got error 241 'Invalid schema object version' from ndbcluster

Retrying the query does not solve error.

Suggested fix:
n/a

The handler has a stale cache referencing the old version of the index.
Invalidating all indexes as well as the table when schema error is returned
solves the problem.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/22713

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/22714

Patch pushed to mysql-4.1
Will be merged to mysql-5.0 and mysql-5.1

I had the same problem (with 4.1.9) and applied the two patches in this report (to the 4.1.10 sources). Things got worse: now mysqld crashes.

Thread 14 Crashed:
0   mysqld              0x00266b64 Vector<NdbColumnImpl*>::operator[](unsigned) + 0xc
1   mysqld              0x0025bbec NdbIndexScanOperation::readTuples(NdbOperation::LockMode, unsigned, unsigned, bool) + 0x7c
2   mysqld              0x0025bbec NdbIndexScanOperation::readTuples(NdbOperation::LockMode, unsigned, unsigned, bool) + 0x7c
3   mysqld              0x000f9fa0 ha_ndbcluster::ordered_index_scan(st_key_range const*, st_key_range const*, bool, char*) + 0x9c
4   mysqld              0x000e2dfc QUICK_SELECT::get_next() + 0x114
5   mysqld              0x000e4c64 rr_quick(st_read_record*) + 0xa8
6   mysqld              0x000add38 sub_select(JOIN*, st_join_table*, bool) + 0x80
7   mysqld              0x000ada0c do_select(JOIN*, List<Item>*, st_table*, Procedure*) + 0x25c
8   mysqld              0x000a5b94 JOIN::exec() + 0x106c
9   mysqld              0x000a5eb0 mysql_select(THD*, Item***, st_table_list*, unsigned, List<Item>&, Item*, unsigned, st_order*, st_order*, Item*, st_
order*, unsigned long, select_result*, st_select_lex_unit*, st_select_lex*) + 0x1b4
10  mysqld              0x000a2f70 handle_select(THD*, st_lex*, select_result*) + 0x98
11  mysqld              0x00082010 mysql_execute_command(THD*) + 0x664
12  mysqld              0x00085e0c mysql_parse(THD*, char*, unsigned) + 0xd8
13  mysqld              0x00080b8c dispatch_command(enum_server_command, THD*, char*, unsigned) + 0x4f0
14  mysqld              0x0007fdd8 handle_one_connection + 0x2b0
15  libSystem.B.dylib   0x900246e8 _pthread_body + 0x28

This was after recreating the database and its tables with the new mysqld. Only a few inserts and selects led to the above crash. It's not 100% reproduceable, but I got it twice already. Can't manage to make a simple testcase from SQL statements though :(
Maybe it comes from the two simultaneous connections to the DB from two programs, which would explain why a single mysql command doesn't reproduce it...

I'm reporting this here because I also very often get:
query failed: select max(get_seqnum)+1 from _queues.tableA: Got error 241 'Invalid schema object version' from ndbcluster
which is what I was hoping those patches would fix.
So the problem isn't completely fixed... In my case it's not about dropping indexes though, but about dropping the whole table and recreating it.

Was the failure found in the exact same scenario, table, index definitions
and query, or was there something different?
The error I reproduced dissapeared with the fix.
The fix does not totally remove getting the error
Invalid schema object version, but it should only be gotten once
(since the objects are removed from the cache when error
was received).
Looks like you have a pointer to a deleted index object.
The global cache has reference counting, so there should not
be possible to delete an object (table or index) that is still in
use. Could be a problem with handler object recycling between
different threads. Are you reconnecting, or just running the same
two connections and interleaving SQL statements?

Have you verified this problem on a completely rebuilt binary from
the source tree, or only with your manually applied patches?

It was a different scenario (only create table, insert, select, drop table, and redoing it all once again, but with two tables and from two clients)

> The fix does not totally remove getting the error Invalid schema 
> object version, but it should only be gotten once (since the objects 
> are removed from the cache when error was received).
Yes I do get it only once - but even that is a bug, right? It certainly makes the application fail.

I am connecting from two clients, although the other client doesn't touch the table that ends up giving the error. But this could be why I can't reproduce the crash with a single client (e.g. mysql -e).

> Have you verified this problem on a completely rebuilt binary from
> the source tree, or only with your manually applied patches?
This is a completely rebuilt binary: initially I had binary packages for 4.1.9.
Then I wanted to apply this patch so I got the latest sources, 4.1.10, applied the two patches, and compiled everything. So I'm sure I'm using the patched one, since --version says 4.1.10, and it's a clean build. But I could be missing other changes that have been made since 4.1.10.  Thanks.

Tried creating two tables, inserting and selecting from
two clients, the dropping and do it again. Tried reconnecting
and also creating dropping index, but still cannot reproduce.
Do you have the exact sequence of operations?
I will try and make a fix, but since I cannot reproduce myself
it will be hard to verify it.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23294

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23296