MySQL Bugs: #53580: abort() in multi-threaded index rebuild on node restart

Bug #53580	abort() in multi-threaded index rebuild on node restart
Submitted:	11 May 2010 17:40	Modified:	25 May 2010 9:45
Reporter:	Hartmut Holzgraefe
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	mysql-5.1-telco-6.2	OS:	Linux
Assigned to:	Jonas Oreland	CPU Architecture:	Any
Tags:	mysql-cluster-6.3.32

Description:
The crash happens in Dbtux::mt_buildIndexFragment_wrapper(), on line 49 of
storage/ndb/src/kernel/blocks/dbtux/DbtuxBuild.cpp:

 48     if (!(UintPtr(ptr) - UintPtr(req->mem_buffer) <= req->buffer_size))
 49       abort();
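
For readers unfamiliar with the idiom on line 48, here is a minimal standalone sketch of the same bounds check. The `BuildReq` struct and the `within_buffer` helper are hypothetical stand-ins, not the NDB types; only the field names `mem_buffer` and `buffer_size` come from the snippet above. Because the subtraction is done on unsigned integers, a pointer that lies before the buffer wraps around to a huge offset, so the single `<=` comparison rejects pointers on either side of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the request fields named in the report.
struct BuildReq {
  void*       mem_buffer;   // start of the scratch buffer
  std::size_t buffer_size;  // number of bytes the caller believes it was given
};

// Same shape as the check on line 48. The unsigned subtraction wraps around
// when ptr lies before mem_buffer, so one comparison covers both directions.
bool within_buffer(const BuildReq& req, const void* ptr) {
  const std::uintptr_t offset =
      reinterpret_cast<std::uintptr_t>(ptr) -
      reinterpret_cast<std::uintptr_t>(req.mem_buffer);
  return offset <= req.buffer_size;
}
```

If the allocator hands back less memory than the build code assumes, a pointer computed from the assumed size can fail a check of this kind, which would be consistent with the abort() reported here.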

How to repeat:
...

Suggested fix:
?

First occurred after a power failure.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108990

3099 Jonas Oreland	2010-05-24
      ndb - bug#53580 - fix bug that caused alloc(#requested, #min) to sometimes allocate less than #min, causing later problems

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108992

3202 Jonas Oreland	2010-05-24
      ndb - bug#53580 - part II (>= 6.3) - ndbrequire that we got what we asked for during mtoib

Pushed into 5.1.44-ndb-6.3.34 (revid:jonas@mysql.com-20100524080154-74syl9t60ohrfl9j) (version source revid:jonas@mysql.com-20100524080154-74syl9t60ohrfl9j) (merge vers: 5.1.44-ndb-6.3.34) (pib:16)

Pushed into 5.1.44-ndb-7.0.15 (revid:jonas@mysql.com-20100524080447-jl195st9spefjway) (version source revid:jonas@mysql.com-20100524080447-jl195st9spefjway) (merge vers: 5.1.44-ndb-7.0.15) (pib:16)

DOCS: A bug in the internal buddy allocator could cause
"alloc(#wanted, #min)", which should try to allocate #wanted but is allowed to allocate anything between #min and #wanted, to allocate less than #min, causing problems during multi-threaded ordered index build.

Note: this could also theoretically (though it is unlikely) cause
problems in other areas of the code.

pushed to 6.2.19, 6.3.34, 7.0.15 and 7.1.4

Documented bugfix in the NDB-6.2.19, 6.3.34, 7.0.15, and 7.1.4 changelogs, as follows:

      An internal buffer allocator used by NDB has the form 
      'alloc(*wanted*, *minimum*)' and attempts to allocate *wanted* 
      pages, but is permitted to allocate a smaller number of pages 
      between *wanted* and *minimum*. However, this allocator could 
      sometimes allocate fewer than *minimum* pages, causing problems 
      with multi-threaded builds of ordered indexes.
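
The contract described in the changelog entry can be illustrated with a small sketch. This is not the NDB buddy allocator: `PagePool`, its `free_pages` counter, the "return 0 on failure" convention and the `grant_is_valid` helper are all assumptions made for illustration. It shows the intended behaviour (grant between *minimum* and *wanted* pages, or nothing) and a caller-side check in the spirit of the "part II" patch, which verifies that the grant honours the contract.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical toy pool: it only tracks how many free pages remain.
struct PagePool {
  std::size_t free_pages;

  // Try to hand out `wanted` pages. Fewer are acceptable, but never fewer
  // than `minimum`. Returns the number of pages granted, or 0 on failure
  // (an assumed convention for this sketch).
  std::size_t alloc(std::size_t wanted, std::size_t minimum) {
    assert(minimum <= wanted);
    const std::size_t granted = std::min(wanted, free_pages);
    if (granted < minimum) return 0;  // below the floor: grant nothing
    free_pages -= granted;
    return granted;
  }
};

// Caller-side verification that a grant lies within [minimum, wanted].
bool grant_is_valid(std::size_t granted, std::size_t wanted,
                    std::size_t minimum) {
  return granted >= minimum && granted <= wanted;
}
```

With a pool of 10 free pages, `alloc(8, 4)` grants 8 pages; a second `alloc(8, 4)` fails because only 2 pages remain, while `alloc(8, 2)` would succeed with exactly 2. The bug described above corresponds to a grant for which `grant_is_valid` returns false.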

Closed.

Would this explain a rare bug I have seen?

Namely (an arbitrary example):

A unique hash index called index1 exists on a varchar column (say abc) that allows NULL. (Because the index is hash-only, with no btree, NULL is fine.)

A row exists where abc = 'hello world'

An update (made via mysqld) resets abc to NULL (abc = NULL where abc = 'hello world').

Now, when you try to set abc = 'hello world' on another row, you get a duplicate-key error for ''.

I wrote some code using the C++ NDB API and found the following.

I did a readTuple on index1, looking for abc = 'hello world'.

Firstly:

 * A row is found!! this should not be the case.
 * On reading the row, the value for abc = NULL!!

So the data has changed but the index has not been updated.
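
The symptom can be modelled with a toy table and a separate unique index. This is only an illustration of a stale index entry, not NDB code and not a claim about the actual cause: `Row`, `ToyTable` and all of its methods are hypothetical. The faulty update changes the row without touching the index, so a lookup by the old value still finds the row even though its column now reads NULL.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Row {
  bool is_null;     // true when abc is NULL
  std::string abc;  // meaningful only when is_null is false
};

// Toy model: rows keyed by id, plus a unique index on abc (NULLs not indexed).
struct ToyTable {
  std::map<int, Row> rows;            // row id -> row
  std::map<std::string, int> index1;  // abc value -> row id

  void insert(int id, const std::string& abc) {
    assert(index1.count(abc) == 0);   // unique constraint on abc
    rows[id] = Row{false, abc};
    index1[abc] = id;
  }

  // Correct update: drop the index entry, then set the column to NULL.
  void set_null(int id) {
    Row& r = rows.at(id);
    if (!r.is_null) index1.erase(r.abc);
    r = Row{true, ""};
  }

  // Faulty update modelling the symptom: the row changes, the index does not.
  void set_null_stale(int id) { rows.at(id) = Row{true, ""}; }

  // Returns the row id the index points at, or -1 when there is no entry.
  int lookup(const std::string& abc) const {
    std::map<std::string, int>::const_iterator it = index1.find(abc);
    return it == index1.end() ? -1 : it->second;
  }
};
```

After `insert(1, "hello world")` followed by `set_null_stale(1)`, `lookup("hello world")` still returns row 1 and that row reads as NULL. A second insert of 'hello world' would then hit the unique constraint, which matches the duplicate-key error described above.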

Rebuilding the indexes fixes this, but it's a bit of a concern for apps that use this type of index.

I am using mysql cluster 7.0.13.

If this fix would indeed resolve that bug, when is a General Availability (GA) release due? I see that 7.1.3 (the latest GA) doesn't have it yet...

Many Thanks.

Ricky

P.S. If I can re-create this in the lab, I'll post an update on how to reproduce it. It's a strange one: the data gets updated to NULL, but the index entry for it still exists, causing a duplicate-key error where there is no duplicate.