Bug #20872 ndb_autodiscover3 fails randomly
Submitted: 5 Jul 2006 19:20
Modified: 15 Sep 2007 11:10
Reporter: Tomas Ulin
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1
OS: Any
Assigned to: li zhou
CPU Architecture: Any
Tags: pbfail

[5 Jul 2006 19:20] Tomas Ulin
Description:
ndb_autodiscover3              [ fail ]

Errors are (from /dev/shm/var-n_stm-5/log/mysqltest-time) :
mysqltest: At line 44: query 'select * from t2' failed with wrong errno 2003: 'Can't connect to MySQL server on '127.0.0.1' (111)', instead of 1146...
(the last lines may be the most important ones)
Result from queries before failure can be found in r/ndb_autodiscover3.log

How to repeat:
.
[7 Jul 2006 17:45] Tomas Ulin
Should be OK now with the other fixes. Please reopen if this is still an issue.
[6 Jul 2007 16:22] Timothy Smith
This may be related, and is happening currently.

The failure appears to show up here first (July 3, 2007): https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=mysql-5.1-new-ndb&order=518

A similar error did show up a few times a year ago (June 28, 2006), but then was not present for a year, so I discount that.

mysqltest: At line 103: query 'drop table t2' failed: 2013: Lost connection to MySQL server during query

I see a valgrind warning on ndb_autodiscover3 here: https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=mysql-5.1-marvel&order=63

I don't know if the valgrind warning is related to the other failure, but I suspect it's likely.
[6 Jul 2007 16:25] Timothy Smith
There are several similar stack traces; here is one of them, for easy reference:

VALGRIND: 'Invalid read of size 4'
    COUNT: 2
    FUNCTION: ndbcluster_free_share(st_ndbcluster_share**,
    FILES:    master.err
    TESTS:    ndb_autodiscover3
    STACK: at 0x7D6828: ndbcluster_free_share(st_ndbcluster_share**, bool) (ha_ndbcluster.cc:8278)
             by 0x7D69DA: ha_ndbcluster::close() (ha_ndbcluster_binlog.h:214)
             by 0x679964: closefrm(st_table*, bool) (table.cc:1859)
             by 0x672B2B: intern_close_table(st_table*) (sql_base.cc:814)
             by 0x672BC7: free_cache_entry(st_table*) (sql_base.cc:833)
             by 0x9E2E6D: hash_delete (hash.c:527)
             by 0x66ABAC: close_thread_table(THD*, st_table**) (sql_base.cc:1275)
             by 0x66D87A: close_thread_tables(THD*, bool, bool) (sql_base.cc:1231)
             by 0x6B1E06: Prepared_statement::cleanup_stmt() (sql_prepare.cc:2800)
             by 0x6B4419: Prepared_statement::prepare(char const*, unsigned) (sql_prepare.cc:2922)
             by 0x6B4AD7: mysql_stmt_prepare(THD*, char const*, unsigned) (sql_prepare.cc:1934)
             by 0x639C32: dispatch_command(enum_server_command, THD*, char*, unsigned) (sql_parse.cc:881)
             by 0x63AA3C: do_command(THD*) (sql_parse.cc:668)
             by 0x62B815: handle_one_connection (sql_connect.cc:1094)
             by 0x4B2A192: start_thread (in /lib64/libpthread-2.4.so)
             by 0x51A147C: clone (in /lib64/libc-2.4.so)
           Address 0x8E0B358 is 440 bytes inside a block of size 576 free'd
             at 0x4A2066B: free (vg_replace_malloc.c:233)
             by 0x9D990A: my_no_flags_free (my_malloc.c:59)
             by 0x7D20EB: ndbcluster_real_free_share(st_ndbcluster_share**) (ha_ndbcluster.cc:8264)
             by 0x7D6C7B: handle_trailing_share(st_ndbcluster_share*) (ha_ndbcluster.cc:8013)
             by 0x7FE769: ndbcluster_create_binlog_setup(Ndb*, char const*, unsigned, char const*, char const*, char) (ha_ndbcluster_binlog.cc:2549)
             by 0x7E8384: ha_ndbcluster::create(char const*, st_table*, st_ha_create_information*) (ha_ndbcluster.cc:5051)
             by 0x72050C: ha_create_table_from_engine(THD*, char const*, char const*) (handler.cc:2686)
             by 0x7D639A: ndb_create_table_from_engine(THD*, char const*, char const*) (ha_ndbcluster.cc:6737)
             by 0x7F95DB: ndbcluster_setup_binlog_table_shares(THD*) (ha_ndbcluster_binlog.cc:862)
             by 0x7DF0B2: ndb_util_thread_func (ha_ndbcluster.cc:9068)
             by 0x4B2A192: start_thread (in /lib64/libpthread-2.4.so)
             by 0x51A147C: clone (in /lib64/libc-2.4.so)
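
Read together, the two stacks above describe a use-after-free: a connection thread reads a share inside ndbcluster_free_share() while closing a table, but the same block was already released through handle_trailing_share() on the ndb_util thread. As a generic illustration of that failure class, and not the server's code or the fix that was later committed, the sketch below shows how shared ownership keeps an object alive while any holder still references it. The names Share, ShareRegistry, and read_after_drop are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a per-table share; NOT the real st_ndbcluster_share.
struct Share {
    std::string key;
    explicit Share(std::string k) : key(std::move(k)) {}
};

// Hypothetical registry of shares keyed by table path.
class ShareRegistry {
public:
    // Return the existing share for `key`, or create one. The caller gets
    // its own reference, so the object outlives removal from the registry.
    std::shared_ptr<Share> acquire(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = shares_.find(key);
        if (it != shares_.end()) return it->second;
        auto share = std::make_shared<Share>(key);
        shares_.emplace(key, share);
        return share;
    }

    // Drop only the registry's reference (loosely analogous to cleaning up
    // a stale share). Other holders keep the object alive.
    void drop(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        shares_.erase(key);
    }

    bool contains(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        return shares_.count(key) != 0;
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::string, std::shared_ptr<Share>> shares_;
};

// Replays the ordering seen in the trace: one party still holds the share
// while another removes it from the registry, then the holder reads it.
std::string read_after_drop() {
    ShareRegistry registry;
    const std::string key = "./mysql/ndb_apply_status";
    std::shared_ptr<Share> held = registry.acquire(key);  // e.g. an open table
    registry.drop(key);                                   // e.g. cleanup elsewhere
    assert(!registry.contains(key));
    assert(held.use_count() == 1);  // only the holder's reference remains
    return held->key;               // still valid: no read of freed memory
}
```

Calling read_after_drop() returns the key even though the registry entry is gone. With a raw pointer and a manual free, the same ordering would read freed memory, which is the situation Valgrind reports above.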
[15 Jul 2007 10:40] Ingo Strüwing
The following appears in master.err:

master.err: ndb.ndb_autodiscover3: 070710 20:18:49 [ERROR] NDB: CREATE TABLE IF NOT EXISTS mysql.ndb_apply_status ( server_id INT UNSIGNED NOT NULL, epoch BIGINT UNSIGNED NOT NULL,  log_name VARCHAR(255) BINARY NOT NULL,  start_pos BIGINT UNSIGNED NOT NULL,  end_pos BIGINT UNSIGNED NOT NULL,  PRIMARY KEY USING HASH (server_id) ) ENGINE=NDB: error Table 'ndb_apply_status' already exists 1050(ndb: 0) 1 1
master.err: ndb.ndb_autodiscover3: 070710 20:18:49 [ERROR] Unable to get table share for ./mysql/ndb_apply_status, error=1

This appears in master1.err:
master1.err: ndb.ndb_autodiscover3: 070710 20:18:26 [ERROR] NDB: CREATE TABLE IF NOT EXISTS mysql.ndb_schema ( db VARBINARY(63) NOT NULL, name VARBINARY(63) NOT NULL, slock BINARY(32) NOT NULL, query BLOB NOT NULL, node_id INT UNSIGNED NOT NULL, epoch BIGINT UNSIGNED NOT NULL, id INT UNSIGNED NOT NULL, version INT UNSIGNED NOT NULL, type INT UNSIGNED NOT NULL, PRIMARY KEY USING HASH (db,name) ) ENGINE=NDB: error Can't create table 'mysql.ndb_schema' (errno: 156) 1005(ndb: 0) 1 1

Seen in several pushbuild warnings files. The above is from pb-valgrind.
A non-empty warnings file stops the scripts and requires manual action.

NOTE: I am going to disable ndb_autodiscover3. Please re-enable it when these problems are fixed.
[30 Aug 2007 9:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33428

ChangeSet@1.2546, 2007-08-30 10:41:19+02:00, tomas@whalegate.ndb.mysql.com +2 -0
  Bug#20872 master*.err: miscellaneous error messages
[30 Aug 2007 9:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33430

ChangeSet@1.2548, 2007-08-30 11:46:30+02:00, tomas@whalegate.ndb.mysql.com +3 -0
  Bug#20872 master*.err: miscellaneous error messages
[5 Sep 2007 17:10] Tomas Ulin
The test has been put into a separate suite so as not to disturb pushbuild.

This failure is an edge case, and there are more important bugs to fix.
[14 Sep 2007 16:26] Bugs System
Pushed into 5.1.23-beta
[15 Sep 2007 11:10] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.1.22-ndb-6.2.5 and 5.1.23 changelogs as follows:

    An issue with the mysql.ndb_apply_status table could cause NDB schema
    autodiscovery to fail in certain rare circumstances.