Bug #46955 Slave thread starting before cluster storage engine is ready on mysqld restart
Submitted: 27 Aug 2009 13:46 Modified: 28 Oct 2009 17:03
Reporter: Hartmut Holzgraefe Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Cluster: Replication Severity:S3 (Non-critical)
Version:mysql-5.1-telco-7.0 OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: 5.1.34-ndb-7.0.6-cluster-gpl

[27 Aug 2009 13:46] Hartmut Holzgraefe
Description:
Looks as if sometimes the replication SQL thread is started before the cluster storage engine is fully ready to apply change events received from the replication master.

See e.g. the mysqld error log snippet below where the SQL thread was started
while the cluster tables were still read only and only became writable a second later:

090826 15:45:35 [Note] /usr/local/mysql/sbin/mysqld: ready for connections.
Version: '5.1.34-ndb-7.0.6-cluster-gpl' socket: 'sqlnode.sock' port: 3307 MySQL Cluster Server (GPL)
090826 15:45:35 [Note] Slave SQL thread initialized, starting replication in log '.000004' at position 309905, relay log './sqlnode.000005' position: 244
090826 15:45:35 [Note] Slave I/O thread: connected to master 'ipworks@127.0.0.1:3306',replication started in log '.000004' at position 309905
090826 15:45:36 [Note] table './database_name/table_name' opened read only
090826 15:45:36 [ERROR] Slave SQL: Error 'Table 'table_name' is read only' on opening tables, Error_code: 1036
090826 15:45:36 [Warning] Slave: Table 'table_name' is read only Error_code: 1036
090826 15:45:36 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log '.000004'
position 309905
090826 15:45:36 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_schema
090826 15:45:36 [Note] NDB Binlog: logging ./mysql/ndb_schema (UPDATED,USE_WRITE)
090826 15:45:36 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_apply_status
090826 15:45:36 [Note] NDB Binlog: logging ./mysql/ndb_apply_status (UPDATED,USE_WRITE)
2009-08-26 15:45:37 [NdbApi] INFO -- Flushing incomplete GCI:s < 1422/18
2009-08-26 15:45:37 [NdbApi] INFO -- Flushing incomplete GCI:s < 1422/18
090826 15:45:37 [Note] NDB Binlog: starting log at epoch 1422/18
090826 15:45:37 [Note] NDB Binlog: ndb tables writable

How to repeat:
... no idea yet ...

Suggested fix:
Have the server check that all storage engines are fully initialized befor starting the slave SQL thread on a mysqld server start.
[31 Aug 2009 11:22] Hartmut Holzgraefe
How to reproduce:

- create a master and a slave cluster and set up replication (1 replication channel is sufficient)
- create 100 cluster tables on the master
- check that the slave has replicated all those create table event
- shut down the slave SQL node
- insert a row into one of the cluster tables on the master
- start the slave SQL node again

=> see how it tries to already apply replication events while the ndb tables are still read-only ...
[23 Oct 2009 14:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87968

3026 Frazer Clement	2009-10-23
      Bug#46955 Slave thread starting before cluster storage engine is ready on mysqld restart
      modified:
        mysql-test/include/have_multi_ndb.inc
        mysql-test/include/ndb_master-slave.inc
        mysql-test/include/ndb_master-slave_2ch.inc
        mysql-test/suite/ndb/my.cnf
        mysql-test/suite/ndb/t/ndb_discover_db.test
        mysql-test/suite/ndb_binlog/my.cnf
        mysql-test/suite/ndb_binlog/t/ndb_binlog_variants.cnf
        mysql-test/suite/ndb_team/my.cnf
        mysql-test/suite/rpl_ndb/my.cnf
        mysql-test/suite/rpl_ndb/t/disabled.def
        sql/ha_ndbcluster.cc
        sql/ha_ndbcluster_binlog.cc
        sql/mysql_priv.h
        sql/mysqld.cc
        sql/slave.cc
[27 Oct 2009 10:50] Frazer Clement
Pushed to 
6.2.19
6.3.28
7.0.9
7.1.0
[27 Oct 2009 13:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/88358

3034 Frazer Clement	2009-10-27
      Bug#46955 : Extend fix to libmysqld
      modified:
        libmysqld/lib_sql.cc
        mysql-test/include/have_multi_ndb.inc
[28 Oct 2009 17:03] Jon Stephens
This bug is a duplicate of BUG#40679, not the other way round.