Bug #49515 SIGHUP to mysqld with ndb causes SIGSEGV
Submitted: 7 Dec 2009 16:18 Modified: 9 Dec 2009 12:20
Reporter: Yoshinori Matsunobu
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 7.0.9 OS: Solaris
Assigned to: Jonas Oreland CPU Architecture: Any

[7 Dec 2009 16:18] Yoshinori Matsunobu
Description:
When sending SIGHUP to mysqld with ndbcluster and log-bin, mysqld crashes with SIGSEGV, instead of refreshing log files. 

Here is a stack trace I got.
-----------------  lwp# 29 / thread# 29  --------------------
 fffffd7ffed7cb8a _lwp_kill () + a
 0000000000a8d399 my_write_core () + 21
 00000000007b414e handle_segfault () + 1de
 fffffd7ffed77386 __sighndlr () + 6
 fffffd7ffed6bc32 call_user_handler () + 252
 fffffd7ffed6be4e sigacthandler (b, 0, fffffd7ffe90ea00) + de
 --- called from signal handler with signal 11 (SIGSEGV) ---
 0000000000970e3b __1cWndbcluster_binlog_wait6FpnDTHD__v_ () + 6b
 000000000097135b __1cVndbcluster_flush_logs6FpnKhandlerton__b_ () + 1b
 00000000008bc0ee __1cQflush_handlerton6FpnDTHD_pnNst_plugin_int_pv_c_ () + 1e
 0000000000941758 __1cYplugin_foreach_with_mask6FpnDTHD_pF1pnNst_plugin_int_pv_ciI4_b_ () + 4e8
 00000000008bc150 __1cNha_flush_logs6FpnKhandlerton__b_ () + 50
 00000000007cb81f __1cUreload_acl_and_cache6FpnDTHD_LpnKTABLE_LIST_pb_b_ () + 15f
 00000000007b4784 signal_hand () + 214
 fffffd7ffed7704b _thr_setup () + 5b
 fffffd7ffed77280 _lwp_start ()

reload_acl_and_cache is called here, from the SIGHUP case in signal_hand, with a null THD:

    case SIGHUP:
      if (!abort_loop)
      {
        bool not_used;
        mysql_print_status();           // Print some debug info
        reload_acl_and_cache((THD*) 0,
                             (REFRESH_LOG | REFRESH_TABLES | REFRESH_FAST |
                              REFRESH_GRANT |
                              REFRESH_THREADS | REFRESH_HOSTS),
                             (TABLE_LIST*) 0, &not_used); // Flush logs

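In ndbcluster_binlog_wait, the while condition reads thd->killed without a null check, so mysqld gets SIGSEGV when thd is zero, even though the earlier uses of thd in the same function are guarded.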

static void ndbcluster_binlog_wait(THD *thd)
{
  if (ndb_binlog_running)
  {
    DBUG_ENTER("ndbcluster_binlog_wait");
    ulonglong wait_epoch= ndb_get_latest_trans_gci();
    /*
      cluster not connected or no transactions done
      so nothing to wait for
    */
    if (!wait_epoch)
      DBUG_VOID_RETURN;
    const char *save_info= thd ? thd->proc_info : 0;
    int count= 30;
    if (thd)
      thd->proc_info= "Waiting for ndbcluster binlog update to "
        "reach current position";
    pthread_mutex_lock(&injector_mutex);
    while (!thd->killed && count && ndb_binlog_running &&
           (ndb_latest_handled_binlog_epoch == 0 ||
            ndb_latest_handled_binlog_epoch < wait_epoch))

How to repeat:
1. Run mysqld with ndbcluster and log-bin
2. kill -s HUP <pid of mysqld>

Suggested fix:
Strict null checking for THD is required in ndbcluster_binlog_wait.
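
A minimal sketch of what such a null-safe wait loop could look like. This is not the committed patch: MockThd, MockBinlogState and binlog_wait_sketch are hypothetical stand-ins for THD, the injector state and ndbcluster_binlog_wait, and the condition-variable wait on injector_mutex is replaced by a simulated epoch advance so the example runs on its own.

```cpp
#include <cstdint>

// Hypothetical stand-in for the server's THD; only the fields used here.
struct MockThd {
  bool killed = false;
  const char *proc_info = nullptr;
};

// Hypothetical stand-in for ndb_binlog_running and
// ndb_latest_handled_binlog_epoch.
struct MockBinlogState {
  bool running = true;
  std::uint64_t latest_handled_epoch = 0;
};

// Returns the number of wait rounds performed.
// thd may be null, as it is when called from the SIGHUP handler thread.
int binlog_wait_sketch(MockThd *thd, MockBinlogState &state,
                       std::uint64_t wait_epoch, int max_rounds = 30)
{
  // Binlog not running, or no transactions done: nothing to wait for.
  if (!state.running || wait_epoch == 0)
    return 0;

  const char *save_info = thd ? thd->proc_info : nullptr;
  if (thd)
    thd->proc_info = "Waiting for ndbcluster binlog update to "
                     "reach current position";

  int count = max_rounds;
  int rounds = 0;
  // The fix: test thd before reading thd->killed.
  while (!(thd && thd->killed) && count && state.running &&
         (state.latest_handled_epoch == 0 ||
          state.latest_handled_epoch < wait_epoch))
  {
    // Simulation only: the real code would wait on a condition variable
    // here; this pretends the binlog thread advanced by one epoch.
    ++state.latest_handled_epoch;
    --count;
    ++rounds;
  }

  if (thd)
    thd->proc_info = save_info;   // restore the previous status text
  return rounds;
}
```

Calling binlog_wait_sketch(nullptr, state, wait_epoch) corresponds to the SIGHUP path above: with the guard in place the loop runs until the epoch is reached or the round limit expires, instead of dereferencing a null pointer.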
[8 Dec 2009 8:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93124

3183 Jonas Oreland	2009-12-08
      ndb - bug#49515 - be careful in ndbcluster_binlog_wait, as THD might be null is called from signal handler
[8 Dec 2009 9:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93127

3184 Jonas Oreland	2009-12-08
      ndb - bug#49515 - be careful in ndbcluster_binlog_wait, as THD might be null is called from signal handler
[8 Dec 2009 9:14] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091208090936-qlflnt06fc37a1kw) (version source revid:jonas@mysql.com-20091208090936-qlflnt06fc37a1kw) (merge vers: 5.1.39-ndb-7.1.0) (pib:13)
[8 Dec 2009 9:19] Jonas Oreland
pushed to 6.3.29 and 7.0.10
[8 Dec 2009 13:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93199

3181 Martin Skold	2009-12-08 [merge]
      Merge
      modified:
        configure.in
        mysql-test/std_data/ndb_config_mycnf1.cnf
        mysql-test/suite/ndb/r/ndb_config.result
        mysql-test/suite/ndb/t/ndb_config.test
        sql/ha_ndbcluster_binlog.cc
        storage/ndb/src/kernel/blocks/dbdih/DbdihMain.cpp
        storage/ndb/src/kernel/blocks/dblqh/Dblqh.hpp
        storage/ndb/src/kernel/blocks/dblqh/DblqhInit.cpp
        storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
        storage/ndb/src/ndbapi/Ndb.cpp
        storage/ndb/test/ndbapi/testBlobs.cpp
[9 Dec 2009 12:20] Jon Stephens
Documented bugfix in the NDB-6.3.29 and 7.0.10 changelogs as follows:

        Sending SIGHUP to a mysqld running with the --ndbcluster and
        --log-bin options caused the process to crash instead of
        refreshing its log files.

Closed.