Bug #41936 NDBMTD Segmentation fault in ThreadConfig::init
Submitted: 7 Jan 2009 23:32
Modified: 24 Feb 2009 22:58
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: mysql-5.1-telco-6.4
OS: Linux
Assigned to:
CPU Architecture: Any

[7 Jan 2009 23:32] Jonathan Miller
Description:
Hi,

Was running in mixed mode, where one data node was an ndbd and the other was an ndbmtd.

The ndbmtd produced the following core:

(gdb) bt
#0  seize_buffer (rep=0xaf81a0, thr_no=<value optimized out>, prioa=true) at mt.cpp:911
#1  0x00000000006dda63 in ThreadConfig::init (this=<value optimized out>,
    emulatorData=0xaf7c60) at mt.cpp:2875
#2  0x00000000004813a8 in main (argc=4, argv=<value optimized out>) at main.cpp:595
(gdb) f 0
#0  seize_buffer (rep=0xaf81a0, thr_no=<value optimized out>, prioa=true) at mt.cpp:911
911           jb->m_len = 0;
(gdb) l
906         Uint32 batch = THR_FREE_BUF_MAX / THR_FREE_BUF_BATCH;
907         assert(batch > 0);
908         assert(batch + THR_FREE_BUF_MIN < THR_FREE_BUF_MAX);
909         do {
910           jb = rep->m_free_list.seize();
911           jb->m_len = 0;
912           jb->m_prioa = false;
913           first_free = (first_free ? first_free : THR_FREE_BUF_MAX) - 1;
914           selfptr->m_free_fifo[first_free] = jb;
915           batch--;
(gdb) f 1
#1  0x00000000006dda63 in ThreadConfig::init (this=<value optimized out>,
    emulatorData=0xaf7c60) at mt.cpp:2875
2875      thr_job_buffer *buffer = seize_buffer(rep, thr_no, true);
(gdb) l
2870      selfptr->m_first_unused = 0;
2871
2872      selfptr->m_jba_head.m_read_index = 0;
2873      selfptr->m_jba_head.m_write_index = 0;
2874      selfptr->m_jba.m_head = &selfptr->m_jba_head;
2875      thr_job_buffer *buffer = seize_buffer(rep, thr_no, true);
2876      selfptr->m_jba.m_buffers[0] = buffer;
2877      selfptr->m_jba_read_state.m_read_index = 0;
2878      selfptr->m_jba_read_state.m_read_buffer = buffer;
2879      selfptr->m_jba_read_state.m_read_pos = 0;
(gdb) f 2
#2  0x00000000004813a8 in main (argc=4, argv=<value optimized out>) at main.cpp:595
595       globalEmulatorData.theThreadConfig->init(&globalEmulatorData);
(gdb) l
590       }
591
592       if (get_multithreaded_config(globalEmulatorData))
593         return -1;
594
595       globalEmulatorData.theThreadConfig->init(&globalEmulatorData);
596
597     #ifdef VM_TRACE
598       // Create a signal logger before block constructors
599       char *buf= NdbConfig_SignalLogFileName(globalData.ownId);

How to repeat:
using ACRT, run 2-dn-mt-8-mixed

Note: MaxNoOfExecutionThreads was set to 8.
[15 Jan 2009 13:26] Jonas Oreland
possibly same as http://bugs.mysql.com/bug.php?id=42052
[26 Jan 2009 15:58] Jonathan Miller
Not the same as  http://bugs.mysql.com/bug.php?id=42052

mysql-5.1-telco-6.4 revno: 3224

Running a 2-dn configuration with a mix of ndbd and ndbmtd, I received "Program terminated with signal 11, Segmentation fault."

The backtrace shows that main called globalEmulatorData.theThreadConfig->init(&globalEmulatorData). Inside init, the program terminated at jb->m_len = 0 (mt.cpp:930), the first dereference of the pointer returned by rep->m_free_list.seize():

925         Uint32 batch = THR_FREE_BUF_MAX / THR_FREE_BUF_BATCH;
926         assert(batch > 0);
927         assert(batch + THR_FREE_BUF_MIN < THR_FREE_BUF_MAX);
928         do {
929           jb = rep->m_free_list.seize();
930           jb->m_len = 0;

Program terminated with signal 11, Segmentation fault.
#0  ThreadConfig::init (this=<value optimized out>, emulatorData=<value optimized out>) at mt.cpp:930
930           jb->m_len = 0;
(gdb) bt
#0  ThreadConfig::init (this=<value optimized out>, emulatorData=<value optimized out>) at mt.cpp:930
#1  0x00000000004813c8 in main (argc=4, argv=<value optimized out>) at main.cpp:595

(gdb) f 0
#0  ThreadConfig::init (this=<value optimized out>, emulatorData=<value optimized out>) at mt.cpp:930
930           jb->m_len = 0;
(gdb) l
925         Uint32 batch = THR_FREE_BUF_MAX / THR_FREE_BUF_BATCH;
926         assert(batch > 0);
927         assert(batch + THR_FREE_BUF_MIN < THR_FREE_BUF_MAX);
928         do {
929           jb = rep->m_free_list.seize();
930           jb->m_len = 0;
931           jb->m_prioa = false;
932           first_free = (first_free ? first_free : THR_FREE_BUF_MAX) - 1;
933           selfptr->m_free_fifo[first_free] = jb;
934           batch--;
(gdb) f 1
#1  0x00000000004813c8 in main (argc=4, argv=<value optimized out>) at main.cpp:595
595       globalEmulatorData.theThreadConfig->init(&globalEmulatorData);
(gdb) l
590       }
591
592       if (get_multithreaded_config(globalEmulatorData))
593         return -1;
594
595       globalEmulatorData.theThreadConfig->init(&globalEmulatorData);
596
597     #ifdef VM_TRACE
598       // Create a signal logger before block constructors
599       char *buf= NdbConfig_SignalLogFileName(globalData.ownId);
[26 Jan 2009 15:59] Jonathan Miller
[atrt]
basedir=/data0/cr_autotest/run-2-dn-mt-8-mixed-mysql-5.1-telco-6.4/run
baseport=15000
clusters= .master
mt = 1

[ndb_mgmd]

[mysqld]
skip-grant-tables
skip-innodb
ndb_use_exact_count=0
loose-join_cache_level=6
loose-ndb-cluster-connection-pool=3
loose-ndb_extra_logging=9
loose-engine_condition_pushdown=1
loose-ndb_cache_check_time=1000

[cluster_config]
MaxNoOfExecutionThreads=8
MaxNoOfSavedMessages = 30
NoOfReplicas = 2
DataMemory = 8000M
IndexMemory = 1000M
DiskPageBufferMemory=300MB
DiskCheckpointSpeed=16M
RedoBuffer=200M
NoOfFragmentLogFiles=10
FragmentLogFileSize=512M
InitFragmentLogFiles=FULL
SharedGlobalMemory=384
SendBufferMemory = 2M
MaxNoOfConcurrentOperations = 250000
MaxNoOfLocalOperations = 275000
MaxNoOfConcurrentIndexOperations = 20000
MaxNoOfAttributes=2048
MaxNoOfOrderedIndexes=512
MaxNoOfUniqueHashIndexes=512
LockPagesInMainMemory=1
MemReportFrequency=200
LogLevelCongestion=15
LogLevelStatistic=15

[cluster_config.master]
ndb_mgmd = ndb21
ndbd = ndb21,ndb22
mysqld = ndb18
ndbapi=  ndb18,ndb18,ndb18,ndb18

[cluster_config.ndbd.1.master]
FileSystemPath=/data1/

[cluster_config.ndbd.2.master]
FileSystemPath=/data1/

#
# Generated by atrt
# Fri Jan 23 09:39:56 2009

[mysql_cluster.master]
ndb-connectstring= ndb21:15000

[cluster_config.ndb_mgmd.1.master]
PortNumber= 15000

[mysqld.1.master]
datadir= /data0/cr_autotest/run-2-dn-mt-8-mixed-mysql-5.1-telco-6.4/run/mysqld.1
socket= /data0/cr_autotest/run-2-dn-mt-8-mixed-mysql-5.1-telco-6.4/run/mysqld.1/mysql.sock
port= 15001
ndb-connectstring= ndb21:15000
ndbcluster

[client.1.master]
host= ndb18
socket= /data0/cr_autotest/run-2-dn-mt-8-mixed-mysql-5.1-telco-6.4/run/mysqld.1/mysql.sock
port= 15001
[27 Jan 2009 8:58] Jonas Oreland
Duplicate of http://bugs.mysql.com/bug.php?id=42254, 100% sure.
However, to my knowledge nobody except me has retested that yet.

So keep this open, and close it as a duplicate once retested.
[29 Jan 2009 10:18] Jonas Oreland
Setting status to QA testing, as I consider it fixed.
[24 Feb 2009 22:58] Jonathan Miller
Test is currently passing