| Bug #56890 | Calling typedef my_thread_id in ndbd.cpp | ||
|---|---|---|---|
| Submitted: | 21 Sep 2010 10:18 | Modified: | 23 Sep 2010 9:53 |
| Reporter: | Magnus Blåudd | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
| Version: | 7.0.19 | OS: | Any |
| Assigned to: | Magnus Blåudd | CPU Architecture: | Any |
[22 Sep 2010 9:56]
Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.47-ndb-6.3.38 (revid:magnus.blaudd@sun.com-20100922091646-jn4yk85oaufflxhi) (version source revid:magnus.blaudd@sun.com-20100922091646-jn4yk85oaufflxhi) (merge vers: 5.1.47-ndb-6.3.38) (pib:21)
[22 Sep 2010 9:56]
Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.47-ndb-7.0.19 (revid:magnus.blaudd@sun.com-20100922092208-uwlok5g3i83urdwt) (version source revid:magnus.blaudd@sun.com-20100922092208-uwlok5g3i83urdwt) (merge vers: 5.1.47-ndb-7.0.19) (pib:21)
[22 Sep 2010 11:27]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/118794
[22 Sep 2010 11:28]
Magnus Blåudd
Pushed to 6.3.38, 7.0.19 and 7.1.8
[22 Sep 2010 11:28]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/118800
[23 Sep 2010 9:53]
Jon Stephens
Documented in the NDB-6.3.38, 7.0.19, and 7.1.8 changelogs, as follows:
An error in program flow could result in data node shutdown
routines being called multiple times.
Closed.
[29 Sep 2010 10:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/119379

3288 Martin Skold 2010-09-29 [merge]
Merge
removed:
  cluster_change_hist.txt
modified:
  mysql-test/collections/default.experimental
  mysql-test/suite/ndb/r/ndb_database.result
  mysql-test/suite/ndb/t/ndb_database.test
  sql/ha_ndbcluster.cc
  sql/ha_ndbcluster.h
  sql/ha_ndbcluster_binlog.cc
  sql/handler.cc
  sql/handler.h
  sql/sql_show.cc
  sql/sql_table.cc
  storage/ndb/include/kernel/GlobalSignalNumbers.h
  storage/ndb/include/kernel/signaldata/FsReadWriteReq.hpp
  storage/ndb/include/mgmapi/mgmapi.h
  storage/ndb/include/ndbapi/NdbDictionary.hpp
  storage/ndb/src/kernel/blocks/ERROR_codes.txt
  storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
  storage/ndb/src/kernel/blocks/dbdih/DbdihMain.cpp
  storage/ndb/src/kernel/blocks/dblqh/Dblqh.hpp
  storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
  storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
  storage/ndb/src/kernel/blocks/dbtup/DbtupIndex.cpp
  storage/ndb/src/kernel/blocks/dbtup/DbtupMeta.cpp
  storage/ndb/src/kernel/blocks/dbtux/Dbtux.hpp
  storage/ndb/src/kernel/blocks/dbtux/DbtuxBuild.cpp
  storage/ndb/src/kernel/blocks/dbtux/DbtuxMaint.cpp
  storage/ndb/src/kernel/blocks/dbtux/DbtuxNode.cpp
  storage/ndb/src/kernel/blocks/dbtux/DbtuxTree.cpp
  storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.cpp
  storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.hpp
  storage/ndb/src/kernel/blocks/ndbfs/Ndbfs.cpp
  storage/ndb/src/kernel/blocks/ndbfs/Ndbfs.hpp
  storage/ndb/src/kernel/blocks/ndbfs/VoidFs.cpp
  storage/ndb/src/kernel/blocks/suma/Suma.cpp
  storage/ndb/src/kernel/blocks/suma/Suma.hpp
  storage/ndb/src/kernel/main.cpp
  storage/ndb/src/ndbapi/DictCache.cpp
  storage/ndb/src/ndbapi/DictCache.hpp
  storage/ndb/src/ndbapi/NdbDictionary.cpp
  storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp
  storage/ndb/src/ndbapi/NdbDictionaryImpl.hpp
  storage/ndb/test/include/NdbRestarter.hpp
  storage/ndb/test/ndbapi/testIndex.cpp
  storage/ndb/test/ndbapi/testRestartGci.cpp
  storage/ndb/test/ndbapi/testSystemRestart.cpp
  storage/ndb/test/run-test/daily-basic-tests.txt
  storage/ndb/test/src/NdbRestarter.cpp

Description:
We are calling a typedef in ndbd.cpp :)

In my_pthread.h:

    typedef ulong my_thread_id;

and in ndbd.cpp:

    handler_error(int signum)
    {
      static long thread_id = 0;
      if (thread_id != 0 && thread_id == my_thread_id()) <<<
      {
        // Shutdown thread received signal
        kill own process
        enter endless loop
      }
      thread_id = my_thread_id(); <<<<
    }

It's interesting to see that this code compiles. :)

The printouts below show that zero is assigned to thread_id, which means the guard against getting a signal inside the signal handler does not work.

    handler_error, signum: 6
    thread_id: 0
    thread_id: 0
    2010-09-21 12:48:21 [ndbd] INFO -- Received signal 6. Running error handler.
    2010-09-21 12:48:21 [ndbd] INFO -- Signal 6 received; Aborted
    2010-09-21 12:48:21 [ndbd] INFO -- ndbd.cpp

How to repeat:
Manual code inspection and printouts.

Suggested fix:
OK, so how do we solve this so that only one shutdown takes place?

1) Install the default signal handler for all signals as the first step in 'handler_error'. This would mean that if the shutdown code triggers another signal, the process would exit() or abort(), so we might get a core indicating where the second problem occurred.

2) Use the already existing theShutdownMutex. We should move all code for it into ndbd.cpp and let its creation and destruction be handled in the same place where it is used. This makes it possible to create the mutex before allowing shutdown or installing the signal handler, which removes the uncertainty about whether the mutex has been created and also removes the need to know which thread id is running handler_error or shutdown. I think we could actually give the mutex file-scope (static) storage and remove the need for it to be created/destroyed.
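To illustrate why the quoted code compiles yet never detects re-entry, here is a minimal C++ sketch. It assumes `ulong` is `unsigned long`. The function names `broken_guard_detects_reentry` and `begin_shutdown_once` and the variable `shutdown_started` are hypothetical and do not appear in ndbd.cpp. The second half swaps the `theShutdownMutex` idea for a file-scope `std::atomic<bool>`, purely as one possible way to let only the first caller proceed; it is not the fix that was committed for this bug.

```cpp
#include <atomic>
#include <cassert>

// Same shape as the declaration quoted from my_pthread.h
// (assumption: ulong is unsigned long).
typedef unsigned long my_thread_id;

// `my_thread_id()` is not a function call. It value-initializes a temporary
// of the typedef'd integer type, and that temporary is always zero.
static_assert(my_thread_id() == 0, "value-initialized integer is zero");

// Hypothetical stand-in for the guard quoted from handler_error().
// Returns true if the guard believes it has been re-entered.
bool broken_guard_detects_reentry() {
    static long thread_id = 0;
    if (thread_id != 0 && thread_id == static_cast<long>(my_thread_id()))
        return true;  // unreachable in practice: thread_id never leaves 0
    thread_id = static_cast<long>(my_thread_id());  // always stores 0
    return false;
}

// Hypothetical alternative: a flag with file-scope storage needs no explicit
// creation or destruction, so it exists before any handler is installed.
static std::atomic<bool> shutdown_started{false};

// Returns true for the first caller only; every later caller gets false.
bool begin_shutdown_once() {
    return !shutdown_started.exchange(true);
}
```

Calling `broken_guard_detects_reentry()` any number of times keeps returning false, which matches the `thread_id: 0` printouts in the report. With `begin_shutdown_once()`, a handler could run the shutdown path only when the call returns true and otherwise fall back to the default signal action.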