Bug #42789 | ndb test programs crash on Solaris 10 compiled with Sun Studio 12 | ||
---|---|---|---|
Submitted: | 12 Feb 2009 13:20 | Modified: | 9 Sep 2009 13:20 |
Reporter: | Guido Ostkamp | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | Tests: Cluster | Severity: | S7 (Test Cases) |
Version: | mysql-5.1-telco-7.0 | OS: | Any |
Assigned to: | Jørgen Austvik | CPU Architecture: | Any |
Tags: | 6.4 -> 6.4.3 |
[12 Feb 2009 13:20]
Guido Ostkamp
[9 Mar 2009 16:41]
Maitrayi Sabaratnam
I could not reproduce the case for the specified version and compiler (I executed the test 500 times in a loop and also ran it through the debugger). We need to know which version the NDB API application was compiled against. Is it the same version as the binaries? Was the test program recompiled when the new binaries were taken into use? Any difference might have caused the problem.
[11 Mar 2009 15:11]
Guido Ostkamp
Hello Maitrayi,

it was all compiled from the same version. I just verified, using the most current bazaar version frazer@mysql.com-20090309160754-r14u7v0om9ajnoii dated Mon 2009-03-09 16:07:54 +0000, that the bug still exists.

I compiled as outlined in my earlier message. Then, in .../storage/ndb/test/ndbapi, I ran 'make flexAsynch'. I took the binary 'flexAsynch' (not the shell script of the same name) from storage/ndb/test/ndbapi/.libs and ran it both directly and in dbx. It still crashes:

$ dbx ./flexAsynch
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
Reading flexAsynch
Reading ld.so.1
Reading libmtmalloc.so.1
Reading libndbclient.so.4.0.0
Reading libpthread.so.1
Reading libthread.so.1
Reading librt.so.1
Reading libgen.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading libm.so.2
Reading libCstd.so.1
Reading libCrun.so.1
Reading libc.so.1
Reading libaio.so.1
Reading libmd.so.1
(dbx) run -t 2
Running: flexAsynch -t 2
(process id 14354)
Reading libc_psr.so.1
t@1 (l@1) signal SEGV (no mapping at the fault address) in NdbOut::endline at line 72 in file "NdbOut.cpp"
   72   m_out->println("");
(dbx) where
current thread: t@1
=>[1] NdbOut::endline(this = ???) (optimized), at 0xffffffff7f2fb680 (line ~72) in "NdbOut.cpp"
  [2] NdbOut::operator<<(this = ???, _f = ???) (optimized), at 0x100009b7c (line ~98) in "NdbOut.hpp"
  [3] main(argc = ???, argv = ???) (optimized), at 0x100004e3c (line ~195) in "flexAsynch.cpp"
(dbx) quit

$ ldd -r ./flexAsynch
libndbclient.so.4 => /export/home/wsch/6.4_2009_01_29/lib/mysql/libndbclient.so.4
libpthread.so.1 => /lib/sparcv9/libpthread.so.1
libthread.so.1 => /lib/sparcv9/libthread.so.1
librt.so.1 => /lib/sparcv9/librt.so.1
libgen.so.1 => /lib/sparcv9/libgen.so.1
libsocket.so.1 => /lib/sparcv9/libsocket.so.1
libmtmalloc.so.1 => /usr/lib/sparcv9/libmtmalloc.so.1
libnsl.so.1 => /lib/sparcv9/libnsl.so.1
libm.so.2 => /lib/sparcv9/libm.so.2
libCstd.so.1 => /usr/lib/sparcv9/libCstd.so.1
libCrun.so.1 => /usr/lib/sparcv9/libCrun.so.1
libc.so.1 => /lib/sparcv9/libc.so.1
libaio.so.1 => /lib/64/libaio.so.1
libmd.so.1 => /lib/64/libmd.so.1
libmp.so.2 => /lib/64/libmp.so.2
libscf.so.1 => /lib/64/libscf.so.1
libdoor.so.1 => /lib/64/libdoor.so.1
libuutil.so.1 => /lib/64/libuutil.so.1
/platform/SUNW,Netra-T2000/lib/sparcv9/libc_psr.so.1
/platform/SUNW,Netra-T2000/lib/sparcv9/libmd_psr.so.1

Our running platform is installed in /export/home/wsch/6.4_2009_01_29/... so the path for libndbclient.so is ok. The other libraries are system libraries.

This is SunOS pelton1 5.10 Generic_137111-08 sun4v sparc SUNW,Netra-T2000.

Best regards
Guido
[16 Mar 2009 16:16]
Maitrayi Sabaratnam
Hi,

I still think that the ndbclient library being linked is outdated (the ABI might have changed afterwards). There are two ways to verify this hypothesis:

1) Run 'make install' to update the library found in /export/home/wsch/6.4_2009_01_29/...
2) Explicitly link the current library (from your 11th of March version), either by
- running the shell script flexAsynch from ndbapi (this sets the correct LD_LIBRARY_PATH before calling flexAsynch), or
- setting LD_LIBRARY_PATH to your current version's storage/ndb/src/.libs
[23 Mar 2009 14:37]
Maitrayi Sabaratnam
Need feedback for my comments from 17th of March.
[23 Apr 2009 7:51]
Guido Ostkamp
I have repeated the test with the current version, revision-id: pekka@mysql.com-20090417190212-yifsmutw0fef59qc dated Fri 2009-04-17 22:02:12 +0300, after switching to the new branch mysql-cluster-7.0.

This time I used ~/mysql_3rd/storage/ndb/test/ndbapi/flexAsynch (the shell script), which automatically sets LD_LIBRARY_PATH (as you suggested). The effect is still the same:

$ cd /export/home/ostkamp/mysql_3rd/storage/ndb/test/ndbapi
$ /flexAsynch -t 2
Segmentation Fault (core dumped)

dbx .libs/flexAsynch /TspCore/core.flexAsynch.22988.1240472997
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
Reading flexAsynch
core file header read successfully
Reading ld.so.1
Reading libmtmalloc.so.1
Reading libndbclient.so.4.0.0
Reading libpthread.so.1
Reading libthread.so.1
Reading librt.so.1
Reading libgen.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading libm.so.2
Reading libCstd.so.1
Reading libCrun.so.1
Reading libc.so.1
Reading libaio.so.1
Reading libmd.so.1
Reading libc_psr.so.1
t@1 (l@1) program terminated by signal SEGV (no mapping at the fault address)
Current function is NdbOut::endline (optimized)
   72   m_out->println("");
(dbx) where
current thread: t@1
=>[1] NdbOut::endline(this = ???) (optimized), at 0xffffffff7f2fbe80 (line ~72) in "NdbOut.cpp"
  [2] NdbOut::operator<<(this = ???, _f = ???) (optimized), at 0x100009b7c (line ~98) in "NdbOut.hpp"
  [3] main(argc = ???, argv = ???) (optimized), at 0x100004e3c (line ~195) in "flexAsynch.cpp"
(dbx) quit

Please let me know if you need additional information.

Regards
Guido Ostkamp
[4 Sep 2009 12:41]
Jørgen Austvik
Also seen elsewhere: if you use the NDB API to connect to a wrongly configured cluster, you can get a core dump instead of an error message.

Code that connects to the cluster:

---------8<------------------8<------------------8<------------------8<---------
ndb_init();
vector<Ndb_cluster_connection *> connections;
for (long i = 0; i < threads; i++) {
  cout << "Connecting thread " << i << "..." << endl;
  Ndb_cluster_connection *conn = new Ndb_cluster_connection(connectString.c_str());
  if (conn->connect(4, 5, 1)) {
    cout << "Unable to connect to cluster within 30 secs." << endl;
    exit(-1);
  }
  cout << "We have a connection, wait until cluster ready..." << endl;
  // Optionally connect and wait for the storage nodes (ndbd's)
  if (conn->wait_until_ready(30, 0) < 0) {
    std::cout << "Cluster was not ready within 30 secs.\n";
    exit(-1);
  }
  connections.push_back(conn);
}
---------8<------------------8<------------------8<------------------8<---------

With an OK configuration this works fine, but on configuration errors, like "Configuration error: Error : Could not alloc node id at localhost port 1186: No free node id found for mysqld(API)", my NDB API client application core dumps:

---------8<------------------8<------------------8<------------------8<---------
Current function is NdbOut::operator<<
   61   NdbOut::operator<<(const char* val){ m_out->print("%s", val ? val : "(null)"); return * this; }
(dbx) print val
val = 0x80a6aa0 "Configuration error: Error : Could not alloc node id at localhost port 1186: No free node id found for mysqld(API)."
(dbx) print m_out
m_out = (nil)
(dbx) where
current thread: t@1
=>[1] NdbOut::operator<<(this = 0xfef37ec4, val = 0x80a6aa0 "Configuration error: Error : Could not alloc node id at localhost port 1186: No free node id found for mysqld(API)."), line 61 in "NdbOut.cpp"
  [2] Ndb_cluster_connection_impl::connect(this = 0x80a6220, no_retries = 4, retry_delay_in_seconds = 5, verbose = 1), line 769 in "ndb_cluster_connection.cpp"
  [3] Ndb_cluster_connection::connect(this = 0x80b81f8, no_retries = 4, retry_delay_in_seconds = 5, verbose = 1), line 780 in "ndb_cluster_connection.cpp"
  [4] main(0xa, 0x8047494, 0x80474c0, 0x8053e48), at 0x8055403
---------8<------------------8<------------------8<------------------8<---------
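The dbx session above shows m_out being nil when operator<< dereferences it. The following is only a minimal sketch of that failure mode and of one defensive alternative (a null check with a stderr fallback). It is not the fix committed for this bug, which initializes ndbout instead. SketchStream and SketchOut are hypothetical stand-ins, not the real NDB classes.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for an output stream; collects text in memory.
class SketchStream {
public:
    void print(const char* s) { buffer_ += s; }
    const std::string& contents() const { return buffer_; }
private:
    std::string buffer_;
};

// Hypothetical wrapper with the same shape as the crashing code:
// it forwards output through a raw pointer that may not be set yet.
class SketchOut {
public:
    SketchOut() : m_out(nullptr) {}  // nothing attached until attach() runs

    void attach(SketchStream* s) {
        assert(s != nullptr);        // attaching a null stream is a caller bug
        m_out = s;
    }

    bool ready() const { return m_out != nullptr; }

    // An unguarded version would call m_out->print(...) directly and
    // fault when m_out is null, as in the backtrace above.
    SketchOut& operator<<(const char* val) {
        const char* text = val ? val : "(null)";
        if (m_out != nullptr) {
            m_out->print(text);
        } else {
            std::fputs(text, stderr);  // fallback so the message is not lost
        }
        return *this;
    }

private:
    SketchStream* m_out;
};
```

With this shape, writing an error message before any stream has been attached ends up on stderr instead of faulting. Whether silently falling back is preferable to failing loudly is a design choice; initializing the stream up front, as the committed patch does, avoids the question.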
[8 Sep 2009 13:00]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/82685

2983 Jorgen Austvik 2009-09-08
bug#42789: initialize ndbout
[8 Sep 2009 13:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/82695

2987 Jorgen Austvik 2009-09-08
bug#42789: initialize ndbout
[8 Sep 2009 13:54]
Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jorgen.austvik@sun.com-20090908133858-dldnac6vc0fr6qiy) (version source revid:jorgen.austvik@sun.com-20090908133858-dldnac6vc0fr6qiy) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[8 Sep 2009 13:55]
Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jorgen.austvik@sun.com-20090908134812-7pnc0kkap433qbjy) (version source revid:jorgen.austvik@sun.com-20090908134812-7pnc0kkap433qbjy) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[8 Sep 2009 13:55]
Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:jorgen.austvik@sun.com-20090908135124-m74z86wuwaqsyzfi) (version source revid:jorgen.austvik@sun.com-20090908135124-m74z86wuwaqsyzfi) (merge vers: 5.1.35-ndb-7.1.0) (pib:11)
[8 Sep 2009 18:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/82725
[8 Sep 2009 18:20]
Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090908181903-js6r7i1yzxyaqu9k) (version source revid:magnus.blaudd@sun.com-20090908181903-js6r7i1yzxyaqu9k) (merge vers: 5.1.35-ndb-7.1.0) (pib:11)
[9 Sep 2009 13:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/82813

2992 Jonas Oreland 2009-09-09
ndb - bug#42789
reintroduce ndbouts_fileoutputstream allocated statically (but initialized by ndb_init) to avoid memory leak (as reported by valgrind)
also, while i'm at it, create a NdbOut_Init() so that ndb_init() doesn't have to be so contaminated with NdbOut internals
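The commit message above describes a stream object that is allocated statically but initialized from an init function, so that no heap allocation is left for valgrind to report. One common way to get that effect in C++ is placement new into static storage. This is a rough sketch of that pattern under assumed names (SketchFileStream, sketch_out_init, sketch_out_stream are all hypothetical). It does not reproduce the actual patch or the real NdbOut_Init().

```cpp
#include <cassert>
#include <cstdio>
#include <new>

// Hypothetical stream type; it only remembers which FILE it writes to.
class SketchFileStream {
public:
    explicit SketchFileStream(std::FILE* f) : file_(f) {}
    std::FILE* file() const { return file_; }
private:
    std::FILE* file_;
};

// Raw static storage, reserved when the program is loaded. No heap
// allocation is involved, so there is nothing to free at exit.
alignas(SketchFileStream)
static unsigned char g_stream_storage[sizeof(SketchFileStream)];

// Stays null until sketch_out_init() has run.
static SketchFileStream* g_stream = nullptr;

// Hypothetical init hook, meant to be called once from the library's
// init function. Calling it again is harmless.
void sketch_out_init() {
    if (g_stream == nullptr) {
        // Construct the object inside the static buffer (placement new).
        g_stream = new (g_stream_storage) SketchFileStream(stdout);
        assert(static_cast<void*>(g_stream) ==
               static_cast<void*>(g_stream_storage));
    }
}

// Accessor; returns null if sketch_out_init() has not been called.
SketchFileStream* sketch_out_stream() { return g_stream; }
```

The object is never destroyed in this sketch, which is acceptable only because its destructor is trivial. Keeping the construction behind a single init function also keeps the stream internals out of the caller, which is the separation the commit message asks for.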
[9 Sep 2009 13:20]
Jon Stephens
Test failure, no user-facing changes -> nothing to document; closed without taking further action.