Bug #36745 | falcon crash on solaris | ||
---|---|---|---|
Submitted: | 15 May 2008 20:52 | Modified: | 30 Sep 2008 19:31 |
Reporter: | Neel Nadgir | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 6.0.5 | OS: | Solaris |
Assigned to: | Olav Sandstå | CPU Architecture: | Any |
Tags: | SPARC 64bit |
[15 May 2008 20:52]
Neel Nadgir
[20 May 2008 13:48]
Sveta Smirnova
Thank you for the report. I cannot repeat the described behavior in my environment. Can you repeat the crash if you run mysqld as `./libexec/mysqld --bootstrap --basedir=. --datadir=./data --log-warnings=0 --loose-skip-innodb --loose-skip-ndbcluster --max_allowed_packet=8M --net_buffer_length=16K` without the help of the mysql_install_db script?
[20 May 2008 18:23]
Neel Nadgir
I tried as you suggested and still got the same crash (same callstack)
===
ted@frost ~/tools/mysql-6.0.5-falcon$ ./libexec/mysqld --bootstrap --basedir=. --datadir=/data/mysql-falcon --log-warnings=0 --loose-skip-innodb --loose-skip-ndbcluster --max_allowed_packet=8M --net_buffer_length=16K
080520 11:21:17 - mysqld got signal 10 ;
This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.
key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=0
max_threads=151
threads_connected=0
It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338287 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Fatal signal 6 while backtracing
===
[21 May 2008 16:36]
Kevin Lewis
Olav Sandstaa, please work with PAE to isolate this. It could be an endian issue.
[22 May 2008 22:18]
Neel Nadgir
It turns out that this is a memory alignment issue. I fixed the crash by changing the following:
1. Change the size of DeferredIndex::initialSpace from 500 to 512. This ensures that on 64-bit platforms it is aligned on 8-byte boundaries.
2. In SymbolManager.cpp, there is some magic to round up the pointer (e.g. line 127: `symbol = (Symbol*) ((IPTR)(next + 3) & ~3);`). I suggest that this (and similar occurrences) be changed to use the ROUNDUP() macro in Engine.h, i.e. `symbol = (Symbol*) ROUNDUP(next, sizeof (void *))`.
If this is going to be part of the on-disk format, we need to be more careful. I do not know whether it is a design goal of Falcon/MySQL that databases created with 64-bit MySQL are accessible via 32-bit MySQL.
[22 May 2008 23:07]
Olav Sandstå
Hi Neel, Thanks for looking into the cause of this problem and for providing a fix for it. I will work to get your fix checked in to the Falcon repository. Regards, Olav
[28 May 2008 20:39]
Jim Starkey
The issue in SymbolManager is real and should be fixed. The comment about the DeferredIndex initial allocation, however, is not. All blocks allocated from MemMgr are on 8-byte or greater boundaries. And neither SymbolManager nor DeferredIndex has anything to do with the ODS.
[28 May 2008 20:59]
Neel Nadgir
The issue with DeferredIndex::initialSpace is that currentHunkOffset is initialized to 500 (`currentHunkOffset = sizeof(initialSpace);` in DeferredIndex::initializeSpace). For example, if we allocate 8 bytes, DeferredIndex::alloc() will return base + (currentHunkOffset - 8), i.e. base + 500 - 8 = base + 492, which is not 8-byte aligned, hence the crash.
[4 Jun 2008 13:03]
Olav Sandstå
Just for the record: here is the same stack trace as reported by Neel but with symbol information:
Running: mysqld --bootstrap --basedir=. --datadir=./data --log-warnings=0 --loose-skip-innodb --loose-skip-ndbcluster --max_allowed_packet=8M --net_buffer_length=16K
(process id 13650)
Reading libc_psr.so.1
080604 12:50:25 [ERROR] Can't find messagefile '/home/os136802/mysql/develop/repo/mysql-6.0-falcon-sunstudio/share/mysql/english/errmsg.sys'
t@1 (l@1) signal BUS (invalid address alignment) in SymbolManager::getSymbol at line 144 in file "SymbolManager.cpp"
144 symbol->collision = hashTable [slot];
(dbx) where
current thread: t@1
=>[1] SymbolManager::getSymbol(this = 0x10119e1f0, string = 0x100b0c001 "SYSTEM"), line 144 in "SymbolManager.cpp"
[2] Database::getSymbol(this = 0x10101ae70, string = 0x100b0c001 "SYSTEM"), line 1908 in "Database.cpp"
[3] RoleModel::RoleModel(this = 0x1011a0620, db = 0x10101ae70), line 109 in "RoleModel.cpp"
[4] Database::start(this = 0x10101ae70), line 508 in "Database.cpp"
[5] Database::createDatabase(this = 0x10101ae70, filename = 0xffffffff7fffe088 "falcon_master.fts"), line 641 in "Database.cpp"
[6] Connection::createDatabase(this = 0x10121a410, dbName = 0x10121a144 "FALCON_MASTER", fileName = 0x10121a174 "falcon_master.fts", account = 0x100b11cd4 "mysql", password = 0x100b11cda "mysql", threads = 0x10121a1a0), line 1066 in "Connection.cpp"
[7] StorageDatabase::createDatabase(this = 0x101219fd8), line 158 in "StorageDatabase.cpp"
[8] StorageHandler::initialize(this = 0x10101a328), line 996 in "StorageHandler.cpp"
[9] StorageInterface::falcon_init(p = 0x101c38730), line 207 in "ha_falcon.cpp"
[10] ha_initialize_handlerton(plugin = 0x101c31b58), line 428 in "handler.cc"
[11] plugin_initialize(plugin = 0x101c31b58), line 1011 in "sql_plugin.cc"
[12] plugin_init(argc = 0x100d9dbe8, argv = 0x10141d828, flags = 2), line 1217 in "sql_plugin.cc"
[13] init_server_components(), line 3975 in "mysqld.cc"
[14] main(argc = 9, argv = 0xffffffff7ffff288), line 4407 in "mysqld.cc"
[6 Jun 2008 12:34]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/47530
2692 Olav Sandstaa 2008-06-06
Fix to one of the memory alignment issues in Bug#36745 falcon crash on solaris
The problem was a memory alignment issue in SymbolManager.cpp that caused a crash when running in 64 bit mode on Solaris on SPARC. Fixed this by ensuring the symbol objects are allocated at 8 byte boundaries when compiling for 64 bit systems.
[7 Jun 2008 18:09]
Olav Sandstå
Here is the call stack for the second BUS (invalid address alignment) error:
t@1 (l@1) signal BUS (invalid address alignment) in DeferredIndex::addNode at line 196 in file "DeferredIndex.cpp"
196 leaf->nodes[leaf->count++] = node;
(dbx) where
current thread: t@1
=>[1] DeferredIndex::addNode(this = 0x10122cde8, indexKey = 0xffffffff7fffa6c8, recordNumber = 0), line 196 in "DeferredIndex.cpp"
[2] Index::insert(this = 0x10121fa60, key = 0xffffffff7fffa6c8, recordNumber = 0, transaction = 0x101228810), line 212 in "Index.cpp"
[3] Index::insert(this = 0x10121fa60, record = 0x10495b000, transaction = 0x101228810), line 206 in "Index.cpp"
[4] Table::insertIndexes(this = 0x1011b9f60, transaction = 0x101228810, record = 0x10495b000), line 1218 in "Table.cpp"
[5] Table::insert(this = 0x1011b9f60, transaction = 0x101228810, count = 8, fieldVector = 0x10121d760, values = 0x10121d510), line 358 in "Table.cpp"
[6] NInsert::evalStatement(this = 0x10122c320, statement = 0x10121cfc8), line 143 in "NInsert.cpp"
[7] Nfs::Statement::start(this = 0x10121cfc8, node = 0x10122c320), line 487 in "Statement.cpp"
[8] PreparedStatement::executeUpdate(this = 0x10121cfc8), line 86 in "PreparedStatement.cpp"
[9] Table::save(this = 0x1011b9f60), line 258 in "Table.cpp"
[10] Database::createDatabase(this = 0x10101be20, filename = 0xffffffff7fffdf98 "falcon_master.fts"), line 658 in "Database.cpp"
[11] Connection::createDatabase(this = 0x10121b3c0, dbName = 0x10121b0f4 "FALCON_MASTER", fileName = 0x10121b124 "falcon_master.fts", account = 0x100afc26c "mysql", password = 0x100afc272 "mysql", threads = 0x10121b150), line 1066 in "Connection.cpp"
[12] StorageDatabase::createDatabase(this = 0x10121af88), line 158 in "StorageDatabase.cpp"
[13] StorageHandler::initialize(this = 0x10101b2d8), line 996 in "StorageHandler.cpp"
[14] StorageInterface::falcon_init(p = 0x101c396e0), line 207 in "ha_falcon.cpp"
[15] ha_initialize_handlerton(plugin = 0x101c32b08), line 428 in "handler.cc"
[16] plugin_initialize(plugin = 0x101c32b08), line 1011 in "sql_plugin.cc"
[17] plugin_init(argc = 0x100d9eb98, argv = 0x10141e7d8, flags = 2), line 1217 in "sql_plugin.cc"
[18] init_server_components(), line 3975 in "mysqld.cc"
[19] main(argc = 9, argv = 0xffffffff7ffff198), line 4407 in "mysqld.cc"
[8 Jun 2008 6:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/47579
2694 Olav Sandstaa 2008-06-08
Fix to the second of the memory alignment issues in Bug#36745 falcon crash on solaris
The problem that caused crashes on SPARC running in 64 bit mode was due to a memory alignment issue when "nodes" were allocated in the DeferredIndex::initialSpace. Memory is allocated in the initialSpace starting from the end of the memory area. Since the end of the initialSpace was not aligned on an address boundary, memory allocated from it was not aligned either, which caused the crash on Solaris when running in 64 bit mode on SPARC. Fixed this by ensuring that the size of the initialSpace is "8 byte memory aligned" (increased the size of it from 500 bytes to 512 bytes).
[30 Sep 2008 19:31]
Jon Stephens
Documented in the 6.0.6 changelog as follows: mysql_install_db from a Falcon-enabled build crashed on Solaris/SPARC.