Bug #30364 | Falcon: running falcon_concurrent_blob_updates leads to assertion | ||
---|---|---|---|
Submitted: | 10 Aug 2007 19:06 | Modified: | 16 Aug 2007 18:13 |
Reporter: | Hakan Küçükyılmaz | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S3 (Non-critical) |
Version: | 6.0.2 | OS: | Linux (64-bit) |
Assigned to: | Kevin Lewis | CPU Architecture: | Any |
[10 Aug 2007 19:06]
Hakan Küçükyılmaz
[11 Aug 2007 17:56]
MySQL Verification Team
Thank you for the bug report. 070811 14:53:18 [ERROR] /home/miguel/dbs/6.0/libexec/mysqld: Sort aborted Program received signal SIGILL, Illegal instruction. [Switching to Thread 1186838848 (LWP 16235)] 0x00002af8c8be8fdd in raise () from /lib64/libpthread.so.0 (gdb) bt full #0 0x00002af8c8be8fdd in raise () from /lib64/libpthread.so.0 No symbol table info available. #1 0x00000000009393c0 in Error::debugBreak () at Error.cpp:92 No locals. #2 0x00000000009394d4 in Error::error (string=0xd7a028 "assertion failed at line %d in file %s\n") at Error.cpp:69 buffer = "assertion failed at line 508 in file Record.cpp\n\000\221�F", '\0' <repeats 44 times>, "�\220�F\000\000\000\000�\220�F\000\000\000\000�\220�F\000\000\000\000�\220�F\000\000\000\000\b\221�F", '\0' <repeats 12 times>, "\020\221�F\000\000\000\000�\210\217", '\0' <repeats 29 times>, "\220�j��*", '\0' <repeats 42 times>, "0�\000��*\000\000\001", '\0' <repeats 14 times> args = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x46bd91d0, reg_save_area = 0x46bd9100}} #3 0x000000000093956b in Error::assertionFailed (fileName=0xd7f2e0 "Record.cpp", line=508) at Error.cpp:76 No locals. #4 0x0000000000970357 in Record::release (this=0x2af8c9f594e8) at Record.cpp:508 No locals. #5 0x00000000008e6419 in StorageDatabase::nextRow (this=0x2af8c9dfa1c8, storageTable=0x2af8c9c79788, recordNumber=894, lockForUpdate=true) at StorageDatabase.cpp:278 exception = (SQLException &) @0x2aaab000ea50: {_vptr.SQLException = 0xd83050} storageConnection = (class StorageConnection *) 0x2af8c9ed6070 connection = (class Connection *) 0x2af8c9ed6108 table = (class Table *) 0x2af8c9e52108 transaction = (class Transaction *) 0x2af8c9f55c10 candidate = (class Record *) 0x2af8c9f594e8 record = (class Record *) 0x0 #6 0x00000000008ebc21 in StorageTable::next (this=0x2af8c9c79788, recordNumber=0, lockForUpdate=true) at StorageTable.cpp:126 ret = 0 #7 0x00000000008df65e in StorageInterface::rnd_next (this=0x1d65eb8, buf=0x1d660f8 "�K*��") at ha_falcon.cpp:480 _db_func_ = 0xd3536f "mysql_update" _db_file_ = 0xd35146 "sql_update.cc" _db_level_ = 5 _db_framep_ = (char **) 0x46bd9340 #8 0x00000000007d0865 in rr_sequential (info=0x46bd9560) at records.cc:363 tmp = 0 #9 0x00000000007686c7 in mysql_update (thd=0x1fa1f38, table_list=0x203ee18, fields=@0x1fa3a40, values=@0x1fa3e18, conds=0x0, order_num=0, order=0x0, limit=18446744073709551615, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:558 using_limit = false ---Type <return> to continue, or q <return> to quit--- safe_update = false used_key_is_modified = false transactional_table = true will_batch = false can_compare_record = true res = 0 error = 0 loc_error = 1186830016 used_index = 64 dup_key_found = 0 need_sort = true want_privilege = 0 table_count = 1 updated = 0 found = 0 old_covering_keys = {map = 0} table = (TABLE *) 0x1f9d5f8 select = (SQL_SELECT *) 0x0 info = {table = 0x1f9d5f8, file = 0x1d65eb8, forms = 0x46bd9560, read_record = 0x7d07ea <rr_sequential(st_read_record*)>, thd = 0x1fa1f38, select = 0x0, cache_records = 0, ref_length = 4, struct_length = 0, reclength = 0, rec_cache_size = 0, error_offset = 0, index = 0, ref_pos = 0x0, record = 0x1d660f8 "�K*��", rec_buf = 0x0, cache = 0x0, cache_pos = 0x0, cache_end = 0x0, read_positions = 0x0, io_cache = 0x0, print_error = true, ignore_not_found_rows = false} select_lex = (SELECT_LEX *) 0x1fa3938 need_reopen = false id = 33174896 all_fields = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x1253380, last = 0x46bd9630, elements = 0}, <No data fields>} _db_func_ = 0xcdcb10 "mysql_execute_command" _db_file_ = 0xcdc4b0 "sql_parse.cc" _db_level_ = 4 _db_framep_ = (char **) 0x1fa2da0 __PRETTY_FUNCTION__ = "int mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, COND*, uint, ORDER*, ha_rows, enum_duplicates, bool)" #10 0x00000000006cbc90 in mysql_execute_command (thd=0x1fa1f38) at sql_parse.cc:2592 ---Type <return> to continue, or q <return> to quit--- res = false need_start_waiting = false up_result = 0 lex = (LEX *) 0x1fa34d0 select_lex = (SELECT_LEX *) 0x1fa3938 first_table = (TABLE_LIST *) 0x203ee18 all_tables = (TABLE_LIST *) 0x203ee18 unit = (SELECT_LEX_UNIT *) 0x1fa3570 _db_func_ = 0xcdd2a0 "mysql_parse" _db_file_ = 0xcdc4b0 "sql_parse.cc" _db_level_ = 3 _db_framep_ = (char **) 0x20034d0 __PRETTY_FUNCTION__ = "bool mysql_execute_command(THD*)" #11 0x00000000006d1e57 in mysql_parse (thd=0x1fa1f38, inBuf=0x203ed58 "UPDATE t1 SET c3 = md5(c3)", length=26, found_semicolon=0x46bdaee0) at sql_parse.cc:5381 lex = (LEX *) 0x1fa34d0 lip = {m_thd = 0x1fa1f38, yylineno = 1, yytoklen = 2, yylval = 0x46bd8fd0, ptr = 0x203ed73 '\217' <repeats 93 times>, "\2003%\001", tok_start = 0x203ed73 '\217' <repeats 93 times>, "\2003%\001", tok_end = 0x203ed73 '\217' <repeats 93 times>, "\2003%\001", end_of_query = 0x203ed72 "", tok_start_prev = 0x203ed72 "", buf = 0x203ed58 "UPDATE t1 SET c3 = md5(c3)", next_state = MY_LEX_END, found_semicolon = 0x0, ignore_space = false, stmt_prepare_mode = false} err = 0 _db_func_ = 0xcdd31b "dispatch_command" _db_file_ = 0xcdc4b0 "sql_parse.cc" _db_level_ = 2 _db_framep_ = (char **) 0x3 __PRETTY_FUNCTION__ = "void mysql_parse(THD*, const char*, uint, const char**)" #12 0x00000000006d2b91 in dispatch_command (command=COM_QUERY, thd=0x1fa1f38, packet=0x2aaab00008e9 "UPDATE t1 SET c3 = md5(c3)", packet_length=27) at sql_parse.cc:907 packet_end = 0x203ed72 "" format = 0xcdd32f "%.*b" found_semicolon = 0x0 net = (NET *) 0x1fa2008 error = false ---Type <return> to continue, or q <return> to quit--- _db_func_ = 0xddd1b4 "?func" _db_file_ = 0xddd1ba "?file" _db_level_ = 1 _db_framep_ = (char **) 0x46bdb000 #13 0x00000000006d3f38 in do_command (thd=0x1fa1f38) at sql_parse.cc:669 packet = 0x2aaab00008e8 "\003UPDATE t1 SET c3 = md5(c3)" packet_length = 27 net = (NET *) 0x1fa2008 command = COM_QUERY _db_func_ = 0xddd1b4 "?func" _db_file_ = 0xddd1ba "?file" _db_level_ = 1 _db_framep_ = (char **) 0x46bdb0a0 #14 0x00000000006c20e6 in handle_one_connection (arg=0x1fa1f38) at sql_connect.cc:1091 net = (NET *) 0x1fa2008 thd = (class THD *) 0x1fa1f38 launch_time = 0 #15 0x00002af8c8be109e in start_thread () from /lib64/libpthread.so.0 No symbol table info available. #16 0x00002af8c99764cd in clone () from /lib64/libc.so.6 No symbol table info available. #17 0x0000000000000000 in ?? () No symbol table info available. (gdb)
[14 Aug 2007 4:39]
Kevin Lewis
I was able to compile and execute falcon_concurrent_blob_updates on Windows by creating a Visual Studio solution and changing the .c file in several ways. Now I can reproduce this bug in Windows. The ASSERT happens after about 10 concurrent threads have started up. There are two call stacks. One is the release()at StorageDatabase.cpp:278 in the function StorageDatabase::nextRow. It is always associated with a call to fetchForUpdate which fails with an error code that the "Record has changed since last read in table". The SQL statement is an Update. The transaction is RepeatableRead. The ASSERT can also happen at the release() called from fetchForUpdate(), Table.cpp, line 3085. The state is 'WasActive' indicating that the thread just waited on another transaction. In both cases the ASSERT fails because the RecordVersion referenced has been released and the memory is no longer reliable. I am adding some code to gather more info and isolate the reason fro the premature record destruction.
[15 Aug 2007 3:40]
Kevin Lewis
This bug is about an ASSERT(useCount > 0); in Record::release(). This would happen consistently when running falcon_concurrent_blob_updates with at least 10 concurrent threads. I converted this test program to Windows and repeated it dozens of times while adding debug code to track the history of each Record object. A record was being released while there still was a candidate pointer to it in either Table::fetchForUpdate or one of the three functions that call it. It is only seen with a high level of concurrency. Table::fetchForUpdate is intended to release the refCount of the candidate sent in if NULL is returned. But nothing is released if an exception occurs. Normally, the candiate record reference would then be released in the catch() code of the caller. But if fetchForUpdate() waited on a transaction, it will release the original candidate and fetch a new one. If this new one is not the same record, and it causes a deadlock, then an exception is thrown and the caller will release the wrong record. The solution is to release before throwing an exception, and then not release in the catch{} code of the caller when fetchForUpdate was called. This test no longer hits the assert in Record::release(). But it does hit an assert that was hit several times in the past. This is; ASSERT(!bdb->syncObject.isLocked()); in Cache::fakePage() line 275. This was reported at one point in both Bug#30032 and Bug#28174. Each time, the bug morphed into some other problem and this ASSERT was lost sight of, or not repeated exactly. Now that many of the other problems have been fixed, I can reproduce this readily with 5 or more threads and less than 5 minutes run time. I recommend that we open a new bug to track this fakePage assert separately and not let the bug morph into something else.
[16 Aug 2007 18:13]
Kevin Lewis
I assume Hakan cannot reproduce this bug either, since he asks me to close the bug. The new bug associated with the fakePage assert is Bug#30463