Bug #41478 | Falcon transactions see uncommited updates | ||
---|---|---|---|
Submitted: | 15 Dec 2008 16:30 | Modified: | 15 May 2009 13:39 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S3 (Non-critical) |
Version: | 6.0-falcon-team | OS: | Any |
Assigned to: | Christopher Powers | CPU Architecture: | Any |
Tags: | F_ISOLATION |
[15 Dec 2008 16:30]
Philip Stoev
[20 Dec 2008 14:02]
Philip Stoev
All that is needed is a concurrent workload that executes the following fully-formed, non-deadlocking rollbacked transaction: 17 Query START TRANSACTION 17 Query INSERT INTO `table10_falcon_int_autoinc` ( `pk` , `int_key` , `int` ) VALUES ( NULL , 123 , 234 ) 17 Query DELETE FROM `table10_falcon_int_autoinc` WHERE `pk` = @@LAST_INSERT_ID 17 Query ROLLBACK A SELECT running concurrently with those queries will sometimes see the row being inserted even though the entire transaction is always rolled back. This happens even with --falcon-consistent-read=1 and REPEATABLE READ transactions.
[26 Jan 2009 21:14]
Kevin Lewis
Kevin wrote, Philip, This seems to be working for me now. Is it working for you? perl runall.pl \ --basedir=/build/bzr/6.0-falcon-team \ --engine=Falcon \ --gendata=conf/transactions.zz \ --grammar=conf/transactions.yy \ --validator=DatabaseConsistency \ --queries=100000 Philip wrote; Kevin, After Hakan's latest merge, this test no longer fails in the way described and this bug is no longer reproducible. I have no idea why.
[29 Jan 2009 15:54]
Philip Stoev
Grammar file for bug 41478
Attachment: bug41478.yy (application/octet-stream, text), 1.47 KiB.
[29 Jan 2009 15:54]
Philip Stoev
To reproduce: $ perl runall.pl \ --basedir=/build/bzr/6.0-falcon-team \ --engine=Falcon \ --grammar=conf/bug41478.yy \ --gendata=conf/transactions.zz \ --validator=DatabaseConsistency \ --mysqld=--falcon-consistent-read=1 \ --queries=100000
[30 Jan 2009 5:27]
Kevin Lewis
I was finally able to reproduce this problem; # 23:22:29 Query: SELECT `int_key` FROM `table10_falcon_int_autoinc` WHERE 1 = 1 returns different result when executed with additional predicate 'AND `pk` > -16777216' (21 vs. 22 rows). # 23:22:29 Query: SELECT `pk` FROM `table10_falcon_int_autoinc` WHERE 1 = 1 returns different result when executed with additional predicate 'AND `pk` > -16777216' (35 vs. 36 rows). # 23:22:29 Query: SELECT `int` FROM `table10_falcon_int_autoinc` WHERE 1 = 1 returns different result when executed with additional predicate 'AND `pk ` > -16777216' (28 vs. 29 rows). # 23:22:29 Query: SELECT `int` FROM `table10_falcon_int_autoinc` WHERE 1 = 1 returns different result when executed with additional predicate 'AND `pk ` > -16777216' (23 vs. 24 rows).
[30 Jan 2009 8:52]
Philip Stoev
Kevin, I think your last comment pertains to bug #41742
[17 Mar 2009 9:45]
Philip Stoev
To reproduce: $ perl runall.pl \ --basedir=/build/bzr/6.0-falcon-team \ --engine=Falcon \ --grammar=conf/bug41478.yy \ --gendata=conf/transactions.zz \ --validator=DatabaseConsistency \ --mysqld=--falcon-consistent-read=1 \ --queries=100000 \ --mem \ --mysqld=--log-output=file \ --threads=3 This will result in an error such as: # 11:42:26 Bad average for table: table10_falcon_int_autoinc; average1: 191.8182; average2: 191.8182; count: 11; affected_rows: 0E0; query: START TRANSACTION Note the count: 11 . This means that the table contained 11 rows, even though the SQL grammar will never commit a new row. The table should have 10 rows only - all inserted rows are immediately deleted and should never be visible to other connections.
[19 Mar 2009 19:27]
Christopher Powers
Debugged by Jim Starkey: Found it with Philips Database Colonoscopy Program (PDC). To trap the problem, I added the following code to StorageTable::next(): int ret = storageDatabase->nextRow(this, recordNumber, lockForUpdate); if (ret >= 0) { if (historyIndex >= recordHistorySize) Log::debug("history overflow\n"); else { RecordHistory *history = recordHistory + historyIndex++; history->record = record; history->recordNumber = record->recordNumber; history->transaction = record->getTransaction(); history->transactionId = record->getTransactionId(); } } 15 seconds later I had a snapshot of the bug. The problem is in Tranasction::removeRecord called from RecordVersion::scavengeSavepoint. Transaction::removeRecord zaps the transaction pointer while the record is still visible to other threads, making it appear to be fully committed.
[19 Mar 2009 19:27]
Christopher Powers
Per Jim: The fix is to remove the line record->transaction = NULL; from Transaction::removeRecordNoLock(RecordVersion *record).
[19 Mar 2009 19:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/69857 3067 Christopher Powers 2009-03-19 Bug #41478, "Falcon transactions see uncommited updates" When removing a record from a transaction, leave RecordVersion::transaction to ensure that the record does not appear to be fully committed and therefore visible to other transactions.
[21 Mar 2009 14:07]
John Embretsen
For the record, this (or a similar) issue has also been observed (only once so far) during transactional mysql-stress-test test suite "stress_tx_rr", using revision hky@sun.com-20090310135113-i36spde1f6duc5xn (mysql-6.0-falcon-team as of March 10). An uncommitted (later rolled back) record was seen by a different REPEATABLE-READ transaction during a test run with 50 concurrent threads on a 4-CPU machine. This was a transient occurrence which did not repeat later in the test run. Seen in one of three 2-hour test runs so far. START TRANSACTION; SELECT * FROM t1 WHERE `is_uncommitted` > 0; pk id int1 int1_key int1_unique int2 int2_key int2_unique for_update timestamp connection_id thread_id is_uncommitted is_consistent +514 50 50 1050 514 -1050 -1000 -514 0 2009-03-21 12:53:48 10112 0 1 0 COMMIT;
[23 Mar 2009 20:04]
Philip Stoev
Jim's fix that Chris committed caused an exception when trying to create falcon tablespaces on startup: 090319 18:34:18 [Note] Falcon: unable to open system data files. 090319 18:34:18 [Note] Falcon: creating new system data files. 090319 18:34:19 [ERROR] Falcon: create database failed : update conflict in table SYSTEM.TABLES record 4 prior open database failure was: can't open file "/tmp/datadir/falcon_master.fts": No such file or directory (2) 090319 18:34:19 [ERROR] Plugin 'Falcon' init function returned error. 090319 18:34:19 [ERROR] Plugin 'Falcon' registration as a STORAGE ENGINE failed. > mysqld.exe!Table::validateAndInsert(Transaction * transaction=0x03ef87d0, RecordVersion * record=0x0661b350) Line 3371 C++ mysqld.exe!Table::update(Transaction * transaction=0x03ef87d0, Record * oldRecord=0x06612600, int numberFields=1, Nfs::Field * * updateFields=0x03ee0b18, Value * * values=0x03ee2840) Line 1381 C++ mysqld.exe!NUpdate::evalStatement(Nfs::Statement * statement=0x03ed40d0) Line 144 C++ mysqld.exe!Nfs::Statement::start(NNode * node=0x03ee25e8) Line 492 + 0x11 C++ mysqld.exe!PreparedStatement::executeUpdate() Line 87 C++ mysqld.exe!Table::reformat() Line 477 + 0xd C++ mysqld.exe!Nfs::Statement::upgradeTable(Syntax * syntax=0x03ef63e8) Line 1382 C++ mysqld.exe!Nfs::Statement::executeDDL() Line 745 C++ mysqld.exe!Nfs::Statement::execute(const char * sqlString=0x00d4aaa0, bool isQuery=false) Line 2727 C++ mysqld.exe!Nfs::Statement::execute(const char * sqlString=0x00d4aaa0) Line 163 C++ mysqld.exe!Database::upgradeSystemTables() Line 2079 + 0x13 C++ mysqld.exe!Database::createDatabase(const char * filename=0x0012de08) Line 703 C++ mysqld.exe!Connection::createDatabase(const char * dbName=0x03ed026c, const char * fileName=0x03ed029c, const char * account=0x00d45d2c, const char * password=0x00d45d24, Threads * threads=0x03ed02c8) Line 1063 + 0x1a C++ mysqld.exe!StorageDatabase::createDatabase() Line 163 C++ mysqld.exe!StorageHandler::createDatabase() Line 1027 C++ mysqld.exe!StorageHandler::initialize() Line 1004 + 0x8 C++ mysqld.exe!StorageInterface::falcon_init(void * p=0x02719368) Line 262 + 0xb C++ mysqld.exe!ha_initialize_handlerton(st_plugin_int * plugin=0x01631248) Line 448 + 0x1b C++ mysqld.exe!plugin_initialize(st_plugin_int * plugin=0x01631248) Line 1008 + 0x15 C++ mysqld.exe!plugin_init(int * argc=0x00f95cbc, char * * argv=0x015a85c0, int flags=2) Line 1220 + 0x9 C++ mysqld.exe!init_server_components() Line 4078 + 0x2f C++ mysqld.exe!win_main(int argc=31, char * * argv=0x015a3a30) Line 4589 + 0x5 C++ mysqld.exe!mysql_service(void * p=0x00000000) Line 4752 + 0x12 C++ mysqld.exe!main(int argc=31, char * * argv=0x015a3a30) Line 4925 + 0x7 C++ mysqld.exe!mainCRTStartup() Line 259 + 0x19 C
[23 Mar 2009 20:07]
John Embretsen
Also for the record: The proposed fix did not work. It resulted in an error on startup, so it was reverted. That is why this bug is still In progress.
[30 Mar 2009 21:42]
Kevin Lewis
Philip reported that this problem cannot be repeated after the changes to create a TransactionState object and the addition of the CycleManager to protect doomed records until after all temporary stack pointers to them have gone away.
[15 May 2009 13:39]
MC Brown
A note have been added to the 6.0.11 changelog: The Falcon CycleManager has been updated, which addresses a number of issues when examining records in various transaction states and their visisbility/isolation in relation to other threads.