Bug #35750 ALTER ORDER BY causes data inconsistency
Submitted: 1 Apr 2008 18:00 Modified: 26 May 2010 17:49
Reporter: Philip Stoev Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0 OS:Any
Assigned to: Lars-Erik Bjørk CPU Architecture:Any
Tags: F_DDL

[1 Apr 2008 18:00] Philip Stoev
Description:
When executing a concurrent workload involving an occasional ALTER TABLE ORDER BY, at some point a record that was inserted by a SQL statement can not be found by the second statement in the same transaction.

How to repeat:
A test case will be attached shortly.
[1 Apr 2008 18:31] Philip Stoev
Test case for bug 35750

Attachment: bug35750.zip (application/x-zip-compressed, text), 1.47 KiB.

[1 Apr 2008 18:39] Philip Stoev
Basically we have a conccurrent scenario that does the following two things simultaneously:

1. Inserts a value into a base table, then selects it from there and inserts it into another table. Those actions trigger logging triggers. The second insert would sometimes fail, which is what this bug is about.
2. ALTER ORDER BY on the base tables involved.

To reproduce the bug, please unpack the archive and place the .txt files in mysql-test and the .test files in mysql-test/t. Then run:

$ perl ./mysql-test-run.pl --stress --stress-init-file=bug35750_init.txt \
--stress-test-file=bug35750_run.txt --stress-threads=5 --skip-ndb \
--mysql=--innodb --stress-test-duration=60 \
--mysqld=--falcon-consistent-read=off \
--mysqld=--innodb_lock_wait_timeout=1 \
--mysqld=--falcon_lock_wait_timeout=1

Let the test run to completion. Disregard any on-screen output. Then, in mysql-test/var/stress/, grep the error* files for "null" and you will see the offending error message:

STDERR:mysqltest: At line 9: query 'INSERT INTO inter2 (t1_uuid) SELECT (SELECT t1_uuid FROM inter1 WHERE t1_uuid = @uuid1)' failed: 1048: Column 'uuid' cannot be null

Please ignore any other messages in those log files, e.g. about deadlocks and timeouts. The message in question is produced by the insert trigger -- it tries to insert the value of t1_uuid1 into a non-null column in the log table. At the same time, t1_uuid should not be null since it was freshly inserted from the same transaction.

If the ALTER ORDER BY is commented out, this issue does not occur. The Falcon tables are not really needed, however using two innodb tables causes a different type of deadlock and this issue does not show up.

Please let me know if any other action is required on my part, e.g. simplifying the test case even further.
[7 Apr 2008 15:58] Heikki Tuuri
Calvin is our 6.0 release manager.

The problem here may be that consistent read SELECTs do not see the history information, since the table rebuild removes it (a known limitation of InnoDB). But this could also be a more basic bug.
[8 Apr 2008 20:30] Philip Stoev
This bug is also observed with in a Falcon-only setup, from 6.0-falcon-team tree.
[9 Apr 2008 8:55] Philip Stoev
I am sorry for the confusion, after reviewing the test case, it turns out that it is the *Falcon* table which fails to return the record that was just inserted.

I am moving the bug to the Falcon category and setting Kevin as lead.
[14 Apr 2008 14:05] Philip Stoev
Additional comments by Kevin:

"The alter table has to be done between the other transactions, and when it
runs there are no records to change.  And Heikki is correct that all old
record versions are lost when the alter occurs.  But Falcon will not let the
alter run while a transaction is active I don't think.  So I do not know how
the second insert can get separated or non-visible from the first."
[21 Mar 2009 15:35] Kevin Lewis
Philip, Sorry this has not been addressed for such a long time.  There has been some synchronization of DDL since this was opened.  Can you check if this is still a problem?
[21 Apr 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[30 Apr 2009 15:56] Philip Stoev
This bug is repeatable with its original test case. Since MTR has changed in the meantime, please do the following:

1. Change all mentions of Innodb to Falcon to avoid any association with innody.
2. Create empty .result files to prevent MTR from complaining:

touch r/bug35750_alter_order.result
touch r/bug35750_insert_replace.result
touch r/bug35750_init.result

3. export MTR_VERSION=1

After this, the issue will be reproducible with the original test case. As noted above, please ignore all other errors and look for a "cannot be null" messages in var/stress/error*.txt files.

I am not sure what level of transactional consistency should be preserved across an ALTER. There was a talk of both doing the ALTER between transactions and of invalidating existing transactions upon an ALTER.
[12 May 2009 20:29] Hakan Küçükyılmaz
Verified as described:

hakan@lu0011:~/work/mysql/mysql-6.0-falcon/mysql-test$ grep null var/stress/error*
var/stress/error3.txt:mysqltest: At line 9: query 'INSERT INTO inter2 (t1_uuid) SELECT (SELECT t1_uuid FROM inter1 WHERE t1_uuid = @uuid1)' failed: 1048: Column 'uuid' cannot be null
var/stress/error3.txt:Error 1048 Column 'uuid' cannot be null
[20 May 2009 17:23] Philip Stoev
The problem here is that the ALTER is still not performed in a way that will preserve transactional consistency. My understanding is that after the latest changes in the server, the ALTER will only run if there are no other transactions pending. The test case from this bug clearly shows that this is not the case -- a record disappears in the middle of a transaction when a concurrent ALTER is run. Instead, the ALTER should have either blocked or refused to run. An alternative is that the entire transaction is made invalid, so that the user is forced to repeat it.