Bug #20442 NDB API: scans with exclusive lock and update of one record cause a deadlock
Submitted: 13 Jun 2006 21:52 Modified: 2 Feb 2007 4:07
Reporter: Anatoly Pidruchny (Candidate Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: NDB API Severity:S3 (Non-critical)
Version:5.0.22-max OS:Any (all)
Assigned to: Jon Stephens CPU Architecture:Any

[13 Jun 2006 21:52] Anatoly Pidruchny
Description:
When multiple processes or threads in parallel are doing a scan with exclusive lock, then update only one record from the subset cached in the client by nextResults(true), then the NdbIndexScanOperation::close() operation (if used), or NdbTransaction::close()/Ndb::closeTransaction() causes a deadlock. Technically, these operations are void and do not return an error to the application, but apparently they cause a deadlock in the server and they finish only after the TransactionDeadlockDetectionTimeout timer expires and the transaction coordinator aborts the transaction. Changing the value of the TransactionDeadlockDetectionTimeout parameter shows exactly that this timer matters. These deadlocks make it impossible for the application to do the job in real time as it supposed to do.

How to repeat:
Please see the attached test program. You will also find the attached config.ini file with the increased TransactionDeadlockDetectionTimeout parameter. The program was tested on a 64-bit Linux and on a HP-UX 11.11 systems. In both cases it reproduced the problem.

The program creates and populates the table mytablename in test_db_1 database, then creates two threads, passing them two different Ndb objects representing two different connections to the test_db_1 database. The threads in parallel repeat the following operations:

1. Start a transaction;
2. Define an NdbIndexScanOperation for the primary key of the mytablename table;
3. Call readTuples with NdbOperation::LM_Exclusive parameter to request exclusive locks to be used;
4. Define the attributes to be retrieved with getValues;
5. Execute the transaction;
6. Call NdbIndexScanOperation::nextResults(true);
7. Create a separate update transaction;
8. Create an update operation using NdbIndexScanOperation::updateCurrentTuple(updateTrans). This call requests the record to be taken over to the update transaction;
9. Set a value of an attribute in the update operation;
10. Executes the update operation;
11. Close the update transaction;
12. Close the NdbIndexScanOperation measuring the time;
13. Close the transaction used for the scan.

The program shows that the two threads are quickly deadlocked in the NdbIndexScanOperation::close() operation (step 12) and the operation takes more then the value of the TransactionDeadlockDetectionTimeout parameter to finish. The example output of the program is:

Populating the table
Thread: 1084578144 - starting scan
Thread: 1095068000 - starting scan
Thread: 1084578144 - updating record 0
Thread: 1084578144 - closing the scan
Thread: 1084578144 - the scan is closed in 0 ms
Thread: 1084578144 - starting scan
Thread: 1084578144 - updating record 0
Thread: 1084578144 - closing the scan
Thread: 1095068000 - updating record 1
Thread: 1095068000 - closing the scan
Thread: 1095068000 - the scan is closed in 11983 ms
Thread: 1095068000 - starting scan
Thread: 1084578144 - the scan is closed in 12001 ms
...
[13 Jun 2006 21:54] Anatoly Pidruchny
Test program

Attachment: scan_update1_deadlock.cpp (application/octet-stream, text), 7.67 KiB.

[13 Jun 2006 21:54] Anatoly Pidruchny
Cluster configuration file

Attachment: config.ini (application/octet-stream, text), 448 bytes.

[14 Jun 2006 1:05] Anatoly Pidruchny
The test program with added NdbScanOperation::SF_OrderBy parameter in readTuples.

Attachment: scan_update1_deadlock.cpp (application/octet-stream, text), 7.69 KiB.

[14 Jun 2006 1:12] Anatoly Pidruchny
Adding the NdbScanOperation::SF_OrderBy parameter into the NdbIndexScanOperation::readTuples made the problem even more evident. After this change the two threads in the test program deadlock even earlier in the NdbIndexScanOperation::nextResult(true). The nextResult operation fail and the NdbError object shows that the deadlock really happened. Now the output of the program looks like:

Linux:

Populating the table
Thread: 1084578144 - starting scan
Thread: 1095068000 - starting scan
Thread: 1084578144 - updating record 0
Thread: 1084578144 - closing the scan
Thread: 1084578144 - the scan is closed in 0 ms
Thread: 1084578144 - starting scan
Error in scan_update1_deadlock.cpp, line: 218, code: 274, msg: Time-out in NDB, probably caused by deadlock.

HP-UX:

Populating the table
Thread: 7 - starting scan
Thread: 8 - starting scan
Thread: 7 - updating record 0
Thread: 7 - closing the scan
Thread: 7 - the scan is closed in 1 ms
Thread: 7 - starting scan
Error in scan_update1_deadlock.cpp, line: 218, code: 274, msg: Time-out in NDB, probably caused by deadlock.
Error in scan_update1_deadlock.cpp, line: 218, code: 274, msg: Time-out in NDB, probably caused by deadlock.
[27 Jun 2006 15:01] Jonas Oreland
Hi,

thx for excellent bug report and test program!

1) The lock-wait-timeout is expected when adding the SF_OrderBy flag
* Each fragment is sorted (locally)
* ndbapi will perform merge sort with rows from all fragments
* two threads can easily deadlock if they obtain locks from different fragments

conclusion: 
* (exclusive) lock scan with SF_OrderBy is not something I would recommend
* I cant (atleast right now) think of any good way to avoid problem

2) scan close - lock-wait-timeout
* this is certainly unnecessary
  the reason that it happens is that ndbapi currently wait for all 
  outstanding fragment scan requests before closing the scan.

/Jonas
[27 Jun 2006 17:46] Anatoly Pidruchny
Hi, Jonas,

1) If the lock-wait-timeout problem with SF_OrderBy can not be avoided then it would be really nice to describe it briefly in the NDB API header files. A couple of sentences in the comments for the readTuples functions would be very helpful.

2) Do I understand you correctly, that you acknowledge the scan close - lock-wait-timeout problem and are going to work on the fix?

Thank you very much,

Anatoly.
[27 Jun 2006 18:14] Jonas Oreland
> 1) If the lock-wait-timeout problem with SF_OrderBy can not be avoided then it
> would be really nice to describe it briefly in the NDB API header files. A
> couple of sentences in the comments for the readTuples functions would be very
> helpful.

* that's a good idea.
* we've also briefly discussed adding an option which will sacrifice
  parallelism, but will reduce likelyhood of deadlock.
  This will however probably not be impl. in the immediate future... 

2) Do I understand you correctly, that you acknowledge the scan close -
lock-wait-timeout problem and are going to work on the fix?

yes, i expect to have a patch for you to test tomorrow...
[28 Jun 2006 9:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8385
[28 Jun 2006 9:31] Jonas Oreland
Hi,

I have now made a patch that fixes the problem.
The scan close time as reported by your program still varies a bit
  on my computer, but this is due to thread scheduling on my machine
  (2.6.12-gentoo-r10 #8 SMP AMD64 Dual core, libc-2.3.6)

And, they are building 5.0.23 today, so I'm afraid that this 
  (or the other patch) will not make that build...but next

Please test patch

/Jonas
[28 Jun 2006 10:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8389
[28 Jun 2006 13:17] Anatoly Pidruchny
Hi, Jonas,

I have just tested the fix. Yes, the deadlock does not happen any more. But what is strange is that the close time varies greatly:

Thread: 7 - the scan is closed in 1 ms
Thread: 7 - the scan is closed in 14 ms
Thread: 8 - the scan is closed in 1 ms
Thread: 8 - the scan is closed in 1 ms
Thread: 7 - the scan is closed in 28 ms
Thread: 8 - the scan is closed in 17 ms
Thread: 7 - the scan is closed in 1 ms
...

As you can see, the close time is either 1 ms or 14 ms or more and very rarely anything in between. I do not think it can be explained just by thread scheduling. It could be because of the adaptive send algorithm or something like that, even though in my test program the close() is called with forceSend set to true.

Bottom line, the fix is OK and should be accepted for the next version. But it would be great if you could find out why the close takes 14 ms or more so often. If you want, I can report another bug for this issue.

Thanks,

Anatoly.
[29 Jun 2006 9:52] Tomas Ulin
pushed to 5.1.12
[4 Jul 2006 11:40] Jonas Oreland
pushed into 5.0.24
[4 Jul 2006 13:21] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://www.mysql.com/doc/en/Installing_source_tree.html
[4 Jul 2006 13:22] Jon Stephens
Documented bugfix in 5.0.24/5.1.12 changelogs. Closed.
[4 Jul 2006 13:32] Jon Stephens
Anatoly, Thanks for a really fine bug report.
[6 Jul 2006 7:37] Jonas Oreland
Hi,

Very sorry to inform you that I have to revert current patch, as it breaks some error cases...

And I've come to unfortunate conclusion that to fix this, I need to make protocol changes...

So I'll revert the patch for now...
[6 Jul 2006 7:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8811
[6 Jul 2006 10:07] Jon Stephens
Commented out bugfix report in 5.0/5.1 changelogs pending further developments.
[1 Feb 2007 17:35] Jonas Oreland
Comment to be added to NdbIndexScanOperation (and docs)

When scanning with exclusive lock, extra care must be taken as if
  two threads performs this simultaniously over same range 
  likelyhood of deadlock is high.

  If also scan is also ordered,
    likelyhood of deadlock is even higher.

  Currently also NdbIndexScan::close() is affected by this deadlock, as
    all outstanding requests are waited for before actual scan in closed.
[1 Feb 2007 17:36] Jonas Oreland
Extra note: The implication on close() will hopefully be resolved 
  in a future release.
[2 Feb 2007 4:07] Jon Stephens
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

I've added cautionary notes to the listings for the NdbScanOperation::readTuples() and ::close() methods as well as a pointer in the scan operations overview section. http://lists.mysql.com/commits/19222