Bug #39867 | MySQL Cluster : Failures during Blob part operations not always detected | ||
---|---|---|---|
Submitted: | 5 Oct 2008 23:19 | Modified: | 12 Nov 2008 12:10 |
Reporter: | Frazer Clement | Status: | Closed |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | mysql-5.1-telco-6.2.15 | OS: | Any |
Assigned to: | Frazer Clement | CPU Architecture: | Any |
[5 Oct 2008 23:19]
Frazer Clement
[5 Oct 2008 23:20]
Frazer Clement
Modified example to create blob part op failure due to Send exhaustion
Attachment: blobtest-1.pl (application/x-perl, text), 1.79 KiB.
[3 Nov 2008 15:41]
Frazer Clement
Proposed patch
Attachment: bug39867+39879.patch (text/x-patch), 192.77 KiB.
[3 Nov 2008 15:53]
Frazer Clement
Proposed patch which:
1) Improves ha_ndbcluster error handling to check the operation and transaction objects when readData() returns a bad rc.
2) Modifies LQH to send LQHKEYREF to TC (rather than TCKEYREF to the API) in the case where a simple read fails. This allows TC to implement AbortOnError behaviour for Simple reads. A Dirty (Committed) read still sends a direct TCKEYREF and cannot use AbortOnError.
3) Adds a test to MTR ndb_blob which attempts to overload the API connection with Blob reads and verifies that:
a) If no error is reported, the data is correct.
b) If an error is reported, it is the correct type.
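The acceptance rule in item 3 can be sketched as a small predicate. This is only an illustration of the rule as described in this comment, not code from the patch: the function name, the `acceptable_errors` set, and the string error labels are all hypothetical placeholders rather than real NDB error codes.

```python
def check_blob_read(expected, actual, error, acceptable_errors):
    """Return True when one Blob read attempt satisfies the test's rule.

    `error` is None when the read reported success; otherwise it is a label
    for the reported error (placeholder strings, not real NDB error codes).
    """
    if error is None:
        # a) No error reported: the data must be complete and correct.
        return actual == expected
    # b) An error was reported: it must be of the expected type.
    return error in acceptable_errors


# Assumed label for the error type the test expects under overload.
ACCEPTABLE = {"transporter_overload"}

blob = b"x" * 10_000
print(check_blob_read(blob, blob, None, ACCEPTABLE))                    # clean read
print(check_blob_read(blob, blob[:2000], None, ACCEPTABLE))             # silently truncated data
print(check_blob_read(blob, None, "transporter_overload", ACCEPTABLE))  # expected error type
print(check_blob_read(blob, None, "timeout", ACCEPTABLE))               # misreported error type
```

The second and fourth calls are the two failure shapes this bug is about: data that is wrong without any error being raised, and an error that is raised with the wrong type.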
[3 Nov 2008 16:17]
Frazer Clement
Note that the patch relies on the 'parent object accessors' fix to Bug#40242. Note also that this patch fixes Bug#39879.
[5 Nov 2008 9:54]
Frazer Clement
Patch with extra simple read tests re-enabled
Attachment: bug39867-extratests.patch (text/x-patch), 202.81 KiB.
[5 Nov 2008 10:17]
Frazer Clement
New patch with extra testcases. Existing Simple Read testcases re-enabled. New testcase for failing AbortOnError Simple Read followed by successful normal read. Before fix, this testcase results in Transaction Abort due to timeout. Afterwards it results in Transaction Abort due to Transporter overload. Error insert is necessary for this testcase as the bug was on the early 'noFreeRecord' error handling path rather than later error handling paths.
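The before/after behaviour described above can be illustrated with a toy model. It is not NDB kernel code: `ToyTC`, `run_transaction`, and the string labels are hypothetical, and the model assumes that nothing else aborts the transaction once the failing read has been issued.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTC:
    """Hypothetical stand-in for the transaction coordinator (TC)."""
    abort_reason: Optional[str] = None

    def on_lqhkeyref(self, reason, abort_on_error):
        # TC only learns about the failure when the REF is routed through it.
        if abort_on_error and self.abort_reason is None:
            self.abort_reason = reason

    def on_timeout(self):
        # If nothing aborted the transaction earlier, it eventually times out.
        if self.abort_reason is None:
            self.abort_reason = "timeout"


def run_transaction(via_tc):
    """Model a failing AbortOnError Simple Read followed by a normal read."""
    tc = ToyTC()
    if via_tc:
        # After the fix: the failure goes to TC as LQHKEYREF.
        tc.on_lqhkeyref("transporter_overload", abort_on_error=True)
    # Before the fix: the REF is sent directly towards the API, so TC is
    # never told about the failure and takes no action here.
    # The later normal read succeeds, so nothing else aborts the transaction.
    tc.on_timeout()
    return tc.abort_reason


print(run_transaction(via_tc=False))  # before the fix
print(run_transaction(via_tc=True))   # after the fix
```

In this sketch the first call ends in a timeout abort and the second in an overload abort, matching the two outcomes reported for the new testcase.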
[5 Nov 2008 19:52]
Jonas Oreland
One comment: 5047 is "self-cleaning" when encountered. If this runs on a 4-node cluster, there will be lingering 5047s set, so an insertErrorAllNodes(0) after the testcase is a good idea. After that, OK to push.
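The cleanup suggested here can be sketched with a toy model of per-node error inserts. `ToyCluster` and its methods are hypothetical and are not the NDB test framework API; the model only assumes what the comment states, namely that an insert is cleared on a node when that node encounters it and stays set on nodes that never do.

```python
class ToyCluster:
    """Hypothetical model of per-node error inserts (not the NDB test API)."""

    def __init__(self, node_count):
        self.inserts = {node: 0 for node in range(1, node_count + 1)}

    def insert_error_all_nodes(self, code):
        # Set (or, with code 0, clear) the error insert on every node.
        for node in self.inserts:
            self.inserts[node] = code

    def hit(self, node):
        # A "self-cleaning" insert is cleared on the node that encounters it.
        self.inserts[node] = 0

    def lingering(self):
        # Nodes that still have an error insert set.
        return sorted(n for n, code in self.inserts.items() if code != 0)


def run_testcase(cluster, code, nodes_hit):
    cluster.insert_error_all_nodes(code)
    try:
        for node in nodes_hit:
            cluster.hit(node)
    finally:
        # Clear whatever is left, even if the testcase body raised.
        cluster.insert_error_all_nodes(0)


cluster = ToyCluster(4)
cluster.insert_error_all_nodes(5047)
cluster.hit(1)
print(cluster.lingering())  # nodes still armed when no cleanup is done

run_testcase(cluster, 5047, nodes_hit=[1])
print(cluster.lingering())  # nothing left after the explicit clear
```

Putting the clear in a `finally` block is a design choice of this sketch, so that a failing testcase body does not leave inserts armed for later tests.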
[6 Nov 2008 17:38]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/58086
2719 Frazer Clement 2008-11-06
Bug#39867 MySQL Cluster : Failures during Blob part operations not always detected
2 parts:
1) The Ndb SQL handler (ha_ndbcluster) reported the error from the NdbBlob object rather than from the NdbTransaction object. This results in inconsistent error messages in some cases.
2) The NDB kernel bypassed the TC block when reporting primary key 'simple read' failure in some scenarios. This resulted in the API node not detecting operation failures in some scenarios, and eventual transaction timeouts.
Fixes:
Change NDB kernel to send LQHKEYREF to TC for early simple read failure. Direct send of TCKEYREF to API remains for 'dirty' read.
Change ha_ndbcluster to obtain error information from the NdbTransaction object rather than the Blob object.
Tests:
Re-enable simple read testing in testOperations and testTransactions.
Extend testing to include Simple Reads in testNdbApi.
Add Blob read transporter overload testcase to MTR test_blob testcase.
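The handler-side change can be sketched as a choice of which object's error gets reported. This is an illustration only: `ToyError`, `error_to_report`, the numeric codes, and the fallback order (transaction, then operation, then Blob) are assumptions made for the example and are not taken from the patch.

```python
from typing import NamedTuple


class ToyError(NamedTuple):
    """Hypothetical error record; codes below are placeholders, not NDB codes."""
    code: int
    message: str


NO_ERROR = ToyError(0, "no error")


def error_to_report(blob_err, op_err, trans_err):
    """Pick the error to report after a Blob read returns a bad rc.

    Assumed precedence for this sketch: transaction object first, then the
    operation object, then the Blob object as a last resort.
    """
    for err in (trans_err, op_err, blob_err):
        if err.code != 0:
            return err
    return NO_ERROR


# Placeholder values: the Blob object carries a generic failure while the
# transaction object carries the specific cause.
blob_err = ToyError(9001, "generic blob failure")
trans_err = ToyError(9002, "transporter overload")

print(error_to_report(blob_err, NO_ERROR, trans_err).message)
print(error_to_report(blob_err, NO_ERROR, NO_ERROR).message)
```

Reading the transaction object first means the same underlying failure produces the same message regardless of what state the Blob object happens to be in.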
[6 Nov 2008 21:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/58115
2722 Frazer Clement 2008-11-06
Bug#39867 MySQL Cluster : Failures during Blob part operations not always detected
Add new testcase to Daily Basic
[8 Nov 2008 20:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/58252
2724 Frazer Clement 2008-11-08
Bug#39867 MySQL Cluster : Failures during Blob part operations not always detected
2 parts:
1) The Ndb SQL handler (ha_ndbcluster) reported the error from the NdbBlob object rather than from the NdbTransaction object. This results in inconsistent error messages in some cases.
2) The NDB kernel bypassed the TC block when reporting primary key 'simple read' failure in some scenarios. This resulted in the API node not detecting operation failures in some scenarios, and eventual transaction timeouts.
Fixes:
Change NDB kernel to send LQHKEYREF to TC for early simple read failure. Direct send of TCKEYREF to API remains for 'dirty' read.
Change ha_ndbcluster to obtain error information from the NdbTransaction object rather than the Blob object.
Tests:
Re-enable simple read testing in testOperations and testTransactions.
Extend testing to include Simple Reads in testNdbApi.
Add Blob read transporter overload testcase to MTR test_blob testcase.
[8 Nov 2008 21:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/58255
2727 Frazer Clement 2008-11-08
Bug#39867 MySQL Cluster : Failures during Blob part operations not always detected
Add new testcase to Daily Basic
[8 Nov 2008 22:19]
Bugs System
Pushed into 5.1.29-ndb-6.4.0 (revid:frazer@mysql.com-20081108210806-iiu8s4ytv8gvbo98) (version source revid:frazer@mysql.com-20081108214303-z8nr2z5c1yccxac8) (pib:5)
[8 Nov 2008 22:43]
Bugs System
Pushed into 5.1.29-ndb-6.2.17 (revid:frazer@mysql.com-20081108210806-iiu8s4ytv8gvbo98) (version source revid:frazer@mysql.com-20081108210806-iiu8s4ytv8gvbo98) (pib:5)
[8 Nov 2008 22:45]
Bugs System
Pushed into 5.1.29-ndb-6.3.19 (revid:frazer@mysql.com-20081108210806-iiu8s4ytv8gvbo98) (version source revid:frazer@mysql.com-20081108212257-xppq7h6xmg3wduzp) (pib:5)
[12 Nov 2008 12:10]
Jon Stephens
Documented in the ndb-6.2.17 and ndb-6.3.19 changelogs as follows: Failed operations on BLOB and TEXT columns were not always reported correctly to the originating SQL node.
[12 Nov 2008 13:18]
Jon Stephens
Combined changelog entry with entry for Bug#39879, updated entry to read as follows: Failed operations on BLOB and TEXT columns were not always reported correctly to the originating SQL node. Such errors were sometimes reported as being due to timeouts, when the actual problem was a transporter overload due to insufficient buffer space.
[12 Dec 2008 23:28]
Bugs System
Pushed into 6.0.9-alpha (revid:frazer@mysql.com-20081108210806-iiu8s4ytv8gvbo98) (version source revid:tomas.ulin@sun.com-20081209185954-9svcixh2p5hsfi6w) (pib:5)