Bug #19914 | SELECT COUNT(*) sometimes returns MAX_INT on cluster tables | ||
---|---|---|---|
Submitted: | 18 May 2006 17:13 | Modified: | 2 Nov 2006 5:56 |
Reporter: | Hartmut Holzgraefe | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | 4.1.14 | OS: | Linux (linux) |
Assigned to: | Stewart Smith | CPU Architecture: | Any |
[18 May 2006 17:13]
Hartmut Holzgraefe
[19 May 2006 6:23]
Jonas Oreland
tomas fixed a bug just like this only last week.... so this is likely a duplicate (if he pushed to 4.1.14) but i think there aslo was a case, where we had no other choise....
[19 May 2006 8:15]
Jonas Oreland
see http://bugs.mysql.com/bug.php?id=19202
[19 May 2006 9:03]
Hartmut Holzgraefe
In that patch i see additional retry functionality and the transaction finally being closed but no propagation of error messages? Would this help in the case that i have been able to observe where the MAX_INT result was returned only after multiples of 60s?
[23 May 2006 12:42]
Stewart Smith
I've chatted with Hartmut about this, said I would have a look at it while others were moving office. Spoke to Timour on IRC about how the optimizer handles COUNT(*) and interacts with the ::info(int) handler call (summary below): <timour> in opt_sum_query() see this: <timour> /* <timour> If the storage manager of 'tl' gives exact row count, compute the total <timour> number of rows. If there are no outer table dependencies, this count <timour> may be used as the real count. <timour> */ <timour> if (tl->table->file->table_flags() & HA_NOT_EXACT_COUNT) <timour> { <timour> is_exact_count= FALSE; <timour> count= 1; // ensure count != 0 <timour> } <timour> else <timour> { <timour> tl->table->file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK); <timour> count*= tl->table->file->records; <timour> } <stewart> timour: so, on error on getting an exact count, we should perhaps set HA_NOT_EXACT (and check for this again in optimiser) <timour> can't you know that in advance? <timour> or just always set HA_NOT_EXACT <stewart> what will opt do if HA_NOT_EXACT? <stewart> could be more expensive. <stewart> we can easily get exact count from cluster, but it can fail. <timour> Yes, it will be more expensive for sure. <stewart> which means extra network hops :( <timour> When is the earliest phase when cluster may know if it will succeed with getting exact count? <stewart> timour: basicall someway through the info call <timour> stewart: this looks easy to fix in the optimizer, with a second test for HA_NOT_EXACT_COUNT after the info() call. <timour> Also grep for all uses of HA_NOT_EXACT_COUNT - there are few. <stewart> timour: okay, i'll cook up a patch that theorectically solves the bug :) <timour> ok <timour> stewart: if possible, figure out some way to do this in the handler. <stewart> timour: there's a patch that retries a bunch of times, but theoretically this could fail too. if only after the retries fail we then resort to other methods (which should have proper error propagating code) i think it should be okay performance wise (and correctness) <timour> stewart: then we need to change that particular optimization to check after the info call one more time if the count is exact. <timour> Still, without a test example we might be discussing the wrong functionality ... <stewart> timour: yes, it may not fix their problem, but I think we're fixing a problem anyway. <timour> :-) patch coming...
[23 May 2006 14:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/6770
[23 May 2006 14:07]
Stewart Smith
patch to simulate the error: ===== sql/ha_ndbcluster.cc 1.180 vs edited ===== --- 1.180/sql/ha_ndbcluster.cc 2006-05-16 00:23:54 +10:00 +++ edited/sql/ha_ndbcluster.cc 2006-05-23 23:53:44 +10:00 @@ -5077,11 +5080,15 @@ ndb_get_table_statistics(Ndb* ndb, const char * table, Uint64* row_count, Uint64* commit_count) { + static int count=0; + count++; DBUG_ENTER("ndb_get_table_statistics"); DBUG_PRINT("enter", ("table: %s", table)); NdbConnection* pTrans= ndb->startTransaction(); do { + if(count>=2) + break; if (pTrans == NULL) break; With inserted error, without fix: mysql> create table t1 (a int primary key auto_increment) engine=ndb; Query OK, 0 rows affected (0.69 sec) mysql> insert into t1 values (),(),(); Query OK, 3 rows affected (0.03 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> select count(*) from t1; +------------+ | count(*) | +------------+ | 4294967295 | +------------+ 1 row in set (0.02 sec) With inserted error and fix: mysql> create table t1 (a int primary key auto_increment) engine=ndb; Query OK, 0 rows affected (0.63 sec) mysql> insert into t1 values (),(),(); Query OK, 3 rows affected (0.01 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> select count(*) from t1; +----------+ | count(*) | +----------+ | 3 | +----------+ 1 row in set (0.02 sec) Although I have not yet checked if the other methods that the optimizer will try can correctly propagate errors. I would guess they can... as couldn't these be the places where normal errors occur?
[28 Jun 2006 13:08]
Stewart Smith
discussing (waiting) on monty for best way to solve this in 5.1 so that we do it "properly" instead of the proposed fix for 4.1. IMHO the handler interface should support returning an error on ::records(). in the meantime, we wait.
[4 Jul 2006 1:03]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/8670
[10 Aug 2006 14:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10274 ChangeSet@1.2533, 2006-08-10 22:55:20+08:00, stewart@willster.(none) +21 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables allow handler::info to return an error code (that will be returned to the user)
[28 Sep 2006 13:52]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/12721 ChangeSet@1.2536, 2006-09-28 23:41:37+10:00, stewart@willster.(none) +10 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables post-review fixes as indicated by Serg. manual testing of error cases done in 5.0 due to support for DBUG_EXECUTE_IF to insert errors. Unable to write test case for mysql-test until 5.1 due to support for setting debug options at runtime.
[4 Oct 2006 7:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13055 ChangeSet@1.2291, 2006-10-04 17:13:11+10:00, stewart@willster.(none) +2 -0 BUG#19914 MAX_INT on select count(*) fix for federated
[5 Oct 2006 14:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13121 ChangeSet@1.2537, 2006-10-06 00:56:12+10:00, stewart@willster.(none) +30 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables allow handler::info to return an error code (that will be returned to the user) including post-review fixes as indicated by Serg. manual testing of error cases done in 5.0 due to support for DBUG_EXECUTE_IF to insert errors.
[5 Oct 2006 15:12]
Stewart Smith
awaiting 4.1-ndb, 5.0-ndb and 5.1-ndb to be easily mergable again. ETA is hopefully when we'll have this sorted out.
[16 Oct 2006 7:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13718 ChangeSet@1.2266, 2006-10-16 17:41:29+10:00, stewart@willster.(none) +2 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables update federated for fix
[16 Oct 2006 13:16]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13733 ChangeSet@1.2344, 2006-10-16 23:15:48+10:00, stewart@willster.(none) +2 -0 BUG#19914 select count(*) returns MAX_INT update partition engine for handler::info returning int
[18 Oct 2006 12:32]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13861 ChangeSet@1.2314, 2006-10-18 22:31:48+10:00, stewart@willster.(none) +2 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables update partition engine for handler::info returning int
[19 Oct 2006 14:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13987 ChangeSet@1.2532, 2006-10-20 00:10:44+10:00, stewart@willster.(none) +2 -0 post BUG#19914 patch fix to fix failing ndb_a* tests in autodiscover test, we "select * from t4" where t4 has just been dropped by ndb_drop_table (so no longer exists in cluster but does on local disk). The check for this (apart from just returning table doesn't exist on TC) is in ndb_err (which is called in all error cases in normal handler operations).
[23 Oct 2006 13:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/14187 ChangeSet@1.2538, 2006-10-23 23:46:35+10:00, stewart@willster.(none) +2 -0 Bug #19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables fixes for ndb_* tests broken by previous fix be more careful in ndb about setting errors on failure of info call (especially in open)
[25 Oct 2006 11:26]
Stewart Smith
pushed to 4.1-ndb, 5.0-ndb and 5.1-ndb
[1 Nov 2006 14:25]
Jonas Oreland
pushed into 4.1.22
[1 Nov 2006 14:37]
Jonas Oreland
pushed into 5.0.29
[1 Nov 2006 14:52]
Jonas Oreland
pushed into 5.1.13
[2 Nov 2006 5:56]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at http://dev.mysql.com/doc/en/installing-source.html Documented bugfix for 4.1.22/5.0.29/5.1.12.
[4 Nov 2006 3:12]
Jon Stephens
*Fix for 5.0 documented in 5.0.30 Release Notes.*