| Bug #19914 | SELECT COUNT(*) sometimes returns MAX_INT on cluster tables | ||
|---|---|---|---|
| Submitted: | 18 May 2006 19:13 | Modified: | 2 Nov 2006 6:56 |
| Reporter: | Hartmut Holzgraefe | ||
| Status: | Closed | ||
| Category: | Server: Cluster | Severity: | S3 (Non-critical) |
| Version: | 4.1.14 | OS: | Linux (linux) |
| Assigned to: | Stewart Smith | Target Version: | |
[18 May 2006 19:13]
Hartmut Holzgraefe
[19 May 2006 8:23]
Jonas Oreland
tomas fixed a bug just like this only last week.... so this is likely a duplicate (if he pushed to 4.1.14) but i think there aslo was a case, where we had no other choise....
[19 May 2006 10:15]
Jonas Oreland
see http://bugs.mysql.com/bug.php?id=19202
[19 May 2006 11:03]
Hartmut Holzgraefe
In that patch i see additional retry functionality and the transaction finally being closed but no propagation of error messages? Would this help in the case that i have been able to observe where the MAX_INT result was returned only after multiples of 60s?
[23 May 2006 14:42]
Stewart Smith
I've chatted with Hartmut about this, said I would have a look at it while others were
moving office.
Spoke to Timour on IRC about how the optimizer handles COUNT(*) and interacts with the
::info(int) handler call (summary below):
<timour> in opt_sum_query() see this:
<timour> /*
<timour> If the storage manager of 'tl' gives exact row count, compute the total
<timour> number of rows. If there are no outer table dependencies, this count
<timour> may be used as the real count.
<timour> */
<timour> if (tl->table->file->table_flags() & HA_NOT_EXACT_COUNT)
<timour> {
<timour> is_exact_count= FALSE;
<timour> count= 1; // ensure count != 0
<timour> }
<timour> else
<timour> {
<timour> tl->table->file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK);
<timour> count*= tl->table->file->records;
<timour> }
<stewart> timour: so, on error on getting an exact count, we should perhaps set
HA_NOT_EXACT (and check for this again in optimiser)
<timour> can't you know that in advance?
<timour> or just always set HA_NOT_EXACT
<stewart> what will opt do if HA_NOT_EXACT?
<stewart> could be more expensive.
<stewart> we can easily get exact count from cluster, but it can fail.
<timour> Yes, it will be more expensive for sure.
<stewart> which means extra network hops :(
<timour> When is the earliest phase when cluster may know if it will succeed with getting
exact count?
<stewart> timour: basicall someway through the info call
<timour> stewart: this looks easy to fix in the optimizer, with a second test for
HA_NOT_EXACT_COUNT after the info() call.
<timour> Also grep for all uses of HA_NOT_EXACT_COUNT - there are few.
<stewart> timour: okay, i'll cook up a patch that theorectically solves the bug :)
<timour> ok
<timour> stewart: if possible, figure out some way to do this in the handler.
<stewart> timour: there's a patch that retries a bunch of times, but theoretically this
could fail too. if only after the retries fail we then resort to other methods (which
should have proper error propagating code) i think it should be okay performance wise (and
correctness)
<timour> stewart: then we need to change that particular optimization to check after the
info call one more time if the count is exact.
<timour> Still, without a test example we might be discussing the wrong functionality ...
<stewart> timour: yes, it may not fix their problem, but I think we're fixing a problem
anyway.
<timour> :-)
patch coming...
[23 May 2006 16:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/6770
[23 May 2006 16:07]
Stewart Smith
patch to simulate the error:
===== sql/ha_ndbcluster.cc 1.180 vs edited =====
--- 1.180/sql/ha_ndbcluster.cc 2006-05-16 00:23:54 +10:00
+++ edited/sql/ha_ndbcluster.cc 2006-05-23 23:53:44 +10:00
@@ -5077,11 +5080,15 @@
ndb_get_table_statistics(Ndb* ndb, const char * table,
Uint64* row_count, Uint64* commit_count)
{
+ static int count=0;
+ count++;
DBUG_ENTER("ndb_get_table_statistics");
DBUG_PRINT("enter", ("table: %s", table));
NdbConnection* pTrans= ndb->startTransaction();
do
{
+ if(count>=2)
+ break;
if (pTrans == NULL)
break;
With inserted error, without fix:
mysql> create table t1 (a int primary key auto_increment) engine=ndb; Query OK, 0 rows
affected (0.69 sec)
mysql> insert into t1 values (),(),(); Query OK, 3 rows affected (0.03 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select count(*) from t1;
+------------+
| count(*) |
+------------+
| 4294967295 |
+------------+
1 row in set (0.02 sec)
With inserted error and fix:
mysql> create table t1 (a int primary key auto_increment) engine=ndb; Query OK, 0 rows
affected (0.63 sec)
mysql> insert into t1 values (),(),(); Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
| 3 |
+----------+
1 row in set (0.02 sec)
Although I have not yet checked if the other methods that the optimizer will try can
correctly propagate errors. I would guess they can... as couldn't these be the places
where normal errors occur?
[28 Jun 2006 15:08]
Stewart Smith
discussing (waiting) on monty for best way to solve this in 5.1 so that we do it "properly" instead of the proposed fix for 4.1. IMHO the handler interface should support returning an error on ::records(). in the meantime, we wait.
[4 Jul 2006 3:03]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/8670
[10 Aug 2006 16:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10274 ChangeSet@1.2533, 2006-08-10 22:55:20+08:00, stewart@willster.(none) +21 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables allow handler::info to return an error code (that will be returned to the user)
[28 Sep 2006 15:52]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/12721 ChangeSet@1.2536, 2006-09-28 23:41:37+10:00, stewart@willster.(none) +10 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables post-review fixes as indicated by Serg. manual testing of error cases done in 5.0 due to support for DBUG_EXECUTE_IF to insert errors. Unable to write test case for mysql-test until 5.1 due to support for setting debug options at runtime.
[4 Oct 2006 9:13]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13055 ChangeSet@1.2291, 2006-10-04 17:13:11+10:00, stewart@willster.(none) +2 -0 BUG#19914 MAX_INT on select count(*) fix for federated
[5 Oct 2006 16:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13121 ChangeSet@1.2537, 2006-10-06 00:56:12+10:00, stewart@willster.(none) +30 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables allow handler::info to return an error code (that will be returned to the user) including post-review fixes as indicated by Serg. manual testing of error cases done in 5.0 due to support for DBUG_EXECUTE_IF to insert errors.
[5 Oct 2006 17:12]
Stewart Smith
awaiting 4.1-ndb, 5.0-ndb and 5.1-ndb to be easily mergable again. ETA is hopefully when we'll have this sorted out.
[16 Oct 2006 9:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13718 ChangeSet@1.2266, 2006-10-16 17:41:29+10:00, stewart@willster.(none) +2 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables update federated for fix
[16 Oct 2006 15:16]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13733 ChangeSet@1.2344, 2006-10-16 23:15:48+10:00, stewart@willster.(none) +2 -0 BUG#19914 select count(*) returns MAX_INT update partition engine for handler::info returning int
[18 Oct 2006 14:32]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13861 ChangeSet@1.2314, 2006-10-18 22:31:48+10:00, stewart@willster.(none) +2 -0 BUG#19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables update partition engine for handler::info returning int
[19 Oct 2006 16:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13987 ChangeSet@1.2532, 2006-10-20 00:10:44+10:00, stewart@willster.(none) +2 -0 post BUG#19914 patch fix to fix failing ndb_a* tests in autodiscover test, we "select * from t4" where t4 has just been dropped by ndb_drop_table (so no longer exists in cluster but does on local disk). The check for this (apart from just returning table doesn't exist on TC) is in ndb_err (which is called in all error cases in normal handler operations).
[23 Oct 2006 15:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/14187 ChangeSet@1.2538, 2006-10-23 23:46:35+10:00, stewart@willster.(none) +2 -0 Bug #19914 SELECT COUNT(*) sometimes returns MAX_INT on cluster tables fixes for ndb_* tests broken by previous fix be more careful in ndb about setting errors on failure of info call (especially in open)
[25 Oct 2006 13:26]
Stewart Smith
pushed to 4.1-ndb, 5.0-ndb and 5.1-ndb
[1 Nov 2006 15:25]
Jonas Oreland
pushed into 4.1.22
[1 Nov 2006 15:37]
Jonas Oreland
pushed into 5.0.29
[1 Nov 2006 15:52]
Jonas Oreland
pushed into 5.1.13
[2 Nov 2006 6:56]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of
that product and will be incorporated into the next release.
If necessary, you can access the source repository and build the latest available version,
including the bug fix. More information about accessing the source trees is available at
http://dev.mysql.com/doc/en/installing-source.html
Documented bugfix for 4.1.22/5.0.29/5.1.12.
[4 Nov 2006 4:12]
Jon Stephens
*Fix for 5.0 documented in 5.0.30 Release Notes.*
