Bug #40953 SELECT query throws "ERROR 1062 (23000): Duplicate entry..." error
Submitted: 23 Nov 2008 2:20 Modified: 28 Jan 2009 22:00
Reporter: Shawn Green Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.37,5.0.72,5.1.22, 6.0 OS:Any
Assigned to: Sergei Glukhov CPU Architecture:Any

[23 Nov 2008 2:20] Shawn Green
Description:
A SELECT query can throw a "Duplicate entry..." error

How to repeat:
Reproduction case:
drop table if exists Cres;
drop table if exists Acs;
drop table if exists AEs;

CREATE TABLE `Cres` (
`CreId` bigint(20) NOT NULL default '0',
`AId` int(11) NOT NULL default '0',
`AEId` bigint(20) NOT NULL default '0',
`VUrl` varchar(255) NOT NULL default '',
PRIMARY KEY (`CreId`),
KEY `VUrlPrefix` (`VUrl`),
KEY `AId` (`AId`)
) ENGINE=InnoDB;

CREATE TABLE `Acs` (
`AId` int(11) NOT NULL default '0',
`Pro` enum('A1','A2','A3') NOT NULL default 'A1',
`CuId` int(11) NOT NULL default '0',
PRIMARY KEY (`AId`),
KEY `CuId` (`CuId`)
) ENGINE=InnoDB;

CREATE TABLE `AEs` (
`AEId` bigint(20) NOT NULL default '0',
`LastModified` datetime NOT NULL default '1980-01-01 00:00:00',
PRIMARY KEY (`AEId`)
) ENGINE=InnoDB;

insert into Cres values (1, 1, 1, 'abc');
insert into Cres values (2, 1, 2, 'def');
insert into Cres values (3, 1, 2, 'def');
insert into Acs values (1, 'A1', 1);
insert into AEs values (1, '1980-01-01');

-- this query is OK, run it to see what the result set is prior to
aggregation
SELECT
a.CuId,
cr.VUrl
FROM Acs a
STRAIGHT_JOIN Cres cr ON cr.AId=a.AId
LEFT JOIN (
Cres cr2
JOIN AEs ae2 ON cr2.AEId=ae2.AEId
) ON a.AId=cr2.AId AND ae2.LastModified < now() - INTERVAL 7 DAY AND
cr.VUrl=cr2.VUrl
WHERE cr2.VUrl IS NULL
AND a.Pro='A1';

explain SELECT
a.CuId,
cr.VUrl,
count(*) count
FROM Acs a
STRAIGHT_JOIN Cres cr ON cr.AId=a.AId
LEFT JOIN (
Cres cr2
JOIN AEs ae2 ON cr2.AEId=ae2.AEId
) ON a.AId=cr2.AId AND ae2.LastModified < now() - INTERVAL 7 DAY AND
cr.VUrl=cr2.VUrl
WHERE cr2.VUrl IS NULL
AND a.Pro='A1'
GROUP BY a.CuId, cr.VUrl;

-- this query should get an error
SELECT
a.CuId,
cr.VUrl,
count(*) count
FROM Acs a
STRAIGHT_JOIN Cres cr ON cr.AId=a.AId
LEFT JOIN (
Cres cr2
JOIN AEs ae2 ON cr2.AEId=ae2.AEId
) ON a.AId=cr2.AId AND ae2.LastModified < now() - INTERVAL 7 DAY AND
cr.VUrl=cr2.VUrl
WHERE cr2.VUrl IS NULL
AND a.Pro='A1'
GROUP BY a.CuId, cr.VUrl;

Suggested fix:
[the original reporter] have tracked this back to make_join_statistics(). I think this is the root
cause. Code there can set table->maybe_null when join->outer_join is set and
one table is dependent on another table. But this dependent relationship
seems to be different than the dependency created by an outer join. For the
problem query, there are two instances of Cre (cr, cr2). cr2 should have
maybe_null set in the st_table instance. cr should not. But because of this
code and because cr is considered to be dependent on Accounts, the code sets
st_table::maybe_null for cr.

/*
Build transitive closure for relation 'to be dependent on'.
This will speed up the plan search for many cases with outer joins,
as well as allow us to catch illegal cross references/
Warshall's algorithm is used to build the transitive closure.
As we use bitmaps to represent the relation the complexity
of the algorithm is O((number of tables)^2).
*/
for (i= 0, s= stat ; i < table_count ; i++, s++)
{
for (uint j= 0 ; j < table_count ; j++)
{
table= stat[j].table;
if (s->dependent & table->map)
s->dependent |= table->reginfo.join_tab->dependent;
}
if (s->dependent)
s->table->tb_maybe_null= 1;
}

This is different than Bug report #31349 
http://bugs.mysql.com/bug.php?id=31349
That bug appears to match. However by the changelog entry, the fix for this bug was to change the function INET_NTOA() and not to fix the null-handling code for the query processor.  This bug isolates the underlying flaw in the null-handling code.
[25 Nov 2008 10:37] Sveta Smirnova
Fails with 5.0.67 for me and with shorter query:

SELECT a.CuId, cr.VUrl, count(*) count FROM Acs a STRAIGHT_JOIN Cres cr ON cr.AId=a.AId LEFT JOIN ( Cres cr2 JOIN AEs ae2 ON cr2.AEId=ae2.AEId ) ON a.AId=cr2.AId GROUP BY a.CuId, cr.VUrl;
[16 Dec 2008 16:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61783

2744 Sergey Glukhov	2008-12-16
      Bug#40953 SELECT query throws "ERROR 1062 (23000): Duplicate entry..." error
      There are two instances of Cre (cr, cr2). cr2 should have
      maybe_null set in the st_table instance, cr should not.
      Because cr is considered to be dependent on Accounts,
      the code sets st_table::maybe_null for cr into 'true'.
      The fix is to set st_table::maybe_null to 'true' only
      for those tables which are used in outer join.
[16 Dec 2008 17:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/61803

2744 Sergey Glukhov	2008-12-16
      Bug#40953 SELECT query throws "ERROR 1062 (23000): Duplicate entry..." error
      There are two instances of t1 (cr, cr2). cr2 should have
      maybe_null set in the st_table instance, cr should not.
      Because cr is considered to be dependent on t2,
      the code sets st_table::maybe_null for cr into 'true'.
      The fix is to set st_table::maybe_null to 'true' only
      for those tables which are used in outer join.
[23 Dec 2008 9:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/62239

2744 Sergey Glukhov	2008-12-23
      Bug#40953 SELECT query throws "ERROR 1062 (23000): Duplicate entry..." error
      Table could be marked dependent because it is
      either 1) an inner table of an outer join, or 2) it is a part of
      STRAIGHT_JOIN. In case of STRAIGHT_JOIN table->maybe_null should not
      be assigned. The fix is to set st_table::maybe_null to 'true' only
      for those tables which are used in outer join.
[24 Dec 2008 15:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/62302

2724 Sergey Glukhov	2008-12-24
      Bug#40953 SELECT query throws "ERROR 1062 (23000): Duplicate entry..." error
      Table could be marked dependent because it is
      either 1) an inner table of an outer join, or 2) it is a part of
      STRAIGHT_JOIN. In case of STRAIGHT_JOIN table->maybe_null should not
      be assigned. The fix is to set st_table::maybe_null to 'true' only
      for those tables which are used in outer join.
[6 Jan 2009 13:57] Bugs System
Pushed into 5.0.76 (revid:joro@sun.com-20090105160414-8q9j4bi1klkfwiup) (version source revid:azundris@mysql.com-20081230114734-nmsc37ak330zlygn) (merge vers: 5.0.76) (pib:6)
[7 Jan 2009 20:47] Paul DuBois
Noted in 5.0.76 changelog.

Certain SELECT queries could fail with a "Duplicate entry" error.

Setting report to NDI pending push into 5.1.x/6.0.x.
[15 Jan 2009 6:40] Bugs System
Pushed into 5.1.31 (revid:joro@sun.com-20090115053147-tx1oapthnzgvs1ro) (version source revid:azundris@mysql.com-20081230114838-cn52tu180wcrvh0h) (merge vers: 5.1.31) (pib:6)
[15 Jan 2009 16:34] Paul DuBois
Noted in 5.1.31 changelog.

Setting report to NDI pending push into 6.0.x.
[19 Jan 2009 11:33] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090115073240-1wanl85vlvw2she1) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:10] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 15:10] Jon Stephens
Setting status back to NDI pending merge to 6.0 tree.
[19 Jan 2009 16:15] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)
[20 Jan 2009 19:00] Bugs System
Pushed into 6.0.10-alpha (revid:joro@sun.com-20090119171328-2hemf2ndc1dxl0et) (version source revid:azundris@mysql.com-20081230114916-c290n83z25wkt6e4) (merge vers: 6.0.9-alpha) (pib:6)
[28 Jan 2009 22:00] Paul DuBois
Noted in 6.0.10 changelog.