Bug #31899 Group functions not working properly when joining tables
Submitted: 28 Oct 2007 19:55 Modified: 29 Oct 2007 11:19
Reporter: Svavar Kjarrval Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.0.45 OS:Any
Assigned to: CPU Architecture:Any
Tags: GROUP, join

[28 Oct 2007 19:55] Svavar Kjarrval
Description:
When aggrigate functions are applied to columns which are derived from 3 or more joined tables, all columns are put into the aggrigated function multiple times depending on how many tables were joined, causing incorrect results. This seems only to happen when multiple tables have columns with the same name, even if they are referenced with their respective table names at all times.

The results are correct when only one join takes place in the query. Then the results from the aggrigated functions multiply when more tables are joined.

It doesn't seem to matter whether the tables are INNODB or MYISAM.

It is confirmed on my end that is happens with at least the SUM() and GROUP_CONCAT() functions but there could be more affected.

How to repeat:
CREATE TABLE `test1` (
`id` MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`name` VARCHAR( 255 ) CHARACTER SET utf8 COLLATE utf8_icelandic_ci NOT NULL 
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_icelandic_ci

CREATE TABLE `test2` (
`test1_id` MEDIUMINT UNSIGNED NOT NULL ,
`number` TINYINT UNSIGNED NOT NULL ,
INDEX ( `test1_id` ) 
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_icelandic_ci

CREATE TABLE `test3` (
`test1_id` MEDIUMINT UNSIGNED NOT NULL ,
`number` TINYINT UNSIGNED NOT NULL ,
INDEX ( `test1_id` ) 
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_icelandic_ci

CREATE TABLE `test4` (
`test1_id` MEDIUMINT UNSIGNED NOT NULL ,
`number` TINYINT UNSIGNED NOT NULL ,
INDEX ( `test1_id` ) 
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_icelandic_ci 

INSERT INTO test1 (name)
VALUES ('Name1'),('Name2');

INSERT INTO test2 (test1_id,number)
VALUES ('1', '5'),('2','5'),('1', '15'), ('2', '15');

INSERT INTO test3 (test1_id,number)
VALUES ('1', '5'),('2','5'),('1', '15'), ('2', '15');

INSERT INTO test4 (test1_id,number)
VALUES ('1', '5'),('2','5'),('1', '15'), ('2', '15');

---
SELECT T1.name,SUM( T2.number ) AS results
FROM test1 AS T1
INNER JOIN test2 AS T2 ON T2.test1_id = T1.id
INNER JOIN test3 AS T3 ON T3.test1_id = T1.id
WHERE T1.id = '1' 
GROUP BY T1.id
---

Expected (and correct) results:
name = 'Name1'
results = '20'

Actual results:
name = 'Name1'
results = '40'

Removing the join to 'test3' makes the results correct. If, however, 'test4' is added, the difference between the actual results and the correct ones will increase and, in this instance, the column 'results' will have the value '160'.

Suggested fix:
None
[29 Oct 2007 4:19] Giuseppe Maxia
Thank you for taking the time to write to us, but this is not a bug. 
You are joining the tables using a non unique key. Thus you have duplicates.

Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php
[29 Oct 2007 11:19] Svavar Kjarrval
This not-a-bug can be solved the following way (for those who suffer from the same problem):

This command produces the correct results for all test tables mentined in the bug report. However, it creates some temporary tables so it might not be the most efficient way to do it.

SELECT T1.id, T1.name, SUM( T2.number ) AS results, result2, result3 
FROM test1 AS T1 
INNER JOIN test2 AS T2 ON T2.test1_id = T1.id
INNER JOIN (SELECT SUM( T3.number ) AS result2 FROM test3 AS T3 WHERE T3.test1_id = '1' GROUP BY T3.test1_id) AS T3 
INNER JOIN (SELECT SUM(T4.number ) AS result3 FROM test4 AS T4 WHERE T4.test1_id = '1' GROUP BY T4.test1_id) AS T4
WHERE T1.id = '1'
GROUP BY T1.id