Bug #23498 call to coalesce is not nedded in join statements
Submitted: 20 Oct 2006 12:46 Modified: 25 Nov 2006 7:13
Reporter: R H Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:5.0.26 OS:Windows (Windows XP)
Assigned to: CPU Architecture:Any
Tags: COALESCE, join, joins

[20 Oct 2006 12:46] R H
Description:
Hi

I am just wondering about the Coalesce function and whether it is of any consequense when used to find the join column(s) since version 5.0.12, and there by excluding redundant coloumns.

The documentation above states:

"The redundant column is eliminated" and further
"The single result column that replaces two common columns is defined via the coalesce operation. That is, for two t1.a and t2.a the resulting single join column a is defined as a = COALESCE(t1.a, t2.a), where:
COALESCE(x, y) = (CASE WHEN V1 IS NOT NULL THEN V1 ELSE V2 END)"

Now as far as I can see there is no reason at all to call the coalesce function to determine the resultant coloumns. 

1-If the join was an inner join, we know that both coloumns were identical, so it makes no difference which one is chosen. Besides, NULL values are not returned in inner joins

2- If the join was a left outer join then if t1.a is NULL, t2.a will also be NULL (due to the join). In this case the Coalesce function does infact return the t2.a, but since it is null, and t1.a was also null, it might just as well return t1.a. Ofcourse if t1.a is not NULL then t1.a is returned

So to summ it up, we might just as well always choose the coloumn from the left table (both for left joins and inner joins) and the call to coalesce is merely overhead.

I cant find any example of the coalesce beeing nedded here. Please do correct me if I have missed something. 

Here is an example of two tables that demonstrate the fact that coalesce is superflous:

select * from t1

+--------+--------+--------+--------+
| a      | b      | c      | x      |
+--------+--------+--------+--------+
|      1 |      1 |      1 |      1 |
|      2 |      2 |      2 | [NULL] |
|      3 |      3 | [NULL] | [NULL] |
|      4 | [NULL] | [NULL] | [NULL] |
+--------+--------+--------+--------+

select * from t2

+--------+--------+--------+--------+
| a      | b      | c      | y      |
+--------+--------+--------+--------+
|      1 |      1 |      1 |      1 |
|      2 |      2 |      2 |      2 |
|      3 |      3 |      3 |      3 |
|      4 |      4 |      4 |      4 |
+--------+--------+--------+--------+

select * from t1 left join t2 using(a,b);

+--------+--------+--------+--------+--------+--------+
| a      | b      | c      | x      | c      | y      |
+--------+--------+--------+--------+--------+--------+
|      1 |      1 |      1 |      1 |      1 |      1 |
|      2 |      2 |      2 | [NULL] |      2 |      2 |
|      3 |      3 | [NULL] | [NULL] |      3 |      3 |
|      4 | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] |
+--------+--------+--------+--------+--------+--------+

Kind regards

How to repeat:
there is nothid te repeat really. Just join any tables, and see the results :=)

Suggested fix:
Just take the coloumn from the table that is mentioned first in the join statement
[25 Nov 2006 7:13] Valeriy Kravchuk
Let's continue your example:

mysql> delete from t2 where a=1;
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1 left join t2 using(a,b);
+------+------+------+------+------+------+
| a    | b    | c    | x    | c    | y    |
+------+------+------+------+------+------+
|    1 |    1 |    1 |    1 | NULL | NULL |
|    2 |    2 |    2 | NULL |    2 |    2 |
|    3 |    3 | NULL | NULL |    3 |    3 |
|    4 | NULL | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+
4 rows in set (0.00 sec)

mysql> select * from t1 right join t2 using(a,b);
+------+------+------+------+------+------+
| a    | b    | c    | y    | c    | x    |
+------+------+------+------+------+------+
|    2 |    2 |    2 |    2 |    2 | NULL |
|    3 |    3 |    3 |    3 | NULL | NULL |
|    4 |    4 |    4 |    4 | NULL | NULL |
+------+------+------+------+------+------+
3 rows in set (0.01 sec)

Compare with the following:

mysql> select * from t1 right join t2 ON (t1.a = t2.a and t1.b = t2.b);
+------+------+------+------+------+------+------+------+
| a    | b    | c    | x    | a    | b    | c    | y    |
+------+------+------+------+------+------+------+------+
|    2 |    2 |    2 | NULL |    2 |    2 |    2 |    2 |
|    3 |    3 | NULL | NULL |    3 |    3 |    3 |    3 |
| NULL | NULL | NULL | NULL |    4 |    4 |    4 |    4 |
+------+------+------+------+------+------+------+------+
3 rows in set (0.00 sec)

mysql> select * from t1 left join t2 ON (t1.a = t2.a and t1.b = t2.b);
+------+------+------+------+------+------+------+------+
| a    | b    | c    | x    | a    | b    | c    | y    |
+------+------+------+------+------+------+------+------+
|    1 |    1 |    1 |    1 | NULL | NULL | NULL | NULL |
|    2 |    2 |    2 | NULL |    2 |    2 |    2 |    2 |
|    3 |    3 | NULL | NULL |    3 |    3 |    3 |    3 |
|    4 | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+------+
4 rows in set (0.01 sec)

I hope, the (intentional) difference is explained properly in http://dev.mysql.com/doc/refman/5.0/en/join.html.