Bug #2467 Use of an alias in a 'group by function'?
Submitted: 21 Jan 2004 1:56 Modified: 7 Nov 2005 18:14
Reporter: Sébastien Vanvelthem Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version:4.0.17 OS:Any (All)
Assigned to: Assigned Account CPU Architecture:Any

[21 Jan 2004 1:56] Sébastien Vanvelthem
Description:
Hi,

In a statistical application we need to let MySQL handle some calculation. We did it before with stored procedures, but we decide to migrate to MySQL for speed. Unfortunatly, some things seems to be very slow if we use only SQL. In the following case the speed of MySQL is compromised by some tricks :

   select month(o.order_date) as period,
       sum(l.qty * l.unit_price) as order_total,
       sum(l.qty * l.unit_price * (100-l.discount_1)/100)) as total_disc_1,
       sum(l.qty * l.unit_price * (100-l.discount_2)/100)) as total_disc_2,
       sum(l.qty * l.unit_price * (100-l.discount_3)/100)) as total_disc_3

   from orders as o inner join order_lines as l
   on a.id_order = l.id_order

   where l.id_product = 427891
   and year(o.order_date) = 2003

   group by period

In the above query the instruction "l.qty * l.unit_price" is repeated 4 times. I thought the optimizer can handle this case. But after some tests I figured out that it really recalculates 4 times (on approx 2 million order lines). The CPU load reaches 80% for about 15 seconds. If I simply replace the last 3 "l.qty * l.unit_price" by a constant, time is divided by 2.

With stored-procedures on another open-source database, time is divided by 4 or 10 in some cases. I'm confident in MySQL evolution, and you'll have procedures but for now just a suggestion which can help us a lot :

Could you permit the use of an alias in a 'group by function'? The example below shows a concrete case

   select month(o.order_date) as period,
       sum(l.qty * l.unit_price) as order_total,
       sum(order_total * (100-l.discount_1)/100)) as total_disc_1,
       sum(order_total * (100-l.discount_2)/100)) as total_disc_2,
       sum(order_total * (100-l.discount_3)/100)) as total_disc_3

   from orders as o inner join order_lines as l
   on a.id_order = l.id_order

   where l.id_product = 427891
   and year(o.order_date) = 2003

   group by period

Optimizer will never be faked anymore. And we can start selling MySQL Licenses to our customers right now ;)

Thanks for your time

Sébastien

PS : the above examples are fictives, some people may think that the database design is wrong. But we have real-world example where we need to do things like ->

sum(if(l.ghj<0, 0, (l.sss * l.ccc * (t.aaa / x.uuu))),

So we are impatient to have stored procedures... We abandonned some other databases due to speed, but even if simple queries are faster on MySQL it seems that complex requests are slower :/

How to repeat:
feature request
[27 Jan 2004 6:52] Thomas Mayer
Have you ever tried something like this:
group by 1
order by 1,2

This could avoid your overhead in calculation.
[29 Jan 2004 3:02] Sébastien Vanvelthem
That's not a problem with sorting...

An conceptual example is :

select 
  (table.quantity * table.price) as total, 
  (table.quantity * table.price / table.some_value) as total_adapted
from table

Why can we do

select 
  (table.quantity * table.price) as total, 
  (total / table.some_value) as total_adapted
from table

instead?

whithout recalculating (table.quantity * table.price)

Hope it helps to make things clearer

Thanks
[7 Oct 2005 18:14] Hartmut Holzgraefe
this looks like a job for subqueries?

SELECT t1.total, , t1.total / t1.some_value 
FROM (SELECT *, (table.quantity * table.price) as total FROM table) AS t1
[8 Nov 2005 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".