Bug #4907 feature request - SQL extensions
Submitted: 5 Aug 2004 9:23 Modified: 8 Aug 2004 1:03
Reporter: Ram Nahtaniel Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version:any OS:Any (any)
Assigned to: CPU Architecture:Any

[5 Aug 2004 9:23] Ram Nahtaniel
Description:
Hi,

I am a developer working with many databases, as a part of my job. I use heavy SQL queries and have become somewhat of an expert in SQL, including tricks and workarounds of the limitation of the language.

I feel like a lot of the workarounds could be avoided with adding a few new operators to the SQL language that should be rather easy to support but would give a vast improvement and probably a leap of performance in many common queries. I write you about this hoping that you would support these operators (even though they are not in the ANSI) and thereby position MYSQL as a leader and not just a follower! I personaly have a great interest in seeing open source software making it big time! So please - forward this to one of your more technical developers/executives and hopefully I will get to see it in the next version of MYSQL:

1) The operator  "of max":
suppose I have a table "grades" of 3 fields: class/student/grade where I store many grades of many students of many classes. I want to get the name of the highest scoring student in each class. Note that there may be many students with the same grade, but for starters let's say there is a primary key of class+grade.

My query would now be:
select student from grades where class+'#'+grade in 
(
   select class+'#'+max(grade) from grades group by class
) a

This means working the query twice - and relying on louzy conversion of the grade from numerical to textual.
We could also use:

select student from grades where student in 
(
  select student from grades group by class 
  having class+'#'+grade = max(class+'#'+grade)
) a

This is even worse!

The optimal would be to introduce a new operator "of max" that would be used as follows:

select student of max(grade) from grades group by class

simillarly one should support "of min" and "of any" (brings a representative of the group)

2) aggregated concatenation:
Traditionally the SQL language has referred from supporting order dependent operators from taking a role in aggregated functions. This means that since the query: "select class, grade from grades" does not ensure the order of the records returned, the operation sum() is supported (i.e. select class, sum(grade) from grades group by class) but other operations that would be order dependent are not supported.
I think this approach should be revised. In many cases one would want to get a list of the student names delimited with a comma. It would be great if one could write:
select class, list(student, ',') from grades group by class
and get

class  list
-----  ----
class1 john, ruth,...
.
.
.

This is of course an order dependent operation so the syntax can either be:
select class, list(student, ',') from grades group by class order by student, grade
in which case the list would be ordered before the list is created, or if no particular order is requested the concatenation of the names should be in an arbitrary order.

Well - that's all for now :-)

Good luck!

Ram

How to repeat:
Hi,

I am a developer working with many databases, as a part of my job. I use heavy SQL queries and have become somewhat of an expert in SQL, including tricks and workarounds of the limitation of the language.

I feel like a lot of the workarounds could be avoided with adding a few new operators to the SQL language that should be rather easy to support but would give a vast improvement and probably a leap of performance in many common queries. I write you about this hoping that you would support these operators (even though they are not in the ANSI) and thereby position MYSQL as a leader and not just a follower! I personaly have a great interest in seeing open source software making it big time! So please - forward this to one of your more technical developers/executives and hopefully I will get to see it in the next version of MYSQL:

1) The operator  "of max":
suppose I have a table "grades" of 3 fields: class/student/grade where I store many grades of many students of many classes. I want to get the name of the highest scoring student in each class. Note that there may be many students with the same grade, but for starters let's say there is a primary key of class+grade.

My query would now be:
select student from grades where class+'#'+grade in 
(
   select class+'#'+max(grade) from grades group by class
) a

This means working the query twice - and relying on louzy conversion of the grade from numerical to textual.
We could also use:

select student from grades where student in 
(
  select student from grades group by class 
  having class+'#'+grade = max(class+'#'+grade)
) a

This is even worse!

The optimal would be to introduce a new operator "of max" that would be used as follows:

select student of max(grade) from grades group by class

simillarly one should support "of min" and "of any" (brings a representative of the group)

2) aggregated concatenation:
Traditionally the SQL language has referred from supporting order dependent operators from taking a role in aggregated functions. This means that since the query: "select class, grade from grades" does not ensure the order of the records returned, the operation sum() is supported (i.e. select class, sum(grade) from grades group by class) but other operations that would be order dependent are not supported.
I think this approach should be revised. In many cases one would want to get a list of the student names delimited with a comma. It would be great if one could write:
select class, list(student, ',') from grades group by class
and get

class  list
-----  ----
class1 john, ruth,...
.
.
.

This is of course an order dependent operation so the syntax can either be:
select class, list(student, ',') from grades group by class order by student, grade
in which case the list would be ordered before the list is created, or if no particular order is requested the concatenation of the names should be in an arbitrary order.

Well - that's all for now :-)

Good luck!

Ram

Suggested fix:
Hi,

I am a developer working with many databases, as a part of my job. I use heavy SQL queries and have become somewhat of an expert in SQL, including tricks and workarounds of the limitation of the language.

I feel like a lot of the workarounds could be avoided with adding a few new operators to the SQL language that should be rather easy to support but would give a vast improvement and probably a leap of performance in many common queries. I write you about this hoping that you would support these operators (even though they are not in the ANSI) and thereby position MYSQL as a leader and not just a follower! I personaly have a great interest in seeing open source software making it big time! So please - forward this to one of your more technical developers/executives and hopefully I will get to see it in the next version of MYSQL:

1) The operator  "of max":
suppose I have a table "grades" of 3 fields: class/student/grade where I store many grades of many students of many classes. I want to get the name of the highest scoring student in each class. Note that there may be many students with the same grade, but for starters let's say there is a primary key of class+grade.

My query would now be:
select student from grades where class+'#'+grade in 
(
   select class+'#'+max(grade) from grades group by class
) a

This means working the query twice - and relying on louzy conversion of the grade from numerical to textual.
We could also use:

select student from grades where student in 
(
  select student from grades group by class 
  having class+'#'+grade = max(class+'#'+grade)
) a

This is even worse!

The optimal would be to introduce a new operator "of max" that would be used as follows:

select student of max(grade) from grades group by class

simillarly one should support "of min" and "of any" (brings a representative of the group)

2) aggregated concatenation:
Traditionally the SQL language has referred from supporting order dependent operators from taking a role in aggregated functions. This means that since the query: "select class, grade from grades" does not ensure the order of the records returned, the operation sum() is supported (i.e. select class, sum(grade) from grades group by class) but other operations that would be order dependent are not supported.
I think this approach should be revised. In many cases one would want to get a list of the student names delimited with a comma. It would be great if one could write:
select class, list(student, ',') from grades group by class
and get

class  list
-----  ----
class1 john, ruth,...
.
.
.

This is of course an order dependent operation so the syntax can either be:
select class, list(student, ',') from grades group by class order by student, grade
in which case the list would be ordered before the list is created, or if no particular order is requested the concatenation of the names should be in an arbitrary order.

Well - that's all for now :-)

Good luck!

Ram
[8 Aug 2004 1:03] Sergei Golubchik
I'm sure you will be happy to know that we already have aggregated concatenation. The syntax is more powerful than what you suggested:

          GROUP_CONCAT([DISTINCT] EXPR [,EXPR ...]
                       [ORDER BY {UNSIGNED_INTEGER | COL_NAME | EXPR}
                           [ASC | DESC] [,COL_NAME ...]]
                       [SEPARATOR STR_VAL])

so it supports DISTINCT, ORDER BY,  many columns from one row, e.g.

select class, group_concat(distinct first_name, ' ', last_name order by last_name, first_name separator ',') from grades group by class

We'll think about "max of" thanks for the idea