Bug #79366 ORDER BY does not work correctly with grouped AVG() values extracted from JSON
Submitted: 20 Nov 2015 19:04 Modified: 3 Oct 2016 18:34
Reporter: Philip Antoniades Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.7.9 OS:Ubuntu (5.7.9 MySQL Community Server (GPL))
Assigned to: CPU Architecture:Any
Tags: avg, json

[20 Nov 2015 19:04] Philip Antoniades
Description:
Added the MongoDB "test database" here: 
https://docs.mongodb.org/getting-started/shell/import-data/
(I can provide a .dump file), data looks like this:
mysql> select doc from mongo_restaurants limit 2\G
*************************** 1. row ***************************
doc: {"_id": {"$oid": "564b3259666906a86ea90a99"}, "name": "Dj Reynolds Pub And Restaurant", "grades": [{"date": {"$date": 1409961600000}, "grade": "A", "score": 2}, {"date": {"$date": 1374451200000}, "grade": "A", "score": 11}, {"date": {"$date": 1343692800000}, "grade": "A", "score": 12}, {"date": {"$date": 1325116800000}, "grade": "A", "score": 12}], "address": {"coord": [-73.98513559999999, 40.7676919], "street": "West   57 Street", "zipcode": "10019", "building": "351"}, "borough": "Manhattan", "cuisine": "Irish", "restaurant_id": "30191841"}
*************************** 2. row ***************************
doc: {"_id": {"$oid": "564b3259666906a86ea90a9a"}, "name": "Wendy'S", "grades": [{"date": {"$date": 1419897600000}, "grade": "A", "score": 8}, {"date": {"$date": 1404172800000}, "grade": "B", "score": 23}, {"date": {"$date": 1367280000000}, "grade": "A", "score": 12}, {"date": {"$date": 1336435200000}, "grade": "A", "score": 12}], "address": {"coord": [-73.961704, 40.662942], "street": "Flatbush Avenue", "zipcode": "11225", "building": "469"}, "borough": "Brooklyn", "cuisine": "Hamburgers", "restaurant_id": "30112340"}
2 rows in set (0.00 sec)

If I select the average score with AVG() it will not sort correctly, though other grouped functions will.

How to repeat:
mysql> select doc->'$.cuisine' cuisine, avg(doc->'$.grades[0].score') score from mongo_restaurants group by cuisine order by score limit 10;
+----------------------+---------+
| cuisine              | score   |
+----------------------+---------+
| "Polynesian"         | 22.0000 |
| "Café/Coffee/Tea"   | 14.5000 |
| "Cajun"              | 10.5714 |
| "Creole/Cajun"       |  5.0000 |
| "Czech"              |  9.3333 |
| "Soups"              |  9.7500 |
| "Californian"        | 12.0000 |
| "Hawaiian"           | 17.3333 |
| "Scandinavian"       |  9.7143 |
| "Iranian"            | 19.0000 |
+----------------------+---------+

Note SUM() does sort correctly:
mysql> select doc->'$.cuisine' cuisine, sum(doc->'$.grades[0].score') score from mongo_restaurants group by cuisine order by score limit 10;
+----------------------+-------+
| cuisine              | score |
+----------------------+-------+
| "Creole/Cajun"       |     5 |
| "Chilean"            |     6 |
| "Californian"        |    12 |
| "Polynesian"         |    22 |
| "Café/Coffee/Tea"   |    29 |
| "Nuts/Confectionary" |    30 |
| "Iranian"            |    38 |
| "Soups"              |    39 |
| "Hawaiian"           |    52 |
| "Czech"              |    56 |
+----------------------+-------+
10 rows in set, 13 warnings (0.46 sec)

Suggested fix:
AVG() + GROUP BY works in non-JSON tables, so I imagine it's a JSON_EXTRACT issue (having to do with conversion from INT to FLOAT?).
[26 Nov 2015 11:37] Knut Anders Hatlen
Thanks for the bug report.

It seems like this is a problem that affects all BLOB-based types, not only the JSON type. For example:

mysql> CREATE TABLE t(txt TEXT, i INT);

mysql> INSERT INTO t VALUES ('a', 2), ('b', 8), ('c', 2);

mysql> SELECT txt, AVG(i) a FROM t GROUP BY txt ORDER BY a;
+------+--------+
| txt  | a      |
+------+--------+
| a    | 2.0000 |
| b    | 8.0000 |
| c    | 2.0000 |
+------+--------+
[13 Aug 2016 7:42] MySQL Verification Team
See also:
http://bugs.mysql.com/bug.php?id=82572
[15 Aug 2016 5:51] Gillian Gunson
Duplicate bug found here: http://bugs.mysql.com/bug.php?id=82334
[27 Sep 2016 11:16] Knut Anders Hatlen
This bug was fixed in 8.0.0 as part of WL#8699.
[3 Oct 2016 18:34] Paul DuBois
Posted by developer:
 
Noted in 5.7.17 changelog.

Queries that were grouped on a column of a BLOB-based type, and that
were ordered on the result of the AVG(), VAR_POP(), or STDDEV_POP()
aggregate function, returned results in the wrong order if InnoDB
temporary tables were used.
[12 Jul 2017 21:51] Rick James
https://stackoverflow.com/questions/44865156/mysql-order-by-avg-desc-not-working-when-cert...  suggests that LONGTEXT may also be at fault.