Bug #42948 | MySQL Optimizer does not properly use Indexes on Views with "Group By" | ||
---|---|---|---|
Submitted: | 17 Feb 2009 23:02 | Modified: | 3 Mar 2009 18:56 |
Reporter: | John Kounis | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 5.0.67-0ubuntu6, 6.0.11 | OS: | Linux (Ubuntu 8.10) |
Assigned to: | CPU Architecture: | Any | |
Tags: | GROUP BY, Optimizer |
[17 Feb 2009 23:02]
John Kounis
[17 Feb 2009 23:04]
John Kounis
Changed severity from S3 to S5 (Performance)
[28 Feb 2009 19:29]
Valeriy Kravchuk
Formally, this is not a bug. View is created as follows: mysql> show create view qryTEST\G *************************** 1. row *************************** View: qrytest Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `qrytest` AS select `tbltest`.`SubscriberID` AS `SubscriberID`,sum(`tbltest`.`Amount`) AS `Total` from `tbltest` group by `tbltest`.`SubscriberID` character_set_client: latin1 collation_connection: latin1_swedish_ci 1 row in set (0,00 sec) and as sum() is used server is forced to use emporary table to materialize the view, as explained at http://dev.mysql.com/doc/refman/5.1/en/view-algorithms.html: mysql> EXPLAIN SELECT * FROM qryTEST WHERE SubscriberID=12345\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 402400 Extra: Using where *************************** 2. row *************************** id: 2 select_type: DERIVED table: tbltest type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 524288 Extra: Using temporary; Using filesort 2 rows in set (20,09 sec) It would be nice to remove this limitation one day and push condition down to the SELECT defining view, so this is a reasonable feature request.
[3 Mar 2009 18:56]
John Kounis
Thank you for the clarification. I read the link you provided and it's clear why this is happening. If it would affect the priority of the feature request, perhaps I could give a real-world example that I'm sure is common for database applications: I have many views to provide information about client transactions that use either aggregate functions or limit functions. A typical monthly invoice run looks something like this: SELECT CustomerID, <and a bunch of other stuff> FROM Customers INNER JOIN qryThisMonthsBalance ON Customer.ID = qryThisMonthsBalance.CustomerID INNER JOIN qryLastMonthsBalance ON Customer.ID = qryLastMonthsBalance.CustomerID INNER JOIN qryLastPayment on Customer.ID = qryLastPayment.CustomerID INNER JOIN qryLastInvoiceSent on Customer.ID = qryLastInvoiceSent.CustomerID ... and so on.... The performance of a query like the one above became unacceptable after converting from MSSQL to MySQL, since each of the views on which I was doing a join was doing a full table scan of 1.2 million records instead of using the "CustomerID" index. To restore normal performance, I had to write stored procedures to create temporary tables instead of using views as in the example above.