Description:
Significantly improve performance of simple functions
We are using InnoBD on MySQL 5.7.21.
We have also tested this on MySQL 5.7.18 EE and we get the same results.
When running the SELECT query with the simple function in-lined in the how to repeat section the query stats show that the execution time is 0.015 seconds.
When running the SELECT using the simple function directly the query stats show that the execution time is 1.125 seconds.
The results are identical and the visual execution plan is completely identical.
Creating the table with 100,000 rows can take some time, however it is easy to repeat this with any query that returns lots of results, so any existing table could be used.
We have seen other reports asking for performance enhancements for all procedures and functions.
Here we are requesting something much simpler.
Whenever a FUNCTION is used in a SELECT clause and the FUNCTION just contains one RETURN statement for MySQL to inline that return statement.
In our example the id parameter would be replaced with t.groupID and then query can be simply in-lined.
Some care would need to be taken with renaming table aliases so that there are no conflicts. Also in some cases the inline query might need to be converted to the return type of the FUNCTION (for functional equivalence).
This is similar to what MySQL already does with VIEWs when they are using the merge algorithm.
Obviously the queries in this example could be rewritten to avoid using functions and even subqueries, however we have many much more complicated functions that can be rewritten to just be a simple RETURN statement.
In many of our cases the performance differences are even more extreme than shown in this simple example.
Our only workarounds currently are:
* manually in-lining all of the functions where we are noticing performance issues which is making many queries much harder to read, and if ever we do need to change the functionality of a function then we need to search through and change it all over the place
* changing the queries to use VIEWs instead, this is also proving problematic as if these views end up using the temporary table algorithm for whatever reason then performance can be extremely slow
How to repeat:
DROP TABLE IF EXISTS temperature;
CREATE TABLE temperature(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
groupID INT UNSIGNED NOT NULL,
recordedTimestamp TIMESTAMP NOT NULL,
recordedValue INT NOT NULL,
PRIMARY KEY (`id`)
);
DROP PROCEDURE IF EXISTS temporaryProcedurePopulateTable;
DELIMITER $$
CREATE PROCEDURE temporaryProcedurePopulateTable()
BEGIN
DECLARE counter INT DEFAULT 0;
WHILE counter < 100000 DO
INSERT INTO temperature(groupID, recordedTimestamp, recordedValue) VALUES(floor(rand() * 100), now(), floor(rand() * 40));
SET counter = counter + 1;
END WHILE;
END $$
DELIMITER ;
CALL temporaryProcedurePopulateTable();
DROP TABLE IF EXISTS temperature_group;
CREATE TABLE temperature_group (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
name varchar(10) NOT NULL,
PRIMARY KEY (id)
)
SELECT DISTINCT t.groupID AS id, concat('Group ', t.groupID) AS name FROM temperature t;
DROP FUNCTION IF EXISTS getGroupName;
DELIMITER $$
CREATE FUNCTION getGroupName(id INT UNSIGNED) RETURNS VARCHAR(10) DETERMINISTIC
BEGIN
RETURN (SELECT g.name FROM temperature_group g WHERE g.id = id);
END $$
DELIMITER ;
SELECT t.id, (SELECT g.name FROM temperature_group g WHERE g.id = t.groupID) AS groupName, t.recordedTimestamp, t.recordedValue FROM temperature t ORDER BY t.id;
SELECT t.id, getGroupName(t.groupID) AS groupName, t.recordedTimestamp, t.recordedValue FROM temperature t ORDER BY t.id;
Suggested fix:
Whenever a FUNCTION is used in a SELECT clause and the FUNCTION just contains one RETURN statement for MySQL to inline that return statement.