Bug #33546 | No or incomplete optimization for constant UDFs | ||
---|---|---|---|
Submitted: | 28 Dec 2007 1:35 | Modified: | 11 Dec 2010 17:49 |
Reporter: | Roland Bouman | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.1.22, 5.1.37-bzr | OS: | Any |
Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
[28 Dec 2007 1:35]
Roland Bouman
[4 Jan 2008 1:26]
MySQL Verification Team
Thank you for the bug report. I compiled your sample code and the functions exported are: Dump of file lib_mysqludf_udf.dll File Type: DLL Section contains the following exports for lib_mysqludf_udf.dll 00000000 characteristics 477D93B3 time date stamp Thu Jan 03 23:02:27 2008 0.00 version 1 ordinal base 36 number of functions 36 number of names ordinal hint RVA name 1 0 00001030 lib_mysqludf_udf_info 2 1 00001260 lib_mysqludf_udf_info_deinit 3 2 00001000 lib_mysqludf_udf_info_init 4 3 00001270 udf_arg_attribute 5 4 00001260 udf_arg_attribute_deinit 6 5 00001230 udf_arg_attribute_init 7 6 000012C0 udf_arg_attribute_length 8 7 00001260 udf_arg_attribute_length_deinit 9 8 00001290 udf_arg_attribute_length_init 10 9 00001060 udf_arg_count 11 A 00001260 udf_arg_count_deinit 12 B 000012E0 udf_arg_count_init 13 C 00001220 udf_arg_maybe_null 14 D 00001260 udf_arg_maybe_null_deinit 15 E 000011F0 udf_arg_maybe_null_init 16 F 000010A0 udf_arg_type 17 10 00001260 udf_arg_type_deinit 18 11 00001070 udf_arg_type_init 19 12 00001190 udf_arg_value 20 13 00001260 udf_arg_value_deinit 21 14 00001160 udf_arg_value_init 22 15 000010F0 udf_arg_value_is_constant 23 16 00001260 udf_arg_value_is_constant_deinit 24 17 000010B0 udf_arg_value_is_constant_init 25 18 00001140 udf_arg_value_is_null 26 19 00001260 udf_arg_value_is_null_deinit 27 1A 00001110 udf_arg_value_is_null_init 28 1B 000011E0 udf_arg_value_length 29 1C 00001260 udf_arg_value_length_deinit 30 1D 000011B0 udf_arg_value_length_init 31 1E 000012F0 udf_initid_decimals 32 1F 00001260 udf_initid_decimals_deinit 33 20 000012E0 udf_initid_decimals_init 34 21 000012D0 udf_initid_maybe_null 35 22 00001260 udf_initid_maybe_null_deinit 36 23 000012E0 udf_initid_maybe_null_init Summary 2000 .data 2000 .rdata 1000 .reloc 7000 .text C:\dbs\5.1\lib> and I didn't see the function mentioned on your test case udf_calling_sequence and also could you please complete the test case with the dump of the table. Thanks in advance.
[4 Jan 2008 22:35]
Roland Bouman
My apologies miguel - I think I attached the wrong source. Please find a new source attached - ignore the old one Here is the SQL script, assuming the lib is called libudf_const.so: use test; create function udf_const returns integer soname 'libudf_const.so'; create table t(c int primary key); insert into t values (1),(2),(3),(4),(5); ok - now for the results. Remember, udf_const sets initid->const_item to 1 in the init function, and returns the value of initid->const_item in the row level function. #case 1 select * from t where udf_const(1) = c; in the error log we see this: enter udf_const_init udf_const row level function udf_const row level function enter udf_const_deinit (note that the row level function is called twice - once should be enough) Now look at the explain to verify that the optimizer recognizes this as a constant: +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | t | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ #case 2: select * from t where udf_const(c) = c; In the error log, we find this calling sequence: enter udf_const_init udf_const row level function udf_const row level function udf_const row level function udf_const row level function udf_const row level function enter udf_const_deinit So, the function is called for each row! This must be wrong. It gets even stranger. The result set that is returned is identical to the set returned by the prior statement using the udf_const(1) call: +---+ | c | +---+ | 1 | +---+ 1 row in set (0.00 sec) Note that this is the correct result - udf_const returns 1, (should always so anyway) But now look at the explain! explain select * from t where udf_const(c) = c; +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ | 1 | SIMPLE | t | index | NULL | PRIMARY | 4 | NULL | 5 | Using where; Using index | +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ As you can see, udf_const is suddenly not treated as a constant expression. So there is some hidden logic that looks at the arguments passed to the UDF, which incorrectly assuming that if a column is passed, the result can never be constaint - even though the UDF clearly says that it will return a constant value!!
[4 Jan 2008 22:36]
Roland Bouman
udf_const - demonstrates odd optimizer behavour w/r initid->const_item
Attachment: udf_const.c (text/x-csrc), 1.34 KiB.
[11 Feb 2008 19:37]
Thomas Johnson
I was using 5.0.36sp1 and was going to upgrade to 5.0.54a. But I found a performance issue with a critical application. It may be related to this bug, so I'm adding it as a comment to this "Bug". My Query is: select l.name, a.zip from pls.locations l, pls.address a where l.id = GetSite() and l.address = a.id With 36sp1, it returns in 0.00 seconsds. With 54a, it can take up to 5.08 seconds. I found it’s not using the PRIMARY key on my locations table. Here is the explain plan for 54a: mysql> EXPLAIN EXTENDED select l.name, a.zip -> from pls.locations l, pls.address a -> where l.id = GetSite() -> and l.address = a.id; +----+-------------+-------+------+----------------------+---------------------- +---------+----------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+----------------------+---------------------- +---------+----------+-------+-------------+ | 1 | SIMPLE | a | ALL | PRIMARY | NULL | NULL | NULL | 39031 | | | 1 | SIMPLE | l | ref | FK_locations_address | FK_locations_address | 4 | pls.a.id | 1 | Using where | +----+-------------+-------+------+----------------------+---------------------- +---------+----------+-------+-------------+ 2 rows in set, 1 warning (0.00 sec) Here is the Explain plan for 36sp1: mysql> EXPLAIN EXTENDED select l.name, a.zip -> from pls.locations l, pls.address a -> where l.id = GetSite() -> and l.address = a.id; +----+-------------+-------+-------+------------------------------+---------+--- ------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | ke y_len | ref | rows | Extra | +----+-------------+-------+-------+------------------------------+---------+--- ------+-------+------+-------+ | 1 | SIMPLE | l | const | PRIMARY,FK_locations_address | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | a | const | PRIMARY | PRIMARY | 4 | const | 1 | | +----+-------------+-------+-------+------------------------------+---------+--- ------+-------+------+-------+ 2 rows in set, 1 warning (0.47 sec) I use UDF’s all the time to return a single value from the DB. Sometimes the UDF's I write may use parameter(s) which are based on a data field from the row. In this case I’m selecting a field from a table with 1 row that is used to contain local site (static) info, kind of like an INI file. Select site_id from this_site; I can easily change the query, but I already have 40+ applications written with over 300+ SQL statements that are affected by this optimize “bug”. So I prefer not to rewrite everything. I even tried using: select l.name, a.zip from pls.locations l force index(PRIMARY), pls.address a where l.id = GetSite() and l.address = a.id; But it still wouldn’t optimize it. For me, this is stopping us from upgrading to the latest server. I hope this is appropriate and related. Thanks, Tom J.
[11 Feb 2008 19:52]
Roland Bouman
Hi Thomas, when you say UDF, do you mean "user-defined function" as in http://dev.mysql.com/doc/refman/5.0/en/create-function.html or do you mean "stored sql function" as in http://dev.mysql.com/doc/refman/5.0/en/stored-procedures.html
[11 Feb 2008 23:15]
Thomas Johnson
Sorry. I just realized there was a difference and was comming back to clarify. I meant "stored sql function". So it may be different, but I'm having similar issues.
[5 Jul 2009 17:50]
Valeriy Kravchuk
Verified just as described with recent 5.1.37 on Mac OS X: mysql> explain select * from t where udf_const(1) = c; +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | t | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ 1 row in set (0.01 sec) mysql> explain select * from t where udf_const(c) = c; +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ | 1 | SIMPLE | t | index | NULL | PRIMARY | 4 | NULL | 5 | Using where; Using index | +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ 1 row in set (0.00 sec)
[13 Nov 2009 15:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/90370 3202 Evgeny Potemkin 2009-11-13 Bug#33546: Slowdown on re-evaluation of constant expressions. Constant expressions in WHERE/HAVING/ON clauses aren't cached and evaluated for each row. This causes slowdown of query execution especially if constant UDF/SP function are used. Now WHERE/HAVING/ON expressions are analyzed in the top-bottom direction with help of the compile function. When analyzer meets a constant item it sets a flag for the tree transformer to cache the item and doesn't allow tree walker to go deeper. Thus, the topmost item of a constant expression if cached. This is done after all other optimizations were applied to WHERE/HAVING/ON expressions A helper function called cache_const_exprs is added to the JOIN class. It calls compile method with caching analyzer and transformer on WHERE, HAVING, ON expressions if they're present. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. The first one check if the item can be cached and the second caches it if so. A new Item_cache_datetime class is derived from the Item_cache class. It caches both int and string values of the underlying item independently to avoid DATETIME aware int-to-string conversion. Thus it completely relies on the ability of the underlying item to correctly convert DATETIME value from int to string and vice versa. @ mysql-test/r/func_like.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/select.result Added a test case for the bug#33546. @ mysql-test/r/subselect.result A test case result is corrected after fixing bug#33546. @ mysql-test/t/select.test Added a test case for the bug#33546. @ sql/item.cc Bug#33546: Slowdown on re-evaluation of constant expressions. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. The first one check if the item can be cached and the second caches it if so. Item_cache_datetime class implementation is added. @ sql/item.h Bug#33546: Slowdown on re-evaluation of constant expressions. Item_ref and Item_cache classes now returns basic_const_item from underlying item. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. @ sql/item_cmpfunc.cc Bug#33546: Slowdown on re-evaluation of constant expressions. The Item_cond::compile function now is PS-safe. @ sql/sql_select.cc Bug#33546: Slowdown on re-evaluation of constant expressions. A helper function called cache_const_exprs is added to the JOIN class. It calls compile method with caching analyzer and transformer on WHERE, HAVING, ON expressions if they're present. @ sql/sql_select.h Bug#33546: Slowdown on re-evaluation of constant expressions. A helper function called cache_const_exprs is added to the JOIN class.
[23 Nov 2009 13:14]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/91306 3202 Evgeny Potemkin 2009-11-23 Bug#33546: Slowdown on re-evaluation of constant expressions. Constant expressions in WHERE/HAVING/ON clauses aren't cached and evaluated for each row. This causes slowdown of query execution especially if constant UDF/SP function are used. Now WHERE/HAVING/ON expressions are analyzed in the top-bottom direction with help of the compile function. When analyzer meets a constant item it sets a flag for the tree transformer to cache the item and doesn't allow tree walker to go deeper. Thus, the topmost item of a constant expression if cached. This is done after all other optimizations were applied to WHERE/HAVING/ON expressions A helper function called cache_const_exprs is added to the JOIN class. It calls compile method with caching analyzer and transformer on WHERE, HAVING, ON expressions if they're present. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. The first one check if the item can be cached and the second caches it if so. A new Item_cache_datetime class is derived from the Item_cache class. It caches both int and string values of the underlying item independently to avoid DATETIME aware int-to-string conversion. Thus it completely relies on the ability of the underlying item to correctly convert DATETIME value from int to string and vice versa. @ mysql-test/r/func_like.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/func_time.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/select.result Added a test case for the bug#33546. A test case result is corrected after fixing bug#33546. @ mysql-test/r/subselect.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/udf.result Added a test case for the bug#33546. @ mysql-test/t/select.test Added a test case for the bug#33546. A test case is corrected after fixing bug#33546. @ mysql-test/t/udf.test Added a test case for the bug#33546. @ sql/item.cc Bug#33546: Slowdown on re-evaluation of constant expressions. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. The first one check if the item can be cached and the second caches it if so. Item_cache_datetime class implementation is added. @ sql/item.h Bug#33546: Slowdown on re-evaluation of constant expressions. Item_ref and Item_cache classes now returns basic_const_item from underlying item. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. @ sql/sql_select.cc Bug#33546: Slowdown on re-evaluation of constant expressions. A helper function called cache_const_exprs is added to the JOIN class. It calls compile method with caching analyzer and transformer on WHERE, HAVING, ON expressions if they're present. @ sql/sql_select.h Bug#33546: Slowdown on re-evaluation of constant expressions. A helper function called cache_const_exprs is added to the JOIN class.
[1 Dec 2009 21:25]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/92360 2912 Evgeny Potemkin 2009-12-02 Bug#33546: Slowdown on re-evaluation of constant expressions. Constant expressions in WHERE/HAVING/ON clauses aren't cached and evaluated for each row. This causes slowdown of query execution especially if constant UDF/SP function are used. Now WHERE/HAVING/ON expressions are analyzed in the top-bottom direction with help of the compile function. When analyzer meets a constant item it sets a flag for the tree transformer to cache the item and doesn't allow tree walker to go deeper. Thus, the topmost item of a constant expression if cached. This is done after all other optimizations were applied to WHERE/HAVING/ON expressions A helper function called cache_const_exprs is added to the JOIN class. It calls compile method with caching analyzer and transformer on WHERE, HAVING, ON expressions if they're present. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. The first one check if the item can be cached and the second caches it if so. A new Item_cache_datetime class is derived from the Item_cache class. It caches both int and string values of the underlying item independently to avoid DATETIME aware int-to-string conversion. Thus it completely relies on the ability of the underlying item to correctly convert DATETIME value from int to string and vice versa. @ mysql-test/r/func_like.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/func_time.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/select.result Added a test case for the bug#33546. @ mysql-test/r/subselect.result A test case result is corrected after fixing bug#33546. @ mysql-test/r/udf.result Added a test case for the bug#33546. @ mysql-test/t/select.test Added a test case for the bug#33546. @ mysql-test/t/udf.test Added a test case for the bug#33546. @ sql/item.cc Bug#33546: Slowdown on re-evaluation of constant expressions. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. The first one check if the item can be cached and the second caches it if so. Item_cache_datetime class implementation is added. @ sql/item.h Bug#33546: Slowdown on re-evaluation of constant expressions. Item_ref and Item_cache classes now returns basic_const_item from underlying item. The cache_const_expr_analyzer and cache_const_expr_transformer functions are added to the Item class. @ sql/sql_select.cc Bug#33546: Slowdown on re-evaluation of constant expressions. A helper function called cache_const_exprs is added to the JOIN class. It calls compile method with caching analyzer and transformer on WHERE, HAVING, ON expressions if they're present. @ sql/sql_select.h Bug#33546: Slowdown on re-evaluation of constant expressions. A helper function called cache_const_exprs is added to the JOIN class.
[11 Dec 2009 6:00]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091211055901-yp18b3c7xuhl87rf) (version source revid:alik@sun.com-20091211055401-43rjwq7gjed6ds83) (merge vers: 6.0.14-alpha) (pib:13)
[11 Dec 2009 6:04]
Bugs System
Pushed into 5.6.0-beta (revid:alik@sun.com-20091211055628-ltr7fero363uev7r) (version source revid:alik@sun.com-20091211055453-717czhtezc74u8db) (merge vers: 5.6.0-beta) (pib:13)
[11 Dec 2009 10:47]
Lenz Grimmer
Is this commit available from any public tree on Launchpad already? If not, can this be done? Thanks!
[15 Dec 2009 17:22]
Paul DuBois
Noted in 5.6.0, 6.0.14 changelogs. Constant expressions in WHERE, HAVING, or ON clauses were not cached, but were evaluated for each row. This caused a slowdown of query execution, especially if constant user-defined functions or stored functions were used.
[16 Dec 2009 14:00]
Dennis Schoen
Here's the corresponding commit in the 6.0 tree: http://bazaar.launchpad.net/~mysql/mysql-server/mysql-6.0-codebase-bugfixing/revision/3746
[6 Mar 2010 10:55]
Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20091211201717-03qf8ckwiw0np80p) (merge vers: 5.6.0-beta) (pib:16)
[6 Mar 2010 20:21]
Paul DuBois
Moved 5.6.0 changelog entry to 5.5.3.
[23 Jul 2010 12:24]
Bugs System
Pushed into mysql-trunk 5.5.6-m3 (revid:alik@sun.com-20100723121820-jryu2fuw3pc53q9w) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:18)
[23 Jul 2010 12:31]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100723121929-90e9zemk3jkr2ocy) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (pib:18)
[29 Sep 2010 16:13]
Øystein Grøvlen
The fix for this bug seems to have introduced the regression reported in Bug#57039.
[5 Dec 2010 12:38]
Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[16 Dec 2010 22:27]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)