Bug #33546 No or incomplete optimization for constant UDFs
Submitted: 28 Dec 2007 1:35 Modified: 11 Dec 2010 17:49
Reporter: Roland Bouman Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.22, 5.1.37-bzr OS:Any
Assigned to: Evgeny Potemkin CPU Architecture:Any

[28 Dec 2007 1:35] Roland Bouman
Description:
UDFs can hint the server that they will return a constant value by setting the cont_item member of the UDF_INIT structure passed to the init function to 1. 

Doing so does affect the execution plan (ref = Const appears in plan), however it can be shown that the UDF is still called or each row. 

This is not necessary. A UDF with cont_item equals to 1 should be called only once, and that result should be cached and used instead.

In fact, it is baffling why the explain plan does indicate that a references is made to a constant, whereas it can be demostrated that the UDF is still called for each row. See:

How to repeat:
The code is attached as comment.

mysql> explain  select * from t where c = udf_calling_sequence(c);
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|  1 | SIMPLE      | t     | index | NULL          | c    | 5       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

(c is passed, this sets const_item to false)

mysql> explain  select * from t where c = udf_calling_sequence('');
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra                    |
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
|  1 | SIMPLE      | t     | ref  | c             | c    | 5       | const |    2 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)

('' is passed, this sets const_item to true)

mysql> select * from t where c = udf_calling_sequence('');

Actual calling sequences are the same however - for each row in t

Suggested fix:
Please call the UDF only once if const_item is TRUE.
[4 Jan 2008 1:26] MySQL Verification Team
Thank you for the bug report. I compiled your sample code and the functions
exported are:

Dump of file lib_mysqludf_udf.dll

File Type: DLL

  Section contains the following exports for lib_mysqludf_udf.dll

    00000000 characteristics
    477D93B3 time date stamp Thu Jan 03 23:02:27 2008
        0.00 version
           1 ordinal base
          36 number of functions
          36 number of names

    ordinal hint RVA      name

          1    0 00001030 lib_mysqludf_udf_info
          2    1 00001260 lib_mysqludf_udf_info_deinit
          3    2 00001000 lib_mysqludf_udf_info_init
          4    3 00001270 udf_arg_attribute
          5    4 00001260 udf_arg_attribute_deinit
          6    5 00001230 udf_arg_attribute_init
          7    6 000012C0 udf_arg_attribute_length
          8    7 00001260 udf_arg_attribute_length_deinit
          9    8 00001290 udf_arg_attribute_length_init
         10    9 00001060 udf_arg_count
         11    A 00001260 udf_arg_count_deinit
         12    B 000012E0 udf_arg_count_init
         13    C 00001220 udf_arg_maybe_null
         14    D 00001260 udf_arg_maybe_null_deinit
         15    E 000011F0 udf_arg_maybe_null_init
         16    F 000010A0 udf_arg_type
         17   10 00001260 udf_arg_type_deinit
         18   11 00001070 udf_arg_type_init
         19   12 00001190 udf_arg_value
         20   13 00001260 udf_arg_value_deinit
         21   14 00001160 udf_arg_value_init
         22   15 000010F0 udf_arg_value_is_constant
         23   16 00001260 udf_arg_value_is_constant_deinit
         24   17 000010B0 udf_arg_value_is_constant_init
         25   18 00001140 udf_arg_value_is_null
         26   19 00001260 udf_arg_value_is_null_deinit
         27   1A 00001110 udf_arg_value_is_null_init
         28   1B 000011E0 udf_arg_value_length
         29   1C 00001260 udf_arg_value_length_deinit
         30   1D 000011B0 udf_arg_value_length_init
         31   1E 000012F0 udf_initid_decimals
         32   1F 00001260 udf_initid_decimals_deinit
         33   20 000012E0 udf_initid_decimals_init
         34   21 000012D0 udf_initid_maybe_null
         35   22 00001260 udf_initid_maybe_null_deinit
         36   23 000012E0 udf_initid_maybe_null_init

  Summary

        2000 .data
        2000 .rdata
        1000 .reloc
        7000 .text

C:\dbs\5.1\lib>

and I didn't see the function mentioned on your test case udf_calling_sequence
and also could you please complete the test case with the dump of the table.
Thanks in advance.
[4 Jan 2008 22:35] Roland Bouman
My apologies miguel - I think I attached the wrong source. Please find a new source attached - ignore the old one

Here is the SQL script, assuming the lib is called libudf_const.so:

use test;
create function udf_const returns integer soname 'libudf_const.so';
create table t(c int primary key);
insert into t values (1),(2),(3),(4),(5);

ok - now for the results. Remember, udf_const sets initid->const_item to 1  in the init function, and returns the value of initid->const_item in the row level function.

#case 1

select * from t where udf_const(1) = c;

in the error log we see this:

enter udf_const_init
udf_const row level function
udf_const row level function
enter udf_const_deinit

(note that the row level function is called twice - once should be enough)

Now look at the explain to verify that the optimizer recognizes this as a constant:

+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | t     | const | PRIMARY       | PRIMARY | 4       | const |    1 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+

#case 2:

select * from t where udf_const(c) = c;

In the error log, we find this calling sequence:

enter udf_const_init
udf_const row level function
udf_const row level function
udf_const row level function
udf_const row level function
udf_const row level function
enter udf_const_deinit

So, the function is called for each row! This must be wrong.

It gets even stranger. The result set that is returned is identical to the set returned by the prior statement using the udf_const(1) call:

+---+
| c |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

Note that this is the correct result - udf_const returns 1, (should always so anyway)

But now look at the explain! 

explain select * from t where udf_const(c) = c;
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | t     | index | NULL          | PRIMARY | 4       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+

As you can see, udf_const is suddenly not treated as a constant expression. So there is some hidden logic that looks at the arguments passed to the UDF, which incorrectly assuming that if a column is passed, the result can never be constaint - even though the UDF clearly says that it will return a constant value!!
[4 Jan 2008 22:36] Roland Bouman
udf_const - demonstrates odd optimizer behavour w/r initid->const_item

Attachment: udf_const.c (text/x-csrc), 1.34 KiB.

[11 Feb 2008 19:37] Thomas Johnson
I was using 5.0.36sp1 and was going to upgrade to 5.0.54a.  But I found a performance issue with a critical application.  It may be related to this bug, so I'm adding it as a comment to this "Bug".  

My Query is:
select l.name, a.zip
from pls.locations l, pls.address a
where l.id = GetSite()
and l.address = a.id

With 36sp1, it returns in 0.00 seconsds.  With 54a, it can take up to 5.08 seconds.  I found it’s not using the PRIMARY key on my locations table.

Here is the explain plan for 54a:
mysql> EXPLAIN EXTENDED select l.name, a.zip
    -> from pls.locations l, pls.address a
    -> where l.id = GetSite()
    -> and l.address = a.id;
+----+-------------+-------+------+----------------------+----------------------
+---------+----------+-------+-------------+
| id | select_type | table | type | possible_keys        | key
| key_len | ref      | rows  | Extra       |
+----+-------------+-------+------+----------------------+----------------------
+---------+----------+-------+-------------+
|  1 | SIMPLE      | a     | ALL  | PRIMARY              | NULL
| NULL    | NULL     | 39031 |             |
|  1 | SIMPLE      | l     | ref  | FK_locations_address | FK_locations_address
| 4       | pls.a.id |     1 | Using where |
+----+-------------+-------+------+----------------------+----------------------
+---------+----------+-------+-------------+
2 rows in set, 1 warning (0.00 sec)

Here is the Explain plan for 36sp1:
mysql> EXPLAIN EXTENDED select l.name, a.zip
    -> from pls.locations l, pls.address a
    -> where l.id = GetSite()
    -> and l.address = a.id;
+----+-------------+-------+-------+------------------------------+---------+---
------+-------+------+-------+
| id | select_type | table | type  | possible_keys                | key     | ke
y_len | ref   | rows | Extra |
+----+-------------+-------+-------+------------------------------+---------+---
------+-------+------+-------+
|  1 | SIMPLE      | l     | const | PRIMARY,FK_locations_address | PRIMARY | 4
      | const |    1 |       |
|  1 | SIMPLE      | a     | const | PRIMARY                      | PRIMARY | 4
      | const |    1 |       |
+----+-------------+-------+-------+------------------------------+---------+---
------+-------+------+-------+
2 rows in set, 1 warning (0.47 sec)

I use UDF’s all the time to return a single value from the DB.  Sometimes the UDF's I write may use parameter(s) which are based on a data field from the row.  In this case I’m selecting a field from a table with 1 row that is used to contain local site (static) info, kind of like an INI file.
	Select site_id from this_site;

I can easily change the query, but I already have 40+ applications written with over 300+ SQL statements that are affected by this optimize “bug”.  So I prefer not to rewrite everything.

I even tried using:
select l.name, a.zip
from pls.locations l force index(PRIMARY), pls.address a
where l.id = GetSite()
and l.address = a.id;

But it still wouldn’t optimize it.  For me, this is stopping us from upgrading to the latest server.

I hope this is appropriate and related.
Thanks,
Tom J.
[11 Feb 2008 19:52] Roland Bouman
Hi Thomas, 

when you say UDF, do you mean "user-defined function" as in 

http://dev.mysql.com/doc/refman/5.0/en/create-function.html

or do you mean "stored sql function" as in

http://dev.mysql.com/doc/refman/5.0/en/stored-procedures.html
[11 Feb 2008 23:15] Thomas Johnson
Sorry. I just realized there was a difference and was comming back to clarify.
I meant "stored sql function".  So it may be different, but I'm having similar issues.
[5 Jul 2009 17:50] Valeriy Kravchuk
Verified just as described with recent 5.1.37 on Mac OS X:

mysql> explain select * from t where udf_const(1) = c;
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | t     | const | PRIMARY       | PRIMARY | 4       | const |    1 | Using index | 
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.01 sec)

mysql> explain select * from t where udf_const(c) = c;
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | t     | index | NULL          | PRIMARY | 4       | NULL |    5 | Using where; Using index | 
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
[13 Nov 2009 15:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/90370

3202 Evgeny Potemkin	2009-11-13
      Bug#33546: Slowdown on re-evaluation of constant expressions.
      
      Constant expressions in WHERE/HAVING/ON clauses aren't cached and evaluated
      for each row. This causes slowdown of query execution especially if constant
      UDF/SP function are used.
      
      Now WHERE/HAVING/ON expressions are analyzed in the top-bottom direction with
      help of the compile function. When analyzer meets a constant item it
      sets a flag for the tree transformer to cache the item and doesn't allow tree
      walker to go deeper. Thus, the topmost item of a constant expression if
      cached. This is done after all other optimizations were applied to
      WHERE/HAVING/ON expressions
      
      A helper function called cache_const_exprs is added to the JOIN class.
      It calls compile method with caching analyzer and transformer on WHERE,
      HAVING, ON expressions if they're present.
      The cache_const_expr_analyzer and cache_const_expr_transformer functions are
      added to the Item class. The first one check if the item can be cached and
      the second caches it if so.
      A new Item_cache_datetime class is derived from the Item_cache class.
      It caches both int and string values of the underlying item independently to
      avoid DATETIME aware int-to-string conversion. Thus it completely relies on
      the ability of the underlying item to correctly convert DATETIME value from
      int to string and vice versa.
     @ mysql-test/r/func_like.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/select.result
        Added a test case for the bug#33546.
     @ mysql-test/r/subselect.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/t/select.test
        Added a test case for the bug#33546.
     @ sql/item.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        The cache_const_expr_analyzer and cache_const_expr_transformer functions are
        added to the Item class. The first one check if the item can be cached and
        the second caches it if so.
        Item_cache_datetime class implementation is added.
     @ sql/item.h
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        Item_ref and Item_cache classes now returns basic_const_item
        from underlying item.
        The cache_const_expr_analyzer and cache_const_expr_transformer functions are
        added to the Item class.
     @ sql/item_cmpfunc.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        
        The Item_cond::compile function now is PS-safe.
     @ sql/sql_select.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        
        A helper function called cache_const_exprs is added to the JOIN class.
        It calls compile method with caching analyzer and transformer on WHERE,
        HAVING, ON expressions if they're present.
     @ sql/sql_select.h
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        A helper function called cache_const_exprs is added to the JOIN class.
[23 Nov 2009 13:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/91306

3202 Evgeny Potemkin	2009-11-23
      Bug#33546: Slowdown on re-evaluation of constant expressions.
      
      Constant expressions in WHERE/HAVING/ON clauses aren't cached and evaluated
      for each row. This causes slowdown of query execution especially if constant
      UDF/SP function are used.
      
      Now WHERE/HAVING/ON expressions are analyzed in the top-bottom direction with
      help of the compile function. When analyzer meets a constant item it
      sets a flag for the tree transformer to cache the item and doesn't allow tree
      walker to go deeper. Thus, the topmost item of a constant expression if
      cached. This is done after all other optimizations were applied to
      WHERE/HAVING/ON expressions
      
      A helper function called cache_const_exprs is added to the JOIN class.
      It calls compile method with caching analyzer and transformer on WHERE,
      HAVING, ON expressions if they're present.
      The cache_const_expr_analyzer and cache_const_expr_transformer functions are
      added to the Item class. The first one check if the item can be cached and
      the second caches it if so.
      A new Item_cache_datetime class is derived from the Item_cache class.
      It caches both int and string values of the underlying item independently to
      avoid DATETIME aware int-to-string conversion. Thus it completely relies on
      the ability of the underlying item to correctly convert DATETIME value from
      int to string and vice versa.
     @ mysql-test/r/func_like.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/func_time.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/select.result
        Added a test case for the bug#33546.
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/subselect.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/udf.result
        Added a test case for the bug#33546.
     @ mysql-test/t/select.test
        Added a test case for the bug#33546.
        A test case is corrected after fixing bug#33546.
     @ mysql-test/t/udf.test
        Added a test case for the bug#33546.
     @ sql/item.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        The cache_const_expr_analyzer and cache_const_expr_transformer functions are
        added to the Item class. The first one check if the item can be cached and
        the second caches it if so.
        Item_cache_datetime class implementation is added.
     @ sql/item.h
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        Item_ref and Item_cache classes now returns basic_const_item
        from underlying item.
        The cache_const_expr_analyzer and cache_const_expr_transformer functions are
        added to the Item class.
     @ sql/sql_select.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        
        A helper function called cache_const_exprs is added to the JOIN class.
        It calls compile method with caching analyzer and transformer on WHERE,
        HAVING, ON expressions if they're present.
     @ sql/sql_select.h
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        A helper function called cache_const_exprs is added to the JOIN class.
[24 Nov 2009 8:59] Sergei Golubchik
see also bug#47839 and bug#41153
[1 Dec 2009 21:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/92360

2912 Evgeny Potemkin	2009-12-02
      Bug#33546: Slowdown on re-evaluation of constant expressions.
            
      Constant expressions in WHERE/HAVING/ON clauses aren't cached and evaluated
      for each row. This causes slowdown of query execution especially if constant
      UDF/SP function are used.
            
      Now WHERE/HAVING/ON expressions are analyzed in the top-bottom direction with
      help of the compile function. When analyzer meets a constant item it
      sets a flag for the tree transformer to cache the item and doesn't allow tree
      walker to go deeper. Thus, the topmost item of a constant expression if
      cached. This is done after all other optimizations were applied to
      WHERE/HAVING/ON expressions
            
      A helper function called cache_const_exprs is added to the JOIN class.
      It calls compile method with caching analyzer and transformer on WHERE,
      HAVING, ON expressions if they're present.
      The cache_const_expr_analyzer and cache_const_expr_transformer functions are
      added to the Item class. The first one check if the item can be cached and
      the second caches it if so.
      A new Item_cache_datetime class is derived from the Item_cache class.
      It caches both int and string values of the underlying item independently to
      avoid DATETIME aware int-to-string conversion. Thus it completely relies on
      the ability of the underlying item to correctly convert DATETIME value from
      int to string and vice versa.
     @ mysql-test/r/func_like.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/func_time.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/select.result
        Added a test case for the bug#33546.
     @ mysql-test/r/subselect.result
        A test case result is corrected after fixing bug#33546.
     @ mysql-test/r/udf.result
        Added a test case for the bug#33546.
     @ mysql-test/t/select.test
        Added a test case for the bug#33546.
     @ mysql-test/t/udf.test
        Added a test case for the bug#33546.
     @ sql/item.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        The cache_const_expr_analyzer and cache_const_expr_transformer functions are
        added to the Item class. The first one check if the item can be cached and
        the second caches it if so.
        Item_cache_datetime class implementation is added.
     @ sql/item.h
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        Item_ref and Item_cache classes now returns basic_const_item
        from underlying item.
        The cache_const_expr_analyzer and cache_const_expr_transformer functions are
        added to the Item class.
     @ sql/sql_select.cc
        Bug#33546: Slowdown on re-evaluation of constant expressions.
                
        A helper function called cache_const_exprs is added to the JOIN class.
        It calls compile method with caching analyzer and transformer on WHERE,
        HAVING, ON expressions if they're present.
     @ sql/sql_select.h
        Bug#33546: Slowdown on re-evaluation of constant expressions.
        A helper function called cache_const_exprs is added to the JOIN class.
[11 Dec 2009 6:00] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091211055901-yp18b3c7xuhl87rf) (version source revid:alik@sun.com-20091211055401-43rjwq7gjed6ds83) (merge vers: 6.0.14-alpha) (pib:13)
[11 Dec 2009 6:04] Bugs System
Pushed into 5.6.0-beta (revid:alik@sun.com-20091211055628-ltr7fero363uev7r) (version source revid:alik@sun.com-20091211055453-717czhtezc74u8db) (merge vers: 5.6.0-beta) (pib:13)
[11 Dec 2009 10:47] Lenz Grimmer
Is this commit available from any public tree on Launchpad already?
If not, can this be done? Thanks!
[15 Dec 2009 17:22] Paul DuBois
Noted in 5.6.0, 6.0.14 changelogs.

Constant expressions in WHERE, HAVING, or ON clauses were not cached,
but were evaluated for each row. This caused a slowdown of query 
execution, especially if constant user-defined functions or stored
functions were used.
[16 Dec 2009 14:00] Dennis Schoen
Here's the corresponding commit in the 6.0 tree:

http://bazaar.launchpad.net/~mysql/mysql-server/mysql-6.0-codebase-bugfixing/revision/3746
[6 Mar 2010 10:55] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20091211201717-03qf8ckwiw0np80p) (merge vers: 5.6.0-beta) (pib:16)
[6 Mar 2010 20:21] Paul DuBois
Moved 5.6.0 changelog entry to 5.5.3.
[23 Jul 2010 12:24] Bugs System
Pushed into mysql-trunk 5.5.6-m3 (revid:alik@sun.com-20100723121820-jryu2fuw3pc53q9w) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:18)
[23 Jul 2010 12:31] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100723121929-90e9zemk3jkr2ocy) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (pib:18)
[29 Sep 2010 16:13] Øystein Grøvlen
The fix for this bug seems to have introduced the regression reported in Bug#57039.
[5 Dec 2010 12:38] Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[16 Dec 2010 22:27] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)