Bug #1439 UDF Corruption
Submitted: 30 Sep 2003 5:46 Modified: 19 Mar 2004 2:38
Reporter: Matthew Yonkovit Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: User-defined functions ( UDF ) Severity:S2 (Serious)
Version:4.1.1 OS:2.4.20 Suse 8.2
Assigned to: Sergei Glukhov CPU Architecture:Any

[30 Sep 2003 5:46] Matthew Yonkovit
Description:
We have about a dozen UDF running in 4.0 without any problems.  After uprading to 4.1.1 we are experiencing random corruption in the function definition. Calling a UDF, we get an invalid SQL Statement.  IF we try to drop the function
we get something similar to this: "#1128 - Function '@yŒ	' is not defined".  If we bring down the database and restart the mysqld Daemon the function works correctly without any problems.  

 Most of our calls are through the mysql++ API (1.7.9) from a C++ application we have. We have been able to reproduce the bug on Redhat 8.0 and Suse 8.2.  We have also been able to reproduce this using both the provided MAX binaries and the Source Distro.  We have compiled the source distro with gcc 3.2 and 3.3 with the same results.  We are currently using the compiled source from cvs with the following options:
./configure --prefix=/usr/local/mysql --enable-assembler
--with-mysqld-ldflags=-rdynamic --datadir=/data/mysql --with-mysqld-use

Since the problem occurs and is fixed by downing mysqld, i have been adjusting the memory setting in hopes to stave off the problem.  This has not worked.  

Here is the progression of how I have resolved the function curroption ( if you can call downing mysqld resolving ):

#1 : Function fails.

#2 : DROP FUNCTION convertxp

MySQL said:

#1128 - Function '@yŒ	' is not defined

#3 checking the table results in an ok.

CHECK TABLE `func`
Table 	           Op 	Msg_type 	Msg_text
mysql.func 	check 	status      	OK

#4 select * from func returns convertxp as valid function.  

#5 if I delete from func where name = 'convertxp', then recreate the function
everything works again... or if I bring down mysqld and restart everything is working again.

FYI: I ammended this bug onto another bug (Bug #1403) I opened, but the original bug (crashing mysqld) seems to have cleared up after upgrading linux.

How to repeat:
Fails after several thousand calls to a complex UDF.  Depending on the time of day this could be several hours between problems.  We have only experienced this making calls from the MYsql++ interface so far, but our use of these functions elsewhere is limited.
[30 Sep 2003 15:50] Matthew Yonkovit
Ok, I figured out what I think is going on.  UDF's are not thread safe.  99 times out of 100 the functions are returning data so quickly it does not matter, but their is that chance that it will be active at the same time.  Our app calls the UDF functions hudnred of thousands of times over an hour.  Our UI also hits against the same function.  Once and a while they colide.  

This can be tested easily enough on 4.1.x.  
Any of our functions cause the problem, here is one:

char *poslookup(UDF_INIT *initid, UDF_ARGS *args, char *result,
               unsigned long *length, char *is_null, char *error);

char *poslookup(UDF_INIT *initid, UDF_ARGS *args, char *result,
               unsigned long *length, char *is_null, char *error)
{
  int val;
  switch (args->arg_type[0]) {
     case STRING_RESULT:
        *length = 0;
        return result;
     break;
     case INT_RESULT:
        val = *((longlong*) args->args[0]);
     break;
     case REAL_RESULT:
        val = (longlong) *((double*) args->args[0]);
     break;
  }
  *length = 0;
  switch(val)
  {
    case 1:
      strcpy(result,"P");
      *length = 1;
    break;
    case 2:
      strcpy(result,"C");
      *length = 1;
    break;
    case 3:
      strcpy(result,"1B");
      *length = 2;
    break;
    case 4:
      strcpy(result,"2B");
      *length = 2;
    break;
    case 5:
      strcpy(result,"3B");
      *length = 2;
    break;
    case 6:
      strcpy(result,"SS");
      *length = 2;
    break;
    case 7:
      strcpy(result,"LF");
      *length = 2;
    break;
    case 8:
      strcpy(result,"CF");
      *length = 2;
    break;
    case 9:
      strcpy(result,"RF");
      *length = 2;
    break;
    default:
      strcpy(result,"DH");
      *length = 2;
    break;
  }
  return result;
}

Then create a sql script that call select poslookup(2); something like 50,000 times.  Open 2 different client sessions and run.  If you do this two or three times this will eventually break ( this is what I have seen anyway ).  You should be able to do this with pretty much any function out their.  I could see this being a huge bug is any one is trying to multithread any huge number crunching apps that take advantage of UDF's.
[30 Sep 2003 15:56] Matthew Yonkovit
By the way, did something change in the way that UDF's are being handled?  I checked in the changelog and did not see anything recent. Prior to 4.1 these seemed to work fine, or maybe we just got very lucky.
[6 Oct 2003 1:42] Sergei Golubchik
do you have SMP box ?

I wasn't able to repeat the problem on my single-cpu FreeBSD box.

please provide more information about your environment for us to try
it on something similar.
[6 Oct 2003 5:11] Matthew Yonkovit
- Dual 1GHZ AMD Athlons (SMP) running on an asus A7M266-D
- 768MB of Ram
- occured with both SUSE 8.2 and Redhat 8.0, I also can repeat it by making calls from mysql++ api, the mysql client, and through php....
[17 Oct 2003 3:04] Sergei Golubchik
Indrek, could you please test it on one of our SMP boxes ?
[26 Feb 2004 15:52] MySQL Verification Team
According our talk in IRC.

Thanks.
[16 Mar 2004 12:32] Timothy Smith
I verified this on our dual proc Opteron machine, using 4.1.1-alpha.  It can easily be repeated by:

melody> export LD_LIBRARY_PATH=$HOME/tsmith/udf
melody> cd ~/tsmith/41
melody> gdb --args bin/mysqld --no-defaults --big-tables --socket=mysql.sock  --skip-networking --skip-innodb --basedir=$PWD
...
(gdb) run

and in another terminal:

melody> cd ~/tsmith/41
melody> for i in 1 2 3 4 5; do (mysql -S data/mysql.sock test < skr.sql > /dev/null &); done
ERROR 1064 at line 18: You have an error in your SQL syntax.  Check the manual that corresponds to your MySQL server version for the right syntax to use near '(2), poslookup(2), poslookup(2), poslookup(2), poslookup(2), p

Trying to drop the function gives an error, but the function works when you restart mysqld.

I have checked that this does not fail under 4.0.18.
[18 Mar 2004 22:48] Sergei Glukhov
Fix on dev-bugs
Subj: Fix for bug #1439 UDF Corruption(and some code cleanup)
Date: 2004-03-19
[19 Mar 2004 2:38] Sergei Glukhov
Fixed in source tree