Bug #5629 Collation information for columns not passed into UDF
Submitted: 17 Sep 2004 12:42
Modified: 20 Sep 2004 9:23
Reporter: Jose Miguel Pérez Ruiz
Status: Verified
Category: MySQL Server: User-defined functions (UDF)
Severity: S4 (Feature request)
Version: 4.1.4-gamma
OS: Linux (Red Hat 8.0)
Assigned to:
CPU Architecture: Any

[17 Sep 2004 12:42] Jose Miguel Pérez Ruiz
Description:
Collation information (CHARSET_INFO) is not passed through to user-defined functions.
This is important, since some UDFs may need to sort data, not only do calculations, or take some action depending on the charset.
The need for sorting is most evident in aggregate UDFs. For example, think of a clone of the MAX() aggregate function for strings.

How to repeat:
for (;;) read_description_again();

;-) Seriously, one can try to use the sortcmp function (which requires a CHARSET_INFO * as third argument) inside a UDF. Since the UDF has no way to learn the collation of its arguments, you must fall back to a global default collation.
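
To illustrate why a single global collation is not enough, here is a minimal, self-contained sketch. It does not use the server headers: DEMO_COLLATION and demo_sortcmp are hypothetical stand-ins for CHARSET_INFO and sortcmp, invented only for this example, and they handle ASCII only. The point is that the same two strings order differently under a binary and a case-insensitive collation, so a UDF that always compares with one default can disagree with the column's own ordering.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for CHARSET_INFO; the real server struct is far richer. */
typedef struct demo_collation {
  const char *name;
  int case_insensitive;   /* 1: fold ASCII case before comparing */
} DEMO_COLLATION;

static const DEMO_COLLATION demo_bin = { "demo_bin", 0 };
static const DEMO_COLLATION demo_ci  = { "demo_ci", 1 };

/* Hypothetical helper in the spirit of sortcmp(): returns <0, 0 or >0. */
static int demo_sortcmp(const char *a, size_t a_len,
                        const char *b, size_t b_len,
                        const DEMO_COLLATION *cs)
{
  size_t n = a_len < b_len ? a_len : b_len;
  assert(cs != NULL);
  for (size_t i = 0; i < n; i++) {
    int ca = (unsigned char) a[i];
    int cb = (unsigned char) b[i];
    if (cs->case_insensitive) {
      ca = toupper(ca);
      cb = toupper(cb);
    }
    if (ca != cb)
      return ca < cb ? -1 : 1;
  }
  /* Common prefix is equal: the shorter string sorts first. */
  return (a_len > b_len) - (a_len < b_len);
}
```

With these stand-ins, "apple" sorts after "Banana" under demo_bin (byte values) but before it under demo_ci, so the result of a MAX()-like UDF depends on which collation it is forced to assume.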

Suggested fix:
IF: there is already a way to obtain the CHARSET_INFO pointer for each column, then the documentation must be updated, and the example UDF revised to show it.

ELSE: I can think of a possible fix for this. The best place (I think) for this information is the "xxx_init" call of the UDF. It receives a UDF_ARGS parameter which holds the information about the arguments. This structure could be extended like this:

--- Extract from "mysql_com.h" ----
typedef struct st_udf_args
{
  unsigned int arg_count;               /* Number of arguments */
  enum Item_result *arg_type;           /* Pointer to item_results */
  char **args;                          /* Pointer to argument */
  unsigned long *lengths;               /* Length of string arguments */
  char *maybe_null;                     /* Set to 1 for all maybe_null args */
  CHARSET_INFO **collation;             /* Collation information for args */
} UDF_ARGS;

If the argument is not a table column, the collation information can be a default one. If the argument is a table column, the collation information will be that of the column. This way, the UDF can store the collation information it receives and act upon it, as shown:

    sortcmp( string1, string2, collation_for_this_argument );
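
As a rough sketch of how an aggregate UDF could use such a field, the following self-contained example implements a MAX()-like function for strings. Everything in it is an assumption made for illustration: SKETCH_CHARSET and SKETCH_UDF_ARGS are simplified stand-ins for CHARSET_INFO and the extended UDF_ARGS, the comparison functions handle ASCII only, and strmax_init and strmax_add use simplified parameters rather than the real UDF entry point signatures.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CHARSET_INFO: a name plus a comparison routine. */
typedef struct sketch_charset {
  const char *name;
  int (*compare)(const char *a, size_t a_len, const char *b, size_t b_len);
} SKETCH_CHARSET;

/* Simplified stand-in for UDF_ARGS with the proposed collation member. */
typedef struct sketch_udf_args {
  unsigned int arg_count;
  char **args;
  unsigned long *lengths;
  const SKETCH_CHARSET **collation;   /* proposed: one entry per argument */
} SKETCH_UDF_ARGS;

/* Per-group state of the aggregate. */
typedef struct strmax_state {
  const SKETCH_CHARSET *cs;           /* collation captured at init time */
  char best[256];
  unsigned long best_len;
  int has_value;
} STRMAX_STATE;

static int cmp_fold(const char *a, size_t al, const char *b, size_t bl, int fold)
{
  size_t n = al < bl ? al : bl;
  for (size_t i = 0; i < n; i++) {
    int ca = (unsigned char) a[i], cb = (unsigned char) b[i];
    if (fold) { ca = toupper(ca); cb = toupper(cb); }
    if (ca != cb) return ca < cb ? -1 : 1;
  }
  return (al > bl) - (al < bl);
}
static int cmp_bin(const char *a, size_t al, const char *b, size_t bl)
{ return cmp_fold(a, al, b, bl, 0); }
static int cmp_ci(const char *a, size_t al, const char *b, size_t bl)
{ return cmp_fold(a, al, b, bl, 1); }

static const SKETCH_CHARSET sketch_bin = { "sketch_bin", cmp_bin };
static const SKETCH_CHARSET sketch_ci  = { "sketch_ci",  cmp_ci };

/* Remember the collation of the single argument; returns 0 on success. */
int strmax_init(STRMAX_STATE *st, const SKETCH_UDF_ARGS *args)
{
  assert(st != NULL && args != NULL);
  if (args->arg_count != 1 || args->collation[0] == NULL)
    return 1;
  st->cs = args->collation[0];
  st->best_len = 0;
  st->has_value = 0;
  return 0;
}

/* Called once per row: keep the value that sorts last under st->cs. */
void strmax_add(STRMAX_STATE *st, const SKETCH_UDF_ARGS *args)
{
  const char *val = args->args[0];
  unsigned long len = args->lengths[0];
  if (val == NULL || len > sizeof st->best)
    return;                           /* skip NULLs and oversized values */
  if (!st->has_value ||
      st->cs->compare(val, len, st->best, st->best_len) > 0) {
    memcpy(st->best, val, len);
    st->best_len = len;
    st->has_value = 1;
  }
}
```

Fed the rows "apple" and "Banana", this sketch keeps "Banana" when the argument carries sketch_ci and "apple" when it carries sketch_bin, which is the per-column behaviour the proposed member is meant to enable.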