| Bug #23637 | Charset and collation of the function SPACE() | ||
|---|---|---|---|
| Submitted: | 25 Oct 2006 17:21 | Modified: | 11 May 2012 17:30 | 
| Reporter: | Marc ALFF | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) | 
| Version: | 5.0 | OS: | Any | 
| Assigned to: | Alexander Barkov | CPU Architecture: | Any | 
   [26 Oct 2006 8:14]
   Sveta Smirnova        
  Thank you for the report. Verified as described on Linux using last BK sources.
   [2 Apr 2007 17:10]
   Marc ALFF        
  See related (?) bug#27580
   [20 Feb 2008 19:35]
   Omer Barnir        
  workaround: use explicit collation on the results
   [11 May 2012 17:30]
   Paul DuBois        
  Noted in 5.6.4 changelog. Collation for the SPACE() function was determined by the parse time value of the collation_connection system variable (instead of the runtime value), which could give unexpected results from prepared statements, triggers, and stored procedures.

Description: The charset and collation of the string returned by the function SPACE() is not deterministic when this function is used inside a stored procedure, a stored function or a trigger. While similar issues exist with strings, this report is entered to cover SPACE() specifically, because of the current implementation of create_func_space() depends at *parsing* time on current_thd->variables.collation_connection This is considered a "design defect", see the suggested fix. How to repeat: drop table if exists t1; create table t1(a int); set names 'UTF8'; delimiter |; create trigger t1_b1 before insert on t1 for each row begin set @a = collation(space(2)); set @b = collation(" "); end| delimiter ;| insert into t1 values (1); select @a, @b; set names 'latin1'; insert into t1 values (2); ## Reports utf8, but because it's cached in memory (see below) select @a, @b; flush tables; insert into t1 values (3); ## Reports latin1_swedish, expecting utf8 select @a, @b; Suggested fix: In create_func_space() in item_create.cc, expanding SPACE() to a *different* Item tree at parse time is a bad idea, since it prevents sharing code (the same Item tree) between threads at runtime. A dedicated Item_func_space seems better suited, with the logic that depends on the charset / collation located in the runtime implementation only (and therefore not exposed to the parser)