Bug #77067 problem with char(160) in functions concat and concat_ws under Version 5.6.23
Submitted: 18 May 2015 9:51 Modified: 26 May 2015 15:01
Reporter: Hartmut Mehnert Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Prepared statements Severity:S1 (Critical)
Version:5.6.23 OS:Other (debian-linux)
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: WBBugReporter

[18 May 2015 9:51] Hartmut Mehnert
Description:
I use some queries, views, functions and prcedures with char(160) in functions concat() or concat_ws(). Under MySQL Server 5.5 with default charset latin1 it works sucessfull.
Under MySQL Server 5.6.23 with default charset utf8 concat() and concat_ws() with char(160) in functions and procedures are not sucessfull. The resultstring is truncatet at the position of char(160) or char(160) was ignorred. Other characters as char(9) or char(8199) are sucessfull. In queries char(160) is also sucessfull.

It is very importend vor us because wie wont to change our produktion system from Version 5.5 to 5.6 and wie use char(160) inter alia in a function to group IBAN with non-break spaces (DE83207500000000735407 --> DE83 2075 0000 0000 7354 07).

How to repeat:
Example:

DELIMITER ///
DROP FUNCTION IF EXISTS CHAR160_test///
CREATE DEFINER=`h.mehnert`@`%` FUNCTION `CHAR160_test`(IN_chars varchar(3)) RETURNS text CHARSET utf8
BEGIN
	declare proz_return TEXT;

	select concat_ws('', 'begin', IN_chars, 'end') into proz_return;

RETURN proz_return;
END
///

All results in viewer:

Server 5.6.23:

select CHAR160_test(char(9)); --> 'begin	end'
select CHAR160_test(char(32)); --> 'begin end'
select CHAR160_test(char(160)); --> 'beginend'
select CHAR160_test(char(8199)); --> 'begin BELend'
select concat_ws('','begin',char(160),'end'); --> 'begin end'

Server 5.5.37:

select CHAR160_test(char(9)); --> 'begin	end'
select CHAR160_test(char(32)); --> 'begin end'
select CHAR160_test(char(160)); --> 'begin end'
select CHAR160_test(char(8199)); --> 'begin BELend'
select concat_ws('','begin',char(160),'end'); --> ''begin end''
[18 May 2015 11:49] Peter Laursen
The diferent chrset setting may make it necessary to execute "SET NAMES latin1" (if the client s designed for latin1/ANSI) in order to make the server return data encoded in such a way that the client understands it.  

char(160) returns a one-byte result in latin1 and a two-byte result in utf8.
 
Now if you wrap your query in a HEX() like "SELECT HEX(CONCAT_WS(...));" you will easier be able to see what happens and where it happens. This will probably return identical results on 5.5 and 5.6 as it will only return characters (0..9,A..F) that have the same encoding in latin1/ANSI and utf8. If it does (as I expect) it is all a client-side problem you are facing (client is designed for one specific charset). Solution is to SET NAMES.

-- Peter
-- not a MySQL/Oracle person
[19 May 2015 10:13] Hartmut Mehnert
Hallo Peter, many thanks for your answer.

My problem with char(160) ocurs not in queries in editor, but if i use it in procedures or functions with parameters or variables defined as varchar or text. Only char(160) is not accepted for varchar-parameter, but for varbinary. Even char(8199) with four bytes is accepted for varchar and works rigth.
In version 5.5 this trouble not occurs but in 5.6.
It seams 5.5 convert char(160) with collation binary automatical to varchar on the parameter with the standardcollation latin1_swedish_ci.
5.6 convert also char(...) to varchar with standardcollation utf8_general_ci, only char(160) not.
[19 May 2015 12:00] MySQL Verification Team
Hi Hartmut,

Are you saying that you have problem with CHAR(160) but not with CHAR(159) nor CHAR(161) ?

all best
Bogdan Kecman
[19 May 2015 13:25] Hartmut Mehnert
I have tested that the problem is with all CHAR(...) from 128 to 255.
Meanwhile i found that i have to use CHAR(49824) in CHARSET utf8 for non-break space instead CHAR(160). In utf8 CHAR(160) is not a allowed character.
The conversion from 5.5 latin1 to 5.6 utf8 has changed all CHAR(160) #A0 in tablecontents to CHAR(49824) #C2A0.
I have to change the code in my functions and procedures.
[26 May 2015 14:43] MySQL Verification Team
[arhimed@gedora sandboxes]$ cat b77067.sql ; ./use_all "source /home/arhimed/sandboxes/b77067.sql"
USE test;
set names 'utf8';

DELIMITER ///
DROP FUNCTION IF EXISTS CHAR160_test///
CREATE FUNCTION `CHAR160_test`(IN_chars varchar(3)) RETURNS text CHARSET utf8
BEGIN
        declare proz_return TEXT;

        select concat_ws('', '>', IN_chars, '<') into proz_return;

RETURN proz_return;
END
///

DELIMITER ;
select @@version;
select CHAR160_test( char( 9 ) );
select CHAR160_test( char( 32 ) );
select CHAR160_test( char( 160 ) );
select HEX( CHAR160_test( char( 160 ) ));
select CHAR160_test( char( 8199 ) );
select concat_ws('','>', char( 160 ), '<');
select HEX(concat_ws('','>', char( 160 ), '<'));
select HEX(CHAR(160)), HEX('>'), HEX('<');

@@version
5.1.73
CHAR160_test( char( 9 ) )
>\t<
CHAR160_test( char( 32 ) )
> <
CHAR160_test( char( 160 ) )
> <
HEX( CHAR160_test( char( 160 ) ))
3EC2A03C
CHAR160_test( char( 8199 ) )
> <
concat_ws('','>', char( 160 ), '<')
>▒<
HEX(concat_ws('','>', char( 160 ), '<'))
3EA03C
HEX(CHAR(160))  HEX('>')        HEX('<')
A0      3E      3C

@@version
5.5.43-log
CHAR160_test( char( 9 ) )
>\t<
CHAR160_test( char( 32 ) )
> <
CHAR160_test( char( 160 ) )
> <
HEX( CHAR160_test( char( 160 ) ))
3EC2A03C
CHAR160_test( char( 8199 ) )
> <
concat_ws('','>', char( 160 ), '<')
>▒<
HEX(concat_ws('','>', char( 160 ), '<'))
3EA03C
HEX(CHAR(160))  HEX('>')        HEX('<')
A0      3E      3C

@@version
5.6.24
CHAR160_test( char( 9 ) )
>\t<
CHAR160_test( char( 32 ) )
> <
CHAR160_test( char( 160 ) )
> <
HEX( CHAR160_test( char( 160 ) ))
3EC2A03C
CHAR160_test( char( 8199 ) )
> <
concat_ws('','>', char( 160 ), '<')
>▒<
HEX(concat_ws('','>', char( 160 ), '<'))
3EA03C
HEX(CHAR(160))  HEX('>')        HEX('<')
A0      3E      3C

@@version
5.7.7-rc
CHAR160_test( char( 9 ) )
>\t<
CHAR160_test( char( 32 ) )
> <
CHAR160_test( char( 160 ) )
> <
HEX( CHAR160_test( char( 160 ) ))
3EC2A03C
CHAR160_test( char( 8199 ) )
> <
concat_ws('','>', char( 160 ), '<')
>▒<
HEX(concat_ws('','>', char( 160 ), '<'))
3EA03C
HEX(CHAR(160))  HEX('>')        HEX('<')
A0      3E      3C
[arhimed@gedora sandboxes]$
[26 May 2015 15:01] MySQL Verification Team
Hi,

U+00A0 (non breakable space) character as UTF8 is encoded as C2A0. 
The CHAR(A0) itself is not a valid UTF8 character.
Your function returns UTF8 so it has to return a valid UTF8 result.

If you look results of your function when you send CHAR(A0) in all latest releases of 5.1, 5.5, 5.6, 5.7 output is properly C2A0. How are you showing this data in your mysql client depends on the way you use your client and character set of your OS. If you use linux:

# set your terminal to be UTF8
$ export LANG=en_US.UTF-8
# start mysql
$ mysql test
> -- tell mysql to use utf8 to communicate with you
> set names 'utf8';
> -- execute your queries:
> select HEX( CHAR160_test( char( 160 ) ));
+-----------------------------------+
| HEX( CHAR160_test( char( 160 ) )) |
+-----------------------------------+
| 3EC2A03C                          |
+-----------------------------------+
1 row in set (0.00 sec)

> select CHAR160_test( char( 160 ) );
+-----------------------------+
| CHAR160_test( char( 160 ) ) |
+-----------------------------+
| > <                         |
+-----------------------------+
1 row in set (0.00 sec)