Bug #80076 "SELECT... INTO OUTFILE ... CHARACTER SET ucs2" writes non-standard <eol> chars
Submitted: 20 Jan 2016 13:55
Modified: 21 Jan 2016 6:01
Reporter: Yura Sorokin (OCA)
Status: Verified
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: 5.5.47, 5.6.28, 5.7.10
OS: Any
Assigned to:
CPU Architecture: Any
Tags: ucs2

[20 Jan 2016 13:55] Yura Sorokin
Description:
"SELECT... INTO OUTFILE ... CHARACTER SET ucs2" generates a file with single-byte '0A' end-of-line characters instead of standard-conformant '00 0A' 16-bit code units.
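
To illustrate why the single-byte terminator matters, here is a minimal Python sketch. It assumes the file layout implied by this report (row data as big-endian 16-bit code units, each terminator as the lone byte 0A); the variable names are hypothetical and the bytes are constructed by hand, not read from a real server-generated file.

```python
# Assumed layout of the reported file: rows '00' and '10' encoded as
# big-endian 16-bit code units, each followed by a single 0x0A byte.
reported_bytes = (
    "00".encode("utf-16-be") + b"\n" +
    "10".encode("utf-16-be") + b"\n"
)

# A reader that trusts the declared character set decodes the file as
# 16-bit units. The lone 0x0A shifts the alignment of everything after it.
decoded = reported_bytes.decode("utf-16-be")

# Show the code units the reader actually sees.
print([hex(ord(ch)) for ch in decoded])

# No line feed (U+000A) survives decoding, so the reader sees one line.
print("\n" in decoded)
```

Under these assumptions, the second row is no longer recoverable as '10': the stray byte pairs up with the neighbouring data bytes and produces unrelated code units.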

How to repeat:
Run the following code fragment in an MTR (mysql-test-run) environment:

****************************************************************************
--let OUTFILE= $MYSQLTEST_VARDIR/tmp/out1.txt

--eval SELECT '00' UNION SELECT '10' INTO OUTFILE '$OUTFILE' CHARACTER SET ucs2

perl;
  use strict;
  use warnings;
  my $filename = $ENV{'OUTFILE'};
  my $size = -s $filename;
  print "file size: $size\n";
EOF

--remove_file $OUTFILE
****************************************************************************

Output:
****************************************************************************
SELECT '00' UNION SELECT '10' INTO OUTFILE '/home/yura/ws/mysql-build/mysql-test/var/tmp/out1.txt' CHARACTER SET ucs2;
file size: 10
****************************************************************************

The size of the output file is 10 bytes, whereas the expected size is 12 bytes:
1st line - 2 bytes for the first '0', 2 bytes for the second '0', 2 bytes for <eol>
2nd line - 2 bytes for '1', 2 bytes for '0', 2 bytes for <eol>
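
The byte arithmetic above can be checked with a short Python sketch. It is only an illustration: it builds both byte sequences by hand, assuming big-endian 16-bit code units and the row order shown in the test, and does not read the actual output file.

```python
# Rows produced by the test query, in the order listed above.
rows = ["00", "10"]

# Expected content: every character, including <eol>, is one 16-bit
# big-endian code unit, so the terminator becomes the bytes 00 0A.
expected = "".join(row + "\n" for row in rows).encode("utf-16-be")

# Content consistent with the reported size: row data is encoded as
# 16-bit units, but each <eol> is written as the single byte 0A.
observed = b"".join(row.encode("utf-16-be") + b"\n" for row in rows)

print(len(expected), expected.hex(" "))
print(len(observed), observed.hex(" "))
```

The two-byte difference matches the two line terminators, one byte missing per line.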

Although this test was run on Linux, the problem may exist on Windows as well. There, the proper line endings would be '00 0D 00 0A' rather than '0D 0A'.

Suggested fix:
"SELECT... INTO OUTFILE ... CHARACTER SET ucs2" must always write proper <eol> characters to the generated file according to the UCS2 standard.

[21 Jan 2016 6:01] Umesh Shastry
Hello Yura,

Thank you for the report.

Thanks,
Umesh