Bug #53088 mysqldump with -T & --default-character-set set truncates text/blob to 766 chars
Submitted: 22 Apr 2010 22:07 Modified: 14 Oct 2010 14:48
Reporter: Ryan Mack Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: mysqldump Command-line Client Severity:S2 (Serious)
Version:5.1.45 OS:Any
Assigned to: Gleb Shchepa CPU Architecture:Any

[22 Apr 2010 22:07] Ryan Mack
Description:
The change to mysqldump to support the --default-character-set option (http://bugs.mysql.com/bug.php?id=30946) truncates all converted text or blob columns to MAX_FIELD_WIDTH (766) bytes.  I am not sure whether it is desirable to convert binary format columns to the default charset at all.

I consider this a very serious bug because customers depending on mysqldump for backups may be generating incomplete backups.

How to repeat:
CREATE TABLE t1 (a BLOB) CHARSET latin1;

CREATE TABLE t2 LIKE t1;

INSERT INTO t1 VALUES ('<800 characters>');

SELECT * INTO OUTFILE '<file>' /*!50138 CHARACTER SET latin1 */ FROM t1;

LOAD DATA INFILE '<file>' INTO TABLE t2;

SELECT * FROM t1;

SELECT * FROM t2;

Suggested fix:
Proper fix:  Ignore blob fields inside select_export::send_data() conversion code.  Allocate sufficient space for entire text fields or rewrite code to work on long text fields in pieces.

Quick fix: Rollback commit 79790.
[22 Apr 2010 22:16] Ryan Mack
Sorry, to clarify, the truncation only occurs when using --default-character-set along with tabular output.
[23 Apr 2010 0:17] Ryan Mack
Here is a patch that includes a test case and one possible fix.  This skips conversions on blobs and handles large text fields.

Attachment: bug53088-2.patch (application/octet-stream, text), 16.51 KiB.

[23 Apr 2010 12:26] MySQL Verification Team
Verified as described with 5.1.40sp1 on x86_64.
[27 Apr 2010 16:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106707

3359 Gleb Shchepa	2010-04-27
      Bug #53088: mysqldump with -T & --default-character-set set 
                  truncates text/blob to 766 chars 
       
      mysqldump and SELECT ... INTO OUTFILE truncated long BLOB/TEXT 
      values to size of 766 bytes (MAX_FIELD_WIDTH or 255 * 3 + 1). 
                                                                                                           
      The select_export::send_data method has been modified to                                             
      reallocate a conversion buffer for long field data.    
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set                                          
                    truncates text/blob to 766 chars                                                         
                                                                                                             
        The select_export::send_data method has been modified to                                             
        reallocate a conversion buffer for long field data.
[28 Apr 2010 8:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106759

3359 Gleb Shchepa	2010-04-28
      Bug #53088: mysqldump with -T & --default-character-set set
                  truncates text/blob to 766 chars
      
      mysqldump and SELECT ... INTO OUTFILE silently truncated
      long BLOB/TEXT values to the size of 766 bytes
      (MAX_FIELD_WIDTH or 255 * 3 + 1).
      
      The select_export::send_data method has been modified to
      reallocate a conversion buffer for long field data.
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set
                    truncates text/blob to 766 chars
        
        The select_export::send_data method has been modified to
        reallocate a conversion buffer for long field data.
[28 Apr 2010 8:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106766

3359 Gleb Shchepa	2010-04-28
      Bug #53088: mysqldump with -T & --default-character-set set
                  truncates text/blob to 766 chars
      
      mysqldump and SELECT ... INTO OUTFILE silently truncated
      long BLOB/TEXT values to the size of 766 bytes
      (MAX_FIELD_WIDTH or 255 * 3 + 1).
      
      The select_export::send_data method has been modified to
      reallocate a conversion buffer for long field data.
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set
                    truncates text/blob to 766 chars
        
        The select_export::send_data method has been modified to
        reallocate a conversion buffer for long field data.
[28 Apr 2010 19:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106866

3359 Gleb Shchepa	2010-04-28
      Bug #53088: mysqldump with -T & --default-character-set set
                  truncates text/blob to 766 chars
      
      mysqldump and SELECT ... INTO OUTFILE silently truncated
      long BLOB/TEXT values to the size of 766 bytes
      (MAX_FIELD_WIDTH or 255 * 3 + 1).
      
      The select_export::send_data method has been modified to
      reallocate a conversion buffer for long field data.
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set
                    truncates text/blob to 766 chars
        
        The select_export::send_data method has been modified to
        reallocate a conversion buffer for long field data.
[29 Apr 2010 14:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106951

3359 Gleb Shchepa	2010-04-29
      Bug #53088: mysqldump with -T & --default-character-set set
                  truncates text/blob to 766 chars
      
      mysqldump and SELECT ... INTO OUTFILE silently truncated
      long BLOB/TEXT values to the size of 766 bytes
      (MAX_FIELD_WIDTH or 255 * 3 + 1).
      
      The select_export::send_data method has been modified to
      reallocate a conversion buffer for long field data.
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set
                    truncates text/blob to 766 chars
        
        The select_export::send_data method has been modified to
        reallocate a conversion buffer for long field data.
[29 Apr 2010 14:19] Alexander Barkov
http://lists.mysql.com/commits/106951 is Ok to push.
[29 Apr 2010 15:36] Ryan Mack
If someone has the time, could they explain to me what the effect of converting a binary charset column to any other charset would be and why it would be desirable?  We run some of our backups with -T --default-character-set=latin1 and I am wondering if the conversion form binary to latin1 may cause corruption when the data is reloaded.  In my local patch I explicitly made binary->non-binary charset conversion a NOOP to better match the pre-5.1.31 behavior, but that doesn't appear to be part of the official patch.  I am reluctant to convert to the official patch without understanding the behavior better.

Thank you, Ryan
[29 Apr 2010 17:56] Gleb Shchepa
Ryan,

> [29 Apr 17:36] Ryan Mack

> If someone has the time, could they explain to me what
> the effect of converting a binary charset column to any
> other charset would be and why it would be desirable?

Yes, usually conversion of binary data is unnecessary,
however if we try to save binary column into a file
of the UCS2, UTF16 or UTF32 charset (minimal character
lengths are 2, 2 and 4 respectively), we have to do
some padding, otherwise the resulting output may be
unreadable.

MySQL 5.1 doesn't support OUTFILE/mysqldump output to these
charsets, this conversion is for future use.
In any case, it doesn't affect dumping to the latin1 charset.
[6 May 2010 9:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107614

3359 Gleb Shchepa	2010-05-06
      Bug #53088: mysqldump with -T & --default-character-set set
                  truncates text/blob to 766 chars
      
      mysqldump and SELECT ... INTO OUTFILE silently truncated
      long BLOB/TEXT values to the size of 766 bytes
      (MAX_FIELD_WIDTH or 255 * 3 + 1).
      
      The select_export::send_data method has been modified to
      reallocate a conversion buffer for long field data.
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set
                    truncates text/blob to 766 chars
        
        The select_export::send_data method has been modified to
        reallocate a conversion buffer for long field data.
[6 May 2010 20:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/107691

3371 Gleb Shchepa	2010-05-07
      Bug #53088: mysqldump with -T & --default-character-set set
                  truncates text/blob to 766 chars
      
      mysqldump and SELECT ... INTO OUTFILE truncated long BLOB/TEXT
      values to size of 766 bytes (MAX_FIELD_WIDTH or 255 * 3 + 1).
      
      The select_export::send_data method has been modified to
      reallocate a conversion buffer for long field data.
     @ mysql-test/r/mysqldump.result
        Test case for bug #53088.
     @ mysql-test/r/outfile_loaddata.result
        Test case for bug #53088.
     @ mysql-test/t/mysqldump.test
        Test case for bug #53088.
     @ mysql-test/t/outfile_loaddata.test
        Test case for bug #53088.
     @ sql/sql_class.cc
        Bug #53088: mysqldump with -T & --default-character-set set
                    truncates text/blob to 766 chars
        
        The select_export::send_data method has been modified to
        reallocate a conversion buffer for long field data.
[28 May 2010 5:53] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:alik@sun.com-20100512070920-xgpmqeytp0gc183c) (pib:16)
[28 May 2010 6:22] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:alik@sun.com-20100514054548-91z72f0mcskr84kj) (merge vers: 6.0.14-alpha) (pib:16)
[28 May 2010 6:50] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:alexey.kopytov@sun.com-20100508220335-xsvmtj21h4yeu8mf) (merge vers: 5.5.5-m3) (pib:16)
[2 Jun 2010 8:50] Bugs System
Pushed into 5.1.48 (revid:georgi.kodinov@oracle.com-20100602084411-2yu607bslbmgufl3) (version source revid:martin.hansson@sun.com-20100507071216-mxvljum43kqu3q0u) (merge vers: 5.1.47) (pib:16)
[4 Jun 2010 1:47] Paul DuBois
Noted in 5.1.48, 5.5.5, 6.0.14 changelogs.
[14 Oct 2010 8:28] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:vasil.dimov@oracle.com-20100513074652-0cvlhgkesgbb2bfh) (merge vers: 5.5.5-m3) (pib:21)
[14 Oct 2010 8:42] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:vasil.dimov@oracle.com-20100513074652-0cvlhgkesgbb2bfh) (merge vers: 5.5.5-m3) (pib:21)
[14 Oct 2010 8:57] Bugs System
Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:vasil.dimov@oracle.com-20100513074652-0cvlhgkesgbb2bfh) (merge vers: 5.5.5-m3) (pib:21)
[14 Oct 2010 14:48] Jon Stephens
Already documented in the 5.1.48 changelog; no new changelog entries required. setting back to Closed state.