Bug #19745 mysqldump --xml produces invalid xml
Submitted: 11 May 2006 22:09 Modified: 13 Nov 2006 4:37
Reporter: Torrey Hoffman
Status: Closed
Category: MySQL Server Severity: S2 (Serious)
Version: 5.0.15 OS: Linux (Linux)
Assigned to: Iggy Galarza CPU Architecture: Any

[11 May 2006 22:09] Torrey Hoffman
Description:
mysqldump --xml produces invalid XML when the data contains blobs. It should respect the --hex-blob switch, but the switch is ignored.

Note that in the mysqldump output shown in the steps to reproduce, the data which appears as ??? in the text is actually the raw bytes 0xff00fef0.  This is not valid XML, and instead it should be dumped as a string containing "0xff00fef0" as the --hex-blob option specifies.

This is a serious problem, because (as you can see from the attempt to load the results into xmldiff) this invalid XML will be rejected by XML parsers, or even cause crashes.  So mysqldump --xml cannot be used if the database contains blobs.
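
The difference between the two outputs can be checked with a short Python sketch. It is only an illustration: the `field_xml` and `is_well_formed` helpers are hypothetical names, the element is a stripped-down stand-in for the real dump, and Python's standard `xml.etree.ElementTree` parser is used here in place of xmldiff.

```python
import xml.etree.ElementTree as ET

RAW_BLOB = bytes.fromhex("ff00fef0")  # the blob value inserted in the report


def field_xml(payload: bytes) -> bytes:
    """Wrap payload bytes in a minimal <field> element (illustrative only)."""
    return b'<?xml version="1.0"?>\n<field name="data">' + payload + b"</field>"


def is_well_formed(document: bytes) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Raw bytes written straight into the element, as in the reported output.
raw_doc = field_xml(RAW_BLOB)
# The same value written as the text "0xff00fef0", as the report requests.
hex_doc = field_xml(b"0x" + RAW_BLOB.hex().encode("ascii"))

print(is_well_formed(raw_doc))  # False
print(is_well_formed(hex_doc))  # True
```

The raw variant is rejected because no encoding is declared, so the parser assumes UTF-8, and 0xFF can never appear in a UTF-8 stream.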

How to repeat:

mysql> create table test (id int(10), data MEDIUMBLOB);
Query OK, 0 rows affected (0.40 sec)

mysql> insert into test VALUES(1,0xff00fef0);
Query OK, 1 row affected (0.24 sec)

mysql> select * from test;
+------+------+
| id   | data |
+------+------+
|    1 | ?    |
+------+------+
1 row in set (0.00 sec)

mysql> Bye

snitch:~ tommy$ mysqldump --xml --hex-blob test test
<?xml version="1.0"?>
<mysqldump xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<database name="test">
         <table_structure name="test">
                 <field Field="id" Type="int(10)" Null="YES" Key="" Extra="" />
                 <field Field="data" Type="mediumblob" Null="YES" Key="" Extra="" />
                 <options Name="test" Engine="MyISAM" Version="9" Row_format="Dynamic" Rows="1" Avg_row_length="20" Data_length="20" Max_data_length="4294967295" Index_length="1024" Data_free="0" Create_time="2006-05-11 10:41:00" Update_time="2006-05-11 10:41:44" Collation="latin1_swedish_ci" Create_options="" Comment="" />
         </table_structure>
         <table_data name="test">
         <row>
                 <field name="id">1</field>
                 <field name="data">???</field>
         </row>
         </table_data>
</database>
</mysqldump>

snitch:~ tommy$ mysqldump --xml --hex-blob test test > results

snitch:~ tommy$ xmldiff load results

xmldiff v0.2.5 - (c) 2004 - Remi Peyronnet - http://www.via.ecp.fr/~remi/
XML Engine initialized.
terminate called after throwing an instance of 'XD_Exception'
   what():  results:12 : Input is not proper UTF-8, indicate encoding !
Bytes: 0xFF 0x00 0xFE 0xF0

Loading results as results... Abort trap

Suggested fix:
Have mysqldump --xml --hex-blob dump the contents of blobs as hex as described in the documentation, rather than raw bytes.
[12 May 2006 13:07] Valeriy Kravchuk
Thank you for the problem report. Please try to reproduce it with a newer version, 5.0.21, and let us know the results.
[12 Jun 2006 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[27 Jun 2006 12:52] Sergei Golubchik
I agree that the output is not well-formed XML.
But I'm not sure that it should respect the --hex-blob switch.
Perhaps a more XML-correct solution would be to use XML encoding for otherwise invalid bytes.
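
One caveat can be sketched in Python, assuming "xml encoding" here means numeric character references (that is only one possible reading of the comment). XML 1.0 does not allow a reference to character 0, so the 0x00 byte in the sample blob could not be written this way. The `parses` helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if the XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A character reference for code point 0xFF is legal XML 1.0 ...
print(parses("<field>&#255;</field>"))  # True
# ... but a reference to code point 0 is not, so NUL cannot be escaped.
print(parses("<field>&#0;</field>"))    # False
```

An encoding that maps every byte to printable characters, such as hex, avoids this limit.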
[18 Oct 2006 22:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/13921

ChangeSet@1.2295, 2006-10-18 18:43:51-04:00, iggy@rolltop.ignatz42.dyndns.org +3 -0
  Bug#19745: mysqldump --xml produces invalid xml
  
  The mysqldump command with both the --xml and --hex-blob options will output blob data encoded as hexBinary.  
  The proper XML datatype is xs:hexBinary.  
  The correct XML datatype is specified by setting the xsi_type attribute equal to xs:hexBinary for each encoded element.
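
A consumer of such a dump could turn the hexBinary fields back into bytes along these lines. This is a minimal sketch, not taken from the patch: the `SAMPLE` document is hand-written, it assumes the attribute appears in the output as `xsi:type="xs:hexBinary"` bound to the `xsi` namespace declared on the root element, and `decode_rows` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xsi: prefix to the namespace URI declared on the root.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hand-written sample; the attribute spelling is an assumption, not patch output.
SAMPLE = """<?xml version="1.0"?>
<mysqldump xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<database name="test">
  <table_data name="test">
  <row>
    <field name="id">1</field>
    <field name="data" xsi:type="xs:hexBinary">FF00FEF0</field>
  </row>
  </table_data>
</database>
</mysqldump>
"""


def decode_rows(xml_text):
    """Return one dict per <row>, decoding hexBinary fields to bytes."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iter("row"):
        values = {}
        for field in row.findall("field"):
            text = field.text or ""
            if field.get(XSI_TYPE) == "xs:hexBinary":
                values[field.get("name")] = bytes.fromhex(text)
            else:
                values[field.get("name")] = text
        rows.append(values)
    return rows


print(decode_rows(SAMPLE))
```

Fields without the type attribute are left as plain text, so the non-blob columns pass through unchanged.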
[13 Nov 2006 4:37] Paul DuBois
Noted in 5.0.30 (not 5.0.29), 5.1.13 changelogs.

mysqldump --xml produced invalid XML for BLOB data.
[26 Mar 2008 11:38] George Lund
I wonder if there has been a regression because I am seeing this bug using mysqldump 5.0.33. Let me know if I should submit a new bug report, thanks...