Bug #92491 XML export does not encoded ampersand, less-than, or double-quote characgters
Submitted: 19 Sep 2018 7:07 Modified: 19 Sep 2018 8:57
Reporter: Brad Gough Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Workbench: SQL Editor Severity:S3 (Non-critical)
Version:8.0.12, 8.0.17 OS:Windows
Assigned to: CPU Architecture:x86

[19 Sep 2018 7:07] Brad Gough
Description:
When exporting data from the result grid, special XML characters are not encoded during export.

Specifically:
& must be converted to &
< must be converted to &lt;
" must be converted to &quot; (when attributes are encoded in double-quotes)
' must be converted to &apos; (when attributes are encoded in single-quotes)
> can optionally be converted to &gt;

As a result, exported XML data cannot be loaded and processed using an XML parser when the exported data contains one or more &, <, or " characters.

How to repeat:
Step 1: Create test table using the following:

DROP TABLE IF EXISTS test;
CREATE TABLE test (test_column VARCHAR(255) NOT NULL);
INSERT INTO test 
VALUES ('& (ampersand) must be converted to \&amp;')
, ('< (less-than) must be converted to &lt;')
, ('" (double-quote) must be converted to &quot;')
, ('> (greater-than) can optionally be converted to &gt;')
, ('Attributes are encoded using double-quote therefore do not need to encode \' (apostrophe or single quote) with &apos;')

Step 2: Select all table records using: SELECT * FROM test
Step 3: In the result grid, click the export button
Step 4: Using the save as type drop down, select "XML (mysql format)"
Step 5: Save the file to any location with any name
Step 6: Open the file to see the following:
<?xml version="1.0"?>

<resultset statement="SELECT * FROM test
LIMIT 0, 50000
"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

	<row>
		<field name="test_column">& (ampersand) must be converted to &amp;</field>
	</row>

	<row>
		<field name="test_column">< (less-than) must be converted to &lt;</field>
	</row>

	<row>
		<field name="test_column">" (double-quote) must be converted to &quot;</field>
	</row>

	<row>
		<field name="test_column">> (greater-than) can optionally be converted to &gt;</field>
	</row>

	<row>
		<field name="test_column">Attributes are encoded using double-quote therefore do not need to encode ' (apostrophe or single quote) with &apos;</field>
	</row>
</resultset>

Suggested fix:
Update workbench to behave the same as mysql.exe with --xml option.

When mysql.exe is run with the --xml option:
& is converted to &amp;
< is converted to &lt;
" is converted to &quot;
> is converted to &gt;
' is NOT converted to &apos; which is not needed because attributes are enclosed in quotes.
[19 Sep 2018 8:57] MySQL Verification Team
Hello Brad Gough,

Thank you for the report!

regards,
Umesh
[25 Oct 2018 16:42] Joe Hufnagel
I've experienced the same problem, but wonder about, as an alternative to translating selected characters into entity references, wrap any text data in a CDATA block.  In my specific case, I have some results that contain embedded newlines which of course get lost without using CDATA.
[3 Oct 2019 12:00] MySQL Verification Team
Bug #96420 Marked as duplicate of this one.