Bug #67542 LOAD XML does not handle empty elements
Submitted: 9 Nov 2012 17:39 Modified: 18 Aug 2015 16:43
Reporter: Kolbe Kegel Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.5.28, 5.7.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: XML

[9 Nov 2012 17:39] Kolbe Kegel
Description:
The LOAD XML statement is completely unable to handle empty XML elements of the form <element/> or <element />. Additionally, the behavior (very wrong in both cases) is different depending on whether there is a space between the element name and the slash.

How to repeat:
DROP TABLE IF EXISTS xml;
CREATE TABLE `xml` (
  `col1` char(3) DEFAULT NULL,
  `col2` char(3) DEFAULT NULL,
  `col3` char(3) DEFAULT NULL
);

#shell cat > /tmp/test1.xml
<row>
    <col1>abc</col1>
    <col2>def</col2>
    <col3>ghi</col3>
</row>
<row>
    <col1>jkl</col1>
    <col2 />
    <col3>mno</col3>
</row>

truncate xml;
load xml infile '/tmp/test1.xml' into table xml;
select * from xml;

#shell cat > /tmp/test2.xml
<row>
    <col1>abc</col1>
    <col2>def</col2>
    <col3>ghi</col3>
</row>
<row>
    <col1>jkl</col1>
    <col2/>
    <col3>mno</col3>
</row>

truncate xml;
load xml infile '/tmp/test2.xml' into table xml;
select * from xml;

mysql 5.5.28-log (root) [test]> DROP TABLE IF EXISTS xml;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql 5.5.28-log (root) [test]> CREATE TABLE `xml` (
    ->   `col1` char(3) DEFAULT NULL,
    ->   `col2` char(3) DEFAULT NULL,
    ->   `col3` char(3) DEFAULT NULL
    -> );
Query OK, 0 rows affected (0.06 sec)

kolbe@prosimmon ~ $ cat /tmp/test1.xml 
<row>
    <col1>abc</col1>
    <col2>def</col2>
    <col3>ghi</col3>
</row>
<row>
    <col1>jkl</col1>
    <col2 />
    <col3>mno</col3>
</row>

mysql 5.5.28-log (root) [test]> truncate xml;
Query OK, 0 rows affected (0.00 sec)

mysql 5.5.28-log (root) [test]> load xml infile '/tmp/test1.xml' into table xml;
Query OK, 2 rows affected (0.01 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql 5.5.28-log (root) [test]> select * from xml;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| abc  | def  | ghi  |
| jkl  | mno  | NULL |
+------+------+------+
2 rows in set (0.00 sec)

kolbe@prosimmon ~ $ cat /tmp/test2.xml 
<row>
    <col1>abc</col1>
    <col2>def</col2>
    <col3>ghi</col3>
</row>
<row>
    <col1>jkl</col1>
    <col2/>
    <col3>mno</col3>
</row>

mysql 5.5.28-log (root) [test]> truncate xml;
Query OK, 0 rows affected (0.00 sec)

mysql 5.5.28-log (root) [test]> load xml infile '/tmp/test2.xml' into table xml;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql 5.5.28-log (root) [test]> select * from xml;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| abc  | def  | ghi  |
| NULL | NULL | mno  |
+------+------+------+
2 rows in set (0.00 sec)

Suggested fix:
LOAD XML should correctly handle empty XML elements. It should certainly not handle them differently depending on whether there is a space between the element name and the slash.
[9 Nov 2012 18:18] Sveta Smirnova
Thank you for the report.

Verified as described.
[13 Nov 2012 18:51] Sveta Smirnova
Bug #67439 was marked as duplicate of this one.
[4 Mar 2015 9:52] Daniƫl van Eeden
This looks like a duplicate of:
Bug #66299 	Import xml with short tags into mysql
[30 Jul 2015 4:46] Jenny Bourne
For me, short tags <tagname\> (without space) only affect the preceding tags from the start of the record.

OS = Win7  MySQL version 5.4 & 5.6
Test data:
<?xml version="1.0" encoding="utf-8"?>
<sc_response status="ok">
<sc_data>
	<t>2015-07-29 23:04:23</t>
	<ip>2.223.2.181</ip>
	<ip_label/>
	<browser_name>Safari</browser_name>
</sc_data>
<sc_data>
	<t>2015-07-29 23:04:19</t>
	<ip/>
	<ip_label>An ip label</ip_label>
	<browser_name>Safari</browser_name>
</sc_data>
<sc_data>
	<t/>
	<ip>142.134.37.64</ip>
	<ip_label>An ip label</ip_label>
	<browser_name>Safari</browser_name>
</sc_data>
<sc_data>
	<t/>
	<ip>142.134.37.64</ip>
	<ip_label/>
	<browser_name>Safari</browser_name>
</sc_data>
</sc_response>

Query:
LOAD XML LOCAL INFILE '/wamp/www/stats/downloads/TestXML.xml'
    REPLACE
    INTO TABLE visits2
    CHARACTER SET utf8
    ROWS IDENTIFIED BY '<sc_data>';

Result - note for the fourth record the ip_label tag behaved as expected
+------+---------------+-------------+--------------+
| t    | ip            | ip_label    | browser_name |
+------+---------------+-------------+--------------+
| NULL | NULL          | NULL        | Safari       |
| NULL | NULL          | An ip label | Safari       |
| NULL | 142.134.37.64 | An ip label | Safari       |
| NULL | 142.134.37.64 | NULL        | Safari       |
+------+---------------+-------------+--------------+
[18 Aug 2015 16:43] Jon Stephens
Fixed in MySQL 5.5.46, 5.6.27, 5.7.9. Documented in the corresponding changelogs as follows:

    Empty XML elements having the form <element/> were not handled
    correctly by the LOAD XML statement.

Closed.