Bug #79919 LOAD XML inheritance can fail when loading multiple tables
Submitted: 11 Jan 2016 19:02 Modified: 12 Jan 2016 9:12
Reporter: Matthew Shaw Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:5.6 OS:Windows
Assigned to: CPU Architecture:Any
Tags: XML

[11 Jan 2016 19:02] Matthew Shaw
Description:
LOAD XML is designed to be able to populate multiple tables. It is supposed to be able to inherit field values from a superior tag level, in order that foreign keys can be populated.  In the circumstances described below, it fails to do so.

How to repeat:
The objective is to populate two tables: node and node_tag.  Here is the XML data (from Open Street Map):

<?xml version='1.0' encoding='UTF-8'?>
	<node id="30603856" lat="47.1707728" lon="9.5171301" version="10" timestamp="2013-09-26T15:25:42Z" changeset="18046619" uid="1747645" user="VCLI"/>
	<node id="26860698" lat="47.0862971" lon="9.5270956" version="2" timestamp="2011-10-03T12:34:35Z" changeset="9459790" uid="7532" user="jennergruhle">
		<tag k="name" v="Mittagspitze"/>
		<tag k="tourism" v="camp_site"/>
	</node>

Node 30603856 has no tags, whereas 26860698 does.

The target tables are defined as follows:

CREATE TABLE `node` (
  `idNode` bigint(20) NOT NULL,
  `Visible` int(11) DEFAULT NULL,
  `Version` int(11) DEFAULT NULL,
  `Changeset` bigint(20) DEFAULT NULL,
  `Timestamp` timestamp NULL DEFAULT NULL,
  `UID` int(11) DEFAULT NULL,
  `Latitude` decimal(12,8) DEFAULT NULL,
  `Longitude` decimal(12,8) DEFAULT NULL,
  `geom` geometry DEFAULT NULL,
  PRIMARY KEY (`idNode`),
  UNIQUE KEY `idNode_UNIQUE` (`idNode`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `node_tag` (
  `idNode_Tag` bigint(20) NOT NULL AUTO_INCREMENT,
  `idNode` bigint(20) NOT NULL,
  `TagKey` varchar(255) NOT NULL,
  `TagValue` varchar(255) NOT NULL,
  PRIMARY KEY (`idNode_Tag`),
  UNIQUE KEY `idNode_Tag_UNIQUE` (`idNode_Tag`),
  KEY `ix_nodeid` (`idNode`),
  KEY `ix_nodekey` (`TagKey`),
  KEY `ix_nodevalue` (`TagValue`)
) ENGINE=InnoDB AUTO_INCREMENT=199 DEFAULT CHARSET=utf8;

The LOAD XML commands are:

LOAD XML LOCAL INFILE 'osm.osm'
    INTO TABLE node
	CHARACTER SET 'UTF8'		
    ROWS IDENTIFIED BY '<node>'
	(@id, @lat, @lon, @version, @timestamp, @changeset, @uid, @user)
	SET idNode=@id, latitude=@lat, longitude=@lon, version=@version, timestamp=@timestamp, changeset=@changeset, uid=@uid;

LOAD XML LOCAL INFILE 'osm.osm'
       INTO TABLE node_tag 
	   CHARACTER SET 'UTF8'	
       ROWS IDENTIFIED BY '<tag>'
	   (@id, @k, @v)
	   SET idNode=@id, TagKey=@k, tagValue=@v;	
	
NODE loads correctly.  However NODE_TAG loads as follows:

# idNode_Tag, idNode, TagKey, TagValue
196, 0, name, Mittagspitze
197, 0, tourism, camp_site

Notice how idNode is set to "0" when it should be 26860698.

A small edit to the first XML record results in the correct answers:

# idNode_Tag, idNode, TagKey, TagValue
199, 26860698, name, Mittagspitze
200, 26860698, tourism, camp_site

The required edit is to change how the XML is terminated i.e:

<node ...  user="VCLI"/> 
to...
<node ...  user="VCLI"></node>

Suggested fix:
Unfortunately, it is very difficult to manually edit the XML to overcome the bug, so it is marked as "Serious".
[12 Jan 2016 9:12] MySQL Verification Team
Hello Matthew,

Thank you for the report and test case.
I'm not seeing this issue at my end on 5.6.28/5.5.48. Could you please confirm exact MySQl version used? If you can provide more information, feel free to add it to this bug and change the status back to 'Open'.

Thank you for your interest in MySQL.

-- 5.6.28
[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.6.28: bin/mysql -uroot -S /tmp/mysql_ushastry.sock  test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.28-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like '%version%';
+-------------------------+---------------------------------------------------------+
| Variable_name           | Value                                                   |
+-------------------------+---------------------------------------------------------+
| innodb_version          | 5.6.28                                                  |
| protocol_version        | 10                                                      |
| slave_type_conversions  |                                                         |
| version                 | 5.6.28-enterprise-commercial-advanced                   |
| version_comment         | MySQL Enterprise Server - Advanced Edition (Commercial) |
| version_compile_machine | x86_64                                                  |
| version_compile_os      | linux-glibc2.5                                          |
+-------------------------+---------------------------------------------------------+
7 rows in set (0.00 sec)

mysql>
mysql> drop table if exists node;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE `node` (
    ->   `idNode` bigint(20) NOT NULL,
    ->   `Visible` int(11) DEFAULT NULL,
    ->   `Version` int(11) DEFAULT NULL,
    ->   `Changeset` bigint(20) DEFAULT NULL,
    ->   `Timestamp` timestamp NULL DEFAULT NULL,
    ->   `UID` int(11) DEFAULT NULL,
    ->   `Latitude` decimal(12,8) DEFAULT NULL,
    ->   `Longitude` decimal(12,8) DEFAULT NULL,
    ->   `geom` geometry DEFAULT NULL,
    ->   PRIMARY KEY (`idNode`),
    ->   UNIQUE KEY `idNode_UNIQUE` (`idNode`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)

mysql> LOAD XML LOCAL INFILE '/tmp/bug.xml' INTO TABLE node CHARACTER SET 'UTF8' ROWS IDENTIFIED BY '<node>' (@id, @lat, @lon, @version, @timestamp, @changeset, @uid, @user) SET idNode=@id, latitude=@lat, longitude=@lon, version=@version, timestamp=@timestamp, changeset=@changeset, uid=@uid;
Query OK, 2 rows affected, 2 warnings (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 2

mysql>
mysql> select * from node;
+----------+---------+---------+-----------+---------------------+---------+-------------+------------+------+
| idNode   | Visible | Version | Changeset | Timestamp           | UID     | Latitude    | Longitude  | geom |
+----------+---------+---------+-----------+---------------------+---------+-------------+------------+------+
| 26860698 |    NULL |       2 |   9459790 | 2011-10-03 12:34:35 |    7532 | 47.08629710 | 9.52709560 | NULL |
| 30603856 |    NULL |      10 |  18046619 | 2013-09-26 15:25:42 | 1747645 | 47.17077280 | 9.51713010 | NULL |
+----------+---------+---------+-----------+---------------------+---------+-------------+------------+------+
2 rows in set (0.00 sec)

mysql>
mysql> drop table if exists node_tag;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE `node_tag` (
    ->   `idNode_Tag` bigint(20) NOT NULL AUTO_INCREMENT,
    ->   `idNode` bigint(20) NOT NULL,
    ->   `TagKey` varchar(255) NOT NULL,
    ->   `TagValue` varchar(255) NOT NULL,
    ->   PRIMARY KEY (`idNode_Tag`),
    ->   UNIQUE KEY `idNode_Tag_UNIQUE` (`idNode_Tag`),
    ->   KEY `ix_nodeid` (`idNode`),
    ->   KEY `ix_nodekey` (`TagKey`),
    ->   KEY `ix_nodevalue` (`TagValue`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=199 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)

mysql> LOAD XML LOCAL INFILE '/tmp/bug.xml' INTO TABLE node_tag CHARACTER SET 'UTF8'ROWS IDENTIFIED BY '<tag>' (@id, @k, @v) SET idNode=@id, TagKey=@k, tagValue=@v;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

mysql> select * from node_tag;
+------------+----------+---------+--------------+
| idNode_Tag | idNode   | TagKey  | TagValue     |
+------------+----------+---------+--------------+
|        199 | 26860698 | name    | Mittagspitze |
|        200 | 26860698 | tourism | camp_site    |
+------------+----------+---------+--------------+
2 rows in set (0.00 sec)

Thanks,
Umesh
[12 Jan 2016 9:13] MySQL Verification Team
Please see, most likely duplicate of https://bugs.mysql.com/bug.php?id=67542