Bug #86917 Command history omits LOAD XML when ROWS IDENTIFIED BY is specified
Submitted: 4 Jul 2017 3:56 Modified: 5 Jul 2017 5:16
Reporter: John Hawkinson Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Command-line Clients Severity:S3 (Non-critical)
Version:5.7.18 OS:Any (OS X 10.10.5)
Assigned to: CPU Architecture:Any
Tags: history

[4 Jul 2017 3:56] John Hawkinson
Description:
When a LOAD XML command is given that specifies the ROWS IDENTIFIED BY clause, it does not appear in the command history.

This seems very strange.

POTENTIAL BACKGROUND RE LINE EDITING LIBRARIES:
It appears this version of mysql is using libedit, but seems to be statically linked against both libreadline and libedit:

At the mysql> prompt, typing M-x history shows libedit-style history with commands run together and zeroes at the beginning, this is a libedit bug:

mysql>                                                                                                                                              : history
0 show databases;0 ...

But it seems to be linked against both libedit:

$ nm `type -p mysql` | c++filt | grep el_init
00000001000b1960 T _el_init

And also readline?:

$ nm `type -p mysql` | c++filt | grep rl_readline
0000000100424570 D _rl_readline_name
0000000100424568 D _rl_readline_version

go figure.

How to repeat:
This example requires a little setup to use LOAD XML:

It assumes test-o2.xml exists as:
$ cat test-o2.xml 
<rows><row f1="Lorem&#10;    ipsum &amp; dolor"/></rows>

and a table exists:
mysql> create table t (f1 varchar(255));

To reproduce:

mysql> LOAD XML LOCAL INFILE 'test-o2.xml' INTO TABLE t;
Query OK, 1 row affected (0.00 sec)
Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

So far so good. Hit UP ARROW and edit the line:

mysql> LOAD XML LOCAL INFILE 'test-o2.xml' INTO TABLE t rows identified by '<row>';
Query OK, 1 row affected (0.00 sec)
Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

Now hit UP ARROW again:

mysql> LOAD XML LOCAL INFILE 'test-o2.xml' INTO TABLE t; 

The second command has disappeared. Bizarre!
[4 Jul 2017 5:09] MySQL Verification Team
I guess this is due to:
https://dev.mysql.com/doc/refman/5.7/en/mysql-command-options.html#option_mysql_histignore
[4 Jul 2017 8:18] MySQL Verification Team
Hello John Hawkinson,

Thank you for the report.
As my colleague Shane pointed out, this behavior is expected and documented. Quoting from our manual  "mysql ignores for logging purposes statements that match any pattern in the “ignore” list. By default, the pattern list is "*IDENTIFIED*:*PASSWORD*", to ignore statements that refer to passwords" - https://dev.mysql.com/doc/refman/5.7/en/mysql-command-options.html#option_mysql_histignore and https://dev.mysql.com/doc/refman/5.7/en/mysql-logging.html

Thanks,
Umesh
[4 Jul 2017 13:46] John Hawkinson
Thanks, Shane and Umesh. I wasn't aware of the history ignore pattern, and I agree this is implemented-as-designed. But it is still a design bug!

A feature designed to preserve the security of passwords should not serve to limit the command-history for XML users. Like many tasks, constructing LOAD XML lines really benefits from command history.

The default history ignore pattern should not capture commands that are not security-sensitive. Unfortunately, IDENTIFIED BY and IDENTIFIED WITH are the most-restrictive strings one could reasonably use to recognize password-based commands, and IDENTIFIED BY would continue to capture LOAD XML.

This suggests LOAD XML's syntax was poorly designed.
I think the right solution here is to add something like

LOAD XML ... [ROWS DELIMITED BY '<tagname>']

as an alias for

LOAD XML ... [ROWS IDENTIFIED BY '<tagname>']

and then deprecate the IDENTIFIED BY syntax.

I'm not an SQL or XML language syntax design expert, maybe there's a more appropriate word to use. ROWS TAGGED BY, ROWS MARKED BY, ROWS ENCLOSED BY all seem like potential choices.

But I think DELIMITED BY tracks best with the concept and also the language of the XML standard, e.g. https://www.w3.org/TR/1998/REC-xml-19980210 in §2.1 (Well-Formed XML Documents) says, "More simply stated, the elements, delimited by start- and end-tags, nest properly within each other." Etc.

Although ENCLOSED BY is already used in MySQL
[FIELDS [OPTIONALLY] ENCLOSED BY 'char']
it's representing a different (but similar!) concept, and it also takes a single character delimiter rather than a multi-character string (tagname), so I think there's good cause to use a different keyword.

Thanks.
[5 Jul 2017 5:16] MySQL Verification Team
Thank you for the feedback John.

regards,
Umesh
[6 Apr 2018 10:56] Ceri Williams
https://bugs.mysql.com/bug.php?id=90323 suggests that the user can override with --histignore
[6 Apr 2018 11:16] MySQL Verification Team
Bug #90323 marked as duplicate of this one