Bug #80367 XML parser fails to parse processing instructions
Submitted: 15 Feb 2016 10:37
Modified: 15 Feb 2016 10:48
Reporter: Ladislav Sopko (OCA)
Status: Verified
Category: MySQL Server: XML functions
Severity: S2 (Serious)
Version: 5.6, 5.7
OS: Any
Assigned to:
CPU Architecture: Any
Tags: parser, PROCESSING INSTRUCTION, XML

[15 Feb 2016 10:37] Ladislav Sopko
Description:
The parser only accepts processing instructions of the form:
<? identifier="value"?>

Here is an example of an expression that works:

select extractvalue('<bar id="01573795" test=""><?xw-crc key32=af594c0d4-15810581?></bar>', '/bar/@id');

But a processing instruction can contain any text after its target, up to the closing ?>.

Here are three examples where the XML parser incorrectly fails:

select extractvalue('<bar id="01573795" test=""><?xw-crc key32=1af594c0d4-15810581?></bar>', '/bar/@id');  
select extractvalue('<bar id="01573795" test=""><?xw-crc 1key32=af594c0d4-15810581?></bar>', '/bar/@id');
select extractvalue('<bar id="01573795" test=""><?xw-crc 1111111 ?></bar>', '/bar/@id');

All of them fail because the parser expects an identifier, which cannot start with a digit.
That rule is correct for element and attribute names,
but it does not apply to the content of a processing instruction.
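
To illustrate the distinction, here is a minimal sketch in Python. It is not the MySQL parser code: the function name scan_pi and the simplified, ASCII-only name pattern are assumptions made for this example, and the reserved target "xml" is not handled. It follows the XML 1.0 rule that only the target of a processing instruction has to be a name, while everything after it, up to ?>, is free text.

```python
import re

# Simplified, ASCII-only approximation of an XML Name (the real production
# also allows many non-ASCII characters). It applies to the PI target only.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def scan_pi(text, pos=0):
    """Scan one processing instruction starting at text[pos].

    Returns (target, data, end_pos). Hypothetical helper for illustration.
    """
    if not text.startswith("<?", pos):
        raise ValueError("not a processing instruction")
    m = NAME.match(text, pos + 2)
    if m is None:
        raise ValueError("PI target must be a valid name")
    end = text.find("?>", m.end())
    if end == -1:
        raise ValueError("unterminated processing instruction")
    raw = text[m.end():end]
    # If data is present, whitespace must separate it from the target.
    if raw and not raw[0].isspace():
        raise ValueError("whitespace required after PI target")
    # The data itself is not tokenized: digits, '=', '-' are all allowed.
    return m.group(0), raw.lstrip(), end + 2


for pi in ("<?xw-crc key32=1af594c0d4-15810581?>",
           "<?xw-crc 1key32=af594c0d4-15810581?>",
           "<?xw-crc 1111111 ?>"):
    print(scan_pi(pi))
```

With this approach the name check still rejects a target such as 1bad, but the three instructions above are accepted because their data is never validated as an identifier.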

How to repeat:
Run these statements:

select extractvalue('<bar id="01573795" test=""><?xw-crc key32=1af594c0d4-15810581?></bar>', '/bar/@id');  
select extractvalue('<bar id="01573795" test=""><?xw-crc 1key32=af594c0d4-15810581?></bar>', '/bar/@id');
select extractvalue('<bar id="01573795" test=""><?xw-crc 1111111 ?></bar>', '/bar/@id');

All of them return NULL instead of the expected 01573795.
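
For comparison, the same three documents can be checked outside MySQL. This is a quick cross-check that assumes Python's standard-library xml.etree.ElementTree module as the reference parser; the helper name extract_id is made up for this example.

```python
import xml.etree.ElementTree as ET

# The three documents from the report, unchanged.
DOCS = [
    '<bar id="01573795" test=""><?xw-crc key32=1af594c0d4-15810581?></bar>',
    '<bar id="01573795" test=""><?xw-crc 1key32=af594c0d4-15810581?></bar>',
    '<bar id="01573795" test=""><?xw-crc 1111111 ?></bar>',
]


def extract_id(doc):
    """Parse the document and return the root element's id attribute."""
    return ET.fromstring(doc).get("id")


for doc in DOCS:
    # Each document parses without error; the PI is simply skipped.
    print(extract_id(doc))
```

All three documents parse, and the id attribute is readable in each case, which is the result ExtractValue() would be expected to give for '/bar/@id'.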

Suggested fix:
Relax the parser's identifier check when it is parsing a processing instruction.

[15 Feb 2016 10:40] Ladislav Sopko
Possible solution for the problem.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-Fix-for-Processing-Instructions-parsing-PI-examples.patch (application/octet-stream, text), 3.35 KiB.

[15 Feb 2016 10:48] MySQL Verification Team
Hello Ladislav,

Thank you for the report and contribution.

Thanks,
Umesh