Bug #27287 extractvalue() (and updatexml()) extremely slow for large XML
Submitted: 20 Mar 2007 12:29 Modified: 27 Nov 2007 16:34
Reporter: Peter Welker Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: XML functions Severity:S5 (Performance)
Version:5.1.16 OS:Windows (WinXP SP2)
Assigned to: Alexander Barkov CPU Architecture:Any

[20 Mar 2007 12:29] Peter Welker
Description:
extractvalue() and updateXML() on a mediumtext field with > 65kB of data gets extremely slow (e.g. 2 minutes on a Athlon 3200+ for 1 MB XML) whereas other string functions (replace, substr etc. remain fast (< 1 second for 1MB)).

How to repeat:
-- recreate table with XML columns
drop table if exists xmltest;
create table xmltest  ( xv  varchar(65000),
                        xt  text,
                        xm  mediumtext,
                        xl  longtext,
                        xb  blob)
       engine=MyISAM;

-- insert 1 record
insert into xmltest values ('<a><list><b>1</b><b2>2</b2></list></a>',
                            '<a><list><b>1</b><b2>2</b2></list></a>',
                            '<a><list><b>1</b><b2>2</b2></list></a>',
                            '<a><list><b>1</b><b2>2</b2></list></a>',
                            '<a><list><b>1</b><b2>2</b2></list></a>');

-- select empty result
select      extractvalue(xv, '/') as "varchar",
            extractvalue(xt, '/') as "text",
            extractvalue(xm, '/') as "mediumtext",
            extractvalue(xl, '/') as "longtext",
            extractvalue(xb, '/') as "blob"
    from    xmltest;

-- update table up to 65kB
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');

-- Copy to all other types
update xmltest set xt = xv,
                   xm = xv,
                   xl = xv,
                   xb = xv;

-- try to use extractvalue function
select      extractvalue(xv, '/a/list/b2') as "varchar",
            extractvalue(xt, '/a/list/b2') as "text",
            extractvalue(xm, '/a/list/b2') as "mediumtext",
            extractvalue(xl, '/a/list/b2') as "longtext",
            extractvalue(xb, '/a/list/b2') as "blob"
    from    xmltest
    limit   1 \G

-- update mediumtext
update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');

-- ~ 1MB
select length(xm) from xmltest;

-- What about non-XML runctions? - Still very fast!
select /*!SQL_NO_CACHE*/ length(replace(xm, 'b2', 'b3')), substr(xm, 1000000, 10) from xmltest;

-- test extractvalue() - does it still work? - yes but very slow
select /*!SQL_NO_CACHE*/ extractvalue(xm, '/a/list/b2') from xmltest;

-- copy to another engine
create table xmltest2 engine=innodb select * from xmltest;

-- What about non-XML runctions? - Still very fast!
select /*!SQL_NO_CACHE*/ length(replace(xm, 'b2', 'b3')), substr(xm, 1000000, 10) from xmltest2;

--try again - still very slow
select /*!SQL_NO_CACHE*/ extractvalue(xm, '/a/list/b2') from xmltest2;

-- cleanup
drop table if exists xmltest;
drop table if exists xmltest2;

Suggested fix:
make faster :-)
[20 Mar 2007 12:53] MySQL Verification Team
Thank you for the bug report. Could you please upgrade to latest released
version 5.1.16 and paste here the result you will get. Thanks in advance.
[20 Mar 2007 13:25] Peter Welker
Same result also for 5.1.16. Here is the execution log:
-------------------------------------------------------

mysql> -- recreate table with XML columns
mysql> drop table if exists xmltest;
Query OK, 0 rows affected (0.00 sec)

mysql> create table xmltest  ( xv  varchar(65000),
    ->                         xt  text,
    ->                         xm  mediumtext,
    ->                         xl  longtext,
    ->                         xb  blob)
    ->        engine=MyISAM;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> -- insert 1 record
mysql> insert into xmltest values ('<a><list><b>1</b><b2>2</b2></list></a>',
    ->                             '<a><list><b>1</b><b2>2</b2></list></a>',
    ->                             '<a><list><b>1</b><b2>2</b2></list></a>',
    ->                             '<a><list><b>1</b><b2>2</b2></list></a>',
    ->                             '<a><list><b>1</b><b2>2</b2></list></a>');
Query OK, 1 row affected (0.00 sec)

mysql>
mysql> -- select empty result
mysql> select      extractvalue(xv, '/') as "varchar",
    ->             extractvalue(xt, '/') as "text",
    ->             extractvalue(xm, '/') as "mediumtext",
    ->             extractvalue(xl, '/') as "longtext",
    ->             extractvalue(xb, '/') as "blob"
    ->     from    xmltest;
+---------+------+------------+----------+------+
| varchar | text | mediumtext | longtext | blob |
+---------+------+------------+----------+------+
|         |      |            |          |      |
+---------+------+------------+----------+------+
1 row in set (0.00 sec)

mysql>
mysql> -- update table up to 65kB
mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xv=replace(xv, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql>
mysql> -- Copy to all other types
mysql> update xmltest set xt = xv,
    ->                    xm = xv,
    ->                    xl = xv,
    ->                    xb = xv;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql>
mysql> -- try to use extractvalue function
mysql> select      extractvalue(xv, '/a/list/b2') as "varchar",
    ->             extractvalue(xt, '/a/list/b2') as "text",
    ->             extractvalue(xm, '/a/list/b2') as "mediumtext",
    ->             extractvalue(xl, '/a/list/b2') as "longtext",
    ->             extractvalue(xb, '/a/list/b2') as "blob"
    ->     from    xmltest
    ->     limit   1 \G
*************************** 1. row ***************************
   varchar: 2
      text: 2
mediumtext: 2
  longtext: 2
      blob: 2
1 row in set (0.23 sec)

mysql>
mysql> -- update mediumtext
mysql> update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.06 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (0.28 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (1.16 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (4.70 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update xmltest set xm=replace(xm, '<b>1</b>', '<b>1</b><b>1</b>');
Query OK, 1 row affected (19.08 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql>
mysql> -- ~ 1MB
mysql> select length(xm) from xmltest;
+------------+
| length(xm) |
+------------+
|    1048606 |
+------------+
1 row in set (0.00 sec)

mysql>
mysql> -- What about non-XML runctions? - Still very fast!
mysql> select /*!SQL_NO_CACHE*/ length(replace(xm, 'b2', 'b3')), substr(xm, 1000000,
    -> 10) from xmltest;
+---------------------------------+-------------------------+
| length(replace(xm, 'b2', 'b3')) | substr(xm, 1000000,
10) |
+---------------------------------+-------------------------+
|                         1048606 | b><b>1</b>              |
+---------------------------------+-------------------------+
1 row in set (0.00 sec)

mysql>
mysql> -- test extractvalue() - does it still work? - yes but very slow
mysql> select /*!SQL_NO_CACHE*/ extractvalue(xm, '/a/list/b2') from xmltest;
+--------------------------------+
| extractvalue(xm, '/a/list/b2') |
+--------------------------------+
| 2                              |
+--------------------------------+
1 row in set (2 min 11.39 sec)

mysql>
mysql> -- copy to another engine
mysql> create table xmltest2 engine=innodb select * from xmltest;
Query OK, 1 row affected (0.03 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql>
mysql> -- What about non-XML runctions? - Still very fast!
mysql> select /*!SQL_NO_CACHE*/ length(replace(xm, 'b2', 'b3')), substr(xm, 1000000,
    -> 10) from xmltest2;
+---------------------------------+-------------------------+
| length(replace(xm, 'b2', 'b3')) | substr(xm, 1000000,
10) |
+---------------------------------+-------------------------+
|                         1048606 | b><b>1</b>              |
+---------------------------------+-------------------------+
1 row in set (0.02 sec)

mysql>
mysql> --try again - still very slow
mysql> select /*!SQL_NO_CACHE*/ extractvalue(xm, '/a/list/b2') from xmltest2;
+--------------------------------+
| extractvalue(xm, '/a/list/b2') |
+--------------------------------+
| 2                              |
+--------------------------------+
1 row in set (2 min 11.47 sec)

mysql>
mysql> -- cleanup
mysql> drop table if exists xmltest;
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists xmltest2;
Query OK, 0 rows affected (0.00 sec)
[20 Mar 2007 16:59] Sveta Smirnova
Thank you for the report.

Verified as described.
[5 Oct 2007 10:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34956

ChangeSet@1.2571, 2007-10-05 15:28:25+05:00, bar@mysql.com +1 -0
  Bug#27287 extractvalue() (and updatexml()) extremely slow for large XML
  Performance improvements made.
  ExtractValue for large XML values is now much faster
  (about 2000 times faster of 1Mb-long XML values).
[9 Oct 2007 8:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/35178

ChangeSet@1.2571, 2007-10-09 13:53:39+05:00, bar@mysql.com +1 -0
  Bug#27287 extractvalue() (and updatexml()) extremely slow for large XML
  Performance improvements made.
  ExtractValue for large XML values is now much faster
  (about 2000 times faster of 1Mb-long XML values).
[15 Oct 2007 9:30] Alexander Barkov
Pushed into mysql-5.1.23-rpl
[27 Nov 2007 10:49] Bugs System
Pushed into 5.1.23-rc
[27 Nov 2007 10:52] Bugs System
Pushed into 6.0.4-alpha
[27 Nov 2007 16:34] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.1.23 and 6.0.4 changelogs.
[3 Apr 2018 2:35] haochen he
How can I get version 5.1.16 ?
The oldest version in https://downloads.mysql.com/archives/community/ is 5.1.30 and 5.0.x, but seems the function extractvalue()  have not been supported in 5.0.x
Thanks!