Bug #34578 mysql client is extremely slow for large SQL statements
Submitted: 15 Feb 2008 2:22 Modified: 15 Feb 2008 4:39
Reporter: Mark Callaghan Email Updates:
Status: Duplicate Impact on me:
Category:MySQL Server: Command-line Clients Severity:S1 (Critical)
Version:5.0.54 OS:Mac OS X
Assigned to: CPU Architecture:Any
Tags: large, MySQL, slow, SQL

[15 Feb 2008 2:22] Mark Callaghan
I am trying to measure the time required to reload a database. Tables were dumped with 'mysqldump --extended-insert ...' so the SQL statements in the dump file are big. When trying to reload the database using 'mysql < dump.out' where dump.out is the output of mysqldump, the load takes forever and top shows that all of the CPU time is in the 'mysql' client process.

By forever I mean it doesn't finish after several hours and it should finish in 60 seconds.

oprofile shows that all of the time is in read_and_execute() and its children. Other monitoring shows that the time is in add_line(). add_line() iterates over a SQL statement one character at a time. The loop is like:
  for (pos = statement_start .. statement_end)
     len = strlen(pos)

The cost for this is O(n*n) where n is the size of the SQL statement. The code is new in 5.0.54 (it is not in 5.0.51). 

The problem code starts at line 1366 in mysql.cc:
    else if (!*ml_comment && !*in_string &&
             strlen(pos) >= 10 &&
             !my_strnncoll(charset_info, (uchar*) pos, 10,
                           (const uchar*) "delimiter ", 10))

How to repeat:
Load a table.
Dump the table with mysqldump --extended-insert > mysqldump.output
Truncate the table
Reload the table using 'mysql ... < mysqldump.output'

Suggested fix:
Don't call strlen() in the inner loop
[15 Feb 2008 4:39] Valeriy Kravchuk
This is a duplicate of infamous Bug #33057. Fixed in 5.0.56 (coming soon).