Bug #35165 Logging to table is slow for large queries
Submitted: 8 Mar 2008 17:48 Modified: 12 Mar 2008 21:48
Reporter: Davi Arnaut (OCA)
Status: Verified
Category:MySQL Server: CSV Severity:S5 (Performance)
Version:5.1 OS:Any
Assigned to: CPU Architecture:Any
Tags: log, performance
Triage: Triaged: D4 (Minor) / R1 (None/Negligible) / E2 (Low)

[8 Mar 2008 17:48] Davi Arnaut
Description:
The problem is in the ha_tina::encode_quote function. The initial reserved size for the quoted-format buffer (a string) is 4096 bytes. If a query is larger than 4096 bytes, the buffer is reallocated to its size plus eight bytes every time it fills up. If the query is much larger than 4096 bytes, this causes excessive reallocation (once for every eight bytes added to the buffer), probably leading to lots of memory copying and lock contention inside malloc.
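
To get a feel for the cost, the growth pattern described above can be modelled with a small sketch. This is not the server code: simulate_growth and GrowthCost are hypothetical names, and the model assumes the worst case, in which every reallocation moves the block and copies its whole current contents. It also ignores the extra bytes added by escaping.

```cpp
#include <cstddef>

// Hypothetical result type: how many reallocations happened and how many
// bytes were copied if every reallocation had to move the buffer.
struct GrowthCost {
  std::size_t reallocs;
  std::size_t bytes_copied;
};

// Simplified model of a buffer that starts at `initial` bytes and grows by
// a fixed `step` each time it fills up, until it can hold `final_len` bytes.
GrowthCost simulate_growth(std::size_t final_len, std::size_t initial,
                           std::size_t step) {
  GrowthCost cost{0, 0};
  std::size_t capacity = initial;
  while (capacity < final_len) {
    cost.bytes_copied += capacity;  // worst case: old contents are copied
    capacity += step;
    ++cost.reallocs;
  }
  return cost;
}
```

Under these assumptions, simulate_growth(100000, 4096, 8) reports 11988 reallocations for the 100000-byte value used in the test case below, with roughly 600 MB copied in total if no reallocation can be satisfied in place.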

How to repeat:
CREATE TABLE t1 (a text);
let $text= `SELECT REPEAT("a", 100000)`;
eval INSERT INTO t1 VALUES ("$text");

Suggested fix:
Rework the ha_tina::encode_quote function to calculate the final size of the buffer and allocate only once, and to use standard, optimized functions (memchr, memcpy) for buffer scanning and copying.
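
A minimal sketch of the "size first, allocate once" idea follows. It is an illustration only, not the attached patch or the real ha_tina code: encode_quote_once and needs_escape are hypothetical names, std::string stands in for the server's own string class, and the set of escaped bytes (double quote, backslash, newline, carriage return) is an assumption made for the example. Unescaped runs are copied in bulk through std::string::append rather than byte by byte.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes this sketch escapes with a leading backslash.
static bool needs_escape(char c) {
  return c == '"' || c == '\\' || c == '\n' || c == '\r';
}

// Hypothetical stand-in for the quoting step: returns the field wrapped in
// double quotes, with special bytes escaped, using a single allocation.
std::string encode_quote_once(const char *src, std::size_t len) {
  // Pass 1: every escaped byte grows the output by exactly one character.
  std::size_t extra = 0;
  for (std::size_t i = 0; i < len; ++i)
    if (needs_escape(src[i])) ++extra;

  std::string out;
  out.reserve(len + extra + 2);  // payload + escapes + two enclosing quotes
  out.push_back('"');

  // Pass 2: copy each unescaped run in one call, emit escapes between runs.
  std::size_t run_start = 0;
  for (std::size_t i = 0; i < len; ++i) {
    if (!needs_escape(src[i])) continue;
    out.append(src + run_start, i - run_start);
    out.push_back('\\');
    switch (src[i]) {
      case '\n': out.push_back('n'); break;
      case '\r': out.push_back('r'); break;
      default:   out.push_back(src[i]); break;
    }
    run_start = i + 1;
  }
  out.append(src + run_start, len - run_start);  // trailing run, may be empty
  out.push_back('"');
  return out;
}
```

Because the length is passed explicitly, the sketch never depends on the input being NUL-terminated. When only a single byte value needs escaping, the scan in pass 2 could be replaced by memchr.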
[17 Mar 2008 19:57] Davi Arnaut
Partial patch

Attachment: csv-encode-quote.patch (text), 2.62 KiB.

[17 Mar 2008 19:58] Davi Arnaut
The above patch speeds things up a bit in my testing, but it still has a problem: it causes a reallocation for strings that are not NUL-terminated. The patch also removes a bogus *ptr++.
[25 Mar 2009 8:17] Alexey Kopytov
This bug was the reason for bug #43801 "mysql.test takes too long, fails due to expired timeout on debx86-b in PB".