Bug #85155 | Importing huge statement seems to cause memory problems | ||
---|---|---|---|
Submitted: | 23 Feb 2017 16:25 | Modified: | 20 Jul 2018 18:05 |
Reporter: | Thomas Benkert | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Command-line Clients | Severity: | S5 (Performance) |
Version: | 5.7.17 | OS: | CentOS |
Assigned to: | CPU Architecture: | Any | |
Tags: | import, Memory, statement |
[23 Feb 2017 16:25]
Thomas Benkert
[25 May 2017 19:54]
Scott Benkovic
Thomas have you found a solution? I am having a similar problem replaying binary log files since upgrading to 5.7. Worked fine in 5.1,5,6. https://community.oracle.com/thread/4036847 Scott
[26 May 2017 7:10]
Thomas Benkert
Yes, I actually found a solution, I'm surprised though, that this also happens with mysqlbinlog. The solution for me was to remove the newlines from the file. I wanted to update this bug report, but haven't found the time yet.
[26 May 2017 18:20]
Scott Benkovic
That helped at least figure out what is going on. When I extract the binary log that contains a 120MB blob: mysqlbinlog drrsdb1-bin.000001 > drrsdb1-bin.000001.sql Then use "tr -d '\r\n'" on the blob portion of the sql file to remove CR/LF. The file imports in a couple of seconds like previous versions: mysql -u root -pxxxxx < drrsdb1-bin.000001.sql If I don't remove CR/LF, it takes 20 minutes to import. The sql file that results from mysqlbinlog in both 5.6 and 5.7 produce the same file. Looks like its the way the 5.7 mysql client reads it when doing the import. None of the client options seem to make a difference. Anyway thanks for the information, Scott
[18 May 2018 15:09]
MySQL Verification Team
Hi, Thank you for this report. We have seen cases like this one many, many times. It is not a bug, but only a case of misconfiguration. Especially when it comes to all the various timeouts ......
[22 May 2018 7:03]
Thomas Benkert
Hi Sinisa, thanks for your reply, but do you mind explaining that a bit? As I wrote, the solution for me was to remove the newlines from the sql statement. I don't understand how that would be correlated to misconfiguration. The timeout is just a symptom, not the main cause.
[22 May 2018 12:18]
MySQL Verification Team
Hi, What BLOB are you writing about ??? Your table does not have any such column. Also, looks like you are writing about two different problems. First description shows the automatic closing of the sleeping connections. That is expected behaviour, regulated by one of the timeouts on the server. Second problem is totally unrelated and deals with a large BLOB, which does not exist and difference in the speed of importing it with or without \r\n. It is also unclear how can you remove newlines at all. This looks like mysqlbinlog was not used with proper options. Two different issues.
[22 May 2018 12:33]
Thomas Benkert
Hi Sinisa, the blob in a binary log thing is a comment from Scott, not from me. His problem seems to be related however. About the timeout, again this is just a symptom. The real problem is, that the mysql client has problems managing memory and this is somehow related to reading the newline character. The timeout happens, because the mysql client is too busy handling memory and in this time it doesn't send anything to the mysql server. Scott's comment just proves, that it's not an issue with my insert statement, but that it can also happen when the source is the binary log. I don't believe it's a misuse of mysqlbinlog.
[22 May 2018 12:39]
MySQL Verification Team
Again, these are two different cases. One is a sleeping connection, which timeouts, which is expected and designed behaviour. The other is the execution of the very large INSERT statement that takes only 5 seconds. Again, expected behaviour.
[22 May 2018 12:49]
Thomas Benkert
Sinisa, I don't know how to make this any more clear. I did not file a bug report because of the timeout. I filed this bug report because the mysql client is not capable of sending certain statements to the server because of memory issues. In the bug report I showed that the mysql client of version 5.6 works fine, whereas that of 5.7 does not. I wanted to provide information how MySQL is behaving, so that others can relate (maybe this helped Scott to identify that his problem may be related to this one). I'm sorry, that this (for you) unnecessary information is causing too much confusion. Next time I'll leave it out, so you can concentrate on the actual problem. Thanks.
[22 May 2018 12:53]
MySQL Verification Team
Hi, So, mysql client, as per your opinion, does not send statements to the server, for some unknown issue. What you can do is upload that SQL file with INSERT which does not get sent to the server and I will try it out.
[22 May 2018 15:37]
Thomas Benkert
Hello Sinisa, sure, no problem. I reproduced the bug again, although I have limited the test data because I have not much time at hand, so there was no timeout, but with enough data I bet there would be as the times to execute the statement differs hugely. How to reproduce: In MySQL create a table like this: create table reproduce_bug (a varchar(5), b varchar(5), c varchar(5), d varchar(5), e varchar(5), f varchar(5), g varchar(5), h varchar(5)); On the command line produce a test statement like this: echo "INSERT INTO reproduce_bug VALUES ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')" > rb for i in {1..1000000}; do echo ", ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')" >> rb ; done echo ';' >> rb This produces a file with 42MB in size. Then load the statement like this: time cat rb | mysql your_playground_schema In MySQL 5.6 I get this: real 0m57.942s user 0m0.464s sys 0m0.096s In 5.7 I get real 3m46.506s user 0m48.679s sys 1m34.243s VM with 5.6 has 1 core and 1GB RAM VM with 5.7 has 2 cores and 2GB RAM Configuration of MySQL is the same on both servers.
[31 May 2018 15:45]
MySQL Verification Team
Hi, I can't repeat your results. I used my two year old iMac, with 1 Gb RAM dedicated to MySQL server. My dump file contained also CREATE TABLE and was optimised for minimal locking, not for speed. Your table and your data were used. Here you can see the dump file size : rw-r--r-- 1 sinisa staff 35653667 May 31 16:42 tmp/reproduce.sql Here are the results on 5.6.36: $ time mysql test < tmp/reproduce.sql real 2m10.343s user 0m0.928s sys 0m0.086s Total number of rows in the table was 1048576. This is a speed on 5.7.21: $ time mysql test < tmp/reproduce.sql real 1m59.153s user 0m0.753s sys 0m0.091s Dump was not optimised for speed, but as I wrote, for minimum locking. Next, I used our debug binaries and not production binaries. Debug binaries are 20 - 30 % slower then our production binaries. It should be noted that this is a single-thread operation, so number of cores is irrelevant.
[1 Jun 2018 9:02]
MySQL Verification Team
The mysql process ends up going a gazillion memory reallocs... [x@x ~]$ for i in {1..5000000}; do echo ", ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')" >> rb ; done [x@x ~]$ echo ';' >> rb [x@x ~]$ ls -l rb -rw-rw-r--. 1 wwww wwww 215000076 Jun 1 10:35 rb [x@x ~]$ time ./mysql/release/mysql-5.7.17-linux-glibc2.5-x86_64/bin/mysql -uroot -h192.168.1.107 test < rb [x@x release]$ pstack 2995 #0 __memmove_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:2938 #1 0x000000000045704f in my_realloc (key=<optimized out>, ptr=0x7fec36e7b030, size=33437320, flags=<optimized out>) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/mysys/my_malloc.c:100 #2 0x0000000000419943 in String::mem_realloc (this=0xa467c0 <glob_buffer>, alloc_length=33437315, force_on_heap=<optimized out>) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql-common/sql_string.cc:115 #3 0x0000000000416f70 in add_line (truncated=false, ml_comment=<optimized out>, in_string=<optimized out>, line_length=42, line=0x27a3ae2 ", ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')\n, ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')\n, ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')\n, ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')\n, ('a', 'b', 'c', 'd', 'e', "..., buffer=...) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysql.cc:2795 #4 read_and_execute (interactive=false) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysql.cc:2367 #5 0x00000000004182da in main (argc=5, argv=0x2792da0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysql.cc:1447 Meanwhile... mysql> show processlist; +----+------+---------------------+------+---------+------+----------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+---------------------+------+---------+------+----------+------------------+ | 2 | root | localhost:48303 | test | Query | 0 | starting | show processlist | | 3 | root | 192.168.1.105:40506 | test | Sleep | 230 | | NULL | <--- waiting around .... +----+------+---------------------+------+---------+------+----------+------------------+ 2 rows in set (0.00 sec)
[1 Jun 2018 9:21]
MySQL Verification Team
Seems to be this taking time on long lines: at bottom of add_line function: buffer.mem_realloc(buffer.length()+length+IO_SIZE); I didn't bother testing 5.6. It seems mysql client simply needs to be optimized to read long lines. Incrementally adding IO_SIZE (4096) to the memory allocation and recopying everything to new malloc is not optimal. How about using powers of two up to say 16M at a time to increase buffer size?
[1 Jun 2018 11:06]
MySQL Verification Team
Verified. 5.6 worked quickly (seconds) vs over an hour to read the file in 5.7.
[1 Jun 2018 14:17]
MySQL Verification Team
Hi Sinisa, the server is not involved in the bug so we can ignore it. The nature of this bug is the time taken for mysql command line client to read a long multi-line query. Reading times that grow exponentially and alloc/dealloc crazy number of times are not expected. In this test: where I double up the number $i range to make larger number of lines in query, echo "INSERT INTO reproduce_bug VALUES ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')" > rb for i in {1..10000}; do echo ", ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')" >> rb ; done echo ';' >> rb time ./mysql-5.7.22-linux-glibc2.12-x86_64/bin/mysql -uroot test < rb Here are the timings. The times should increase linearly, but it grows out of control quick. 5.7.22 client: numlines, real time -------------------------- 10000, real 0m0.474s 20000, real 0m0.464s 40000, real 0m1.385s 80000, real 0m3.064s 160000, real 0m10.877s 320000, real 0m40.425s 640000, real 2m41.129s 1280000, real 10m8.779s 2560000, real 44m11.166s by contract, 5.6.21 client: 2560000, real 0m26.464s Many people like dump-files with newlines for each row (as evidenced in https://bugs.mysql.com/bug.php?id=65465 ) since they are more readable.
[1 Jun 2018 15:49]
MySQL Verification Team
Setting severity to the correct value, since the cause of the performance degradation has been identified.
[21 Jun 2018 12:18]
MySQL Verification Team
Tested an internal preliminary fix from Sinisa. The standard build 5.7.22: ----------------------------- E:\mysql-5.7.22-winx64\bin>mysql -uroot test < f:\tmp1\rb ~600 seconds (and increasingly crazy memory) 5.7.22 with Sinisa’s patch: ---------------------------- E:\mysql-5.7.22-winx64\bin>F:\ade\mysql_archives\5.7\5.7.22\mysql-5.7.22\bld\client\RelWithDebInfo\mysql -uroot test < f:\tmp1\rb ~25 seconds (with steady memory)
[21 Jun 2018 14:54]
MySQL Verification Team
It should be noted that my patch is VERY preliminary and actually more proof-of-the-concept then anything else. My patch changes only code in the mysql CLI and nothing else. Server remains the same in code and configuration. I must also note that the performance improvements on the fully optimised dumps are very small. I will be sending my patch to the right address, for any (potential) scheduling and (truly) necessary changes.
[20 Jul 2018 18:05]
Paul DuBois
Posted by developer: Fixed in 8.0.13. The mysql client was slow to import large multiple-line statements in batch mode. Memory allocation is now more efficient for this situation.