Bug #67808 | In the InnoDB engine, the doublewrite buffer and multiple buffer pool instances reduce concurrency | ||
---|---|---|---|
Submitted: | 5 Dec 2012 2:45 | Modified: | 26 Dec 2012 2:34 |
Reporter: | wang huai | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.5.* | OS: | Any |
Assigned to: | | CPU Architecture: | Any
Tags: | concurrency, double write, innodb, multi-buffer pool instance |
[5 Dec 2012 2:45]
wang huai
[5 Dec 2012 22:00]
Sveta Smirnova
Thank you for the report. Please send us your configuration file and details of your operating system and hardware: how many cores you have, etc.
[7 Dec 2012 2:14]
wang huai
I'm sorry that I forgot to reply to this letter. I have attached the configuration options and machine information.

#MySQL 5.5.8
[client]
port = 3306
socket = /data/var/mysql.sock

[mysqld]
port = 3306
socket = /data/var/mysql.sock
user=mysql
skip-slave-start
skip-name-resolve
key_buffer_size=2M
read_buffer_size=2M
sort_buffer_size=2M
#5.1.3 table_cache deprecated
table_open_cache=16384
max_allowed_packet = 64M
max_connect_errors = 1000000
#net_buffer_length = 8K
#read_rnd_buffer_size = 512K
#default-storage-engine=InnoDB
character-set-server=utf8
collation-server=utf8_bin
init-connect='SET NAMES utf8'
max_connections=32000
binlog_cache_size=2M
query_cache_size = 256M
query_cache_limit = 2M
slave_apply_threads=8
#READ-UNCOMMITTED, READ-COMMITTED, REPEATABLE-READ, SERIALIZABLE
transaction_isolation = REPEATABLE-READ
tmp_table_size=256M
#--------log---------
slow_query_log=1
slow_query_log_file=slow.log
long_query_time = 1
log-error = err.log
log_slave_updates=0
expire_logs_days=99
binlog_format=statement
log-bin=fb-bin
#last intranet ip
server-id = 162051
datadir=/data/var
tmpdir=/dev/shm
#--------innodb--------------
#innodb_data_home_dir = /data/ibdata
innodb_autoextend_increment = 256
innodb_data_file_path=ibdata1:10M;ibdata2:10M:autoextend
innodb_table_locks = 0
innodb_lock_wait_timeout = 10
innodb_old_blocks_time=1000
#innodb_change_buffering=none
#(50-80)% * total memory ( 5G )
innodb_buffer_pool_size=10G
innodb_buffer_pool_instances=8
innodb_additional_mem_pool_size=128M
innodb_max_dirty_pages_pct=80
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_io_capacity=2000
#innodb_doublewrite=0
#ssd
#innodb_flush_method = O_DSYNC
#A recommended value is 2 times the number of CPUs plus the number of disks.
#5.5.8 default 0
innodb_thread_concurrency = 32
#innodb_log_group_home_dir = /data/iblogs
innodb_log_files_in_group = 3
innodb_flush_log_at_trx_commit = 0
sync_binlog=3000
innodb_support_xa=0
innodb_file_per_table=1
innodb_log_file_size = 1G
#25% *buffer pool size (1G)
innodb_log_buffer_size = 96M
relay-log=fb-relay

The machine information is as follows: 8-core CPU at 2.4 GHz, 256 GB of memory. Is there anything else you need?
[7 Dec 2012 21:31]
Sveta Smirnova
Thank you for the feedback. I cannot repeat the described behavior with the sysbench OLTP test.

With --innodb_doublewrite=1 --innodb_buffer_pool_instances=4 (I have 4 cores, so there is no sense in having more):

OLTP test statistics:
    queries performed:
        read:                3463586
        write:               989595
        other:               494797
        total:               4947978
    transactions:            247398 (266.02 per sec.)
    deadlocks:               1      (0.00 per sec.)
    read/write requests:     4453181 (4788.32 per sec.)
    other operations:        494797 (532.03 per sec.)

With --innodb_doublewrite=0 --innodb_buffer_pool_instances=4:

OLTP test statistics:
    queries performed:
        read:                2332330
        write:               666380
        other:               333190
        total:               3331900
    transactions:            166595 (179.13 per sec.)
    deadlocks:               0      (0.00 per sec.)
    read/write requests:     2998710 (3224.38 per sec.)
    other operations:        333190 (358.26 per sec.)

With --innodb_doublewrite=1 --innodb_buffer_pool_instances=4 --innodb_read_io_threads=4 --innodb_write_io_threads=4 --innodb_thread_concurrency=8 it is even much slower:

OLTP test statistics:
    queries performed:
        read:                1745408
        write:               498687
        other:               249343
        total:               2493438
    transactions:            124671 (134.05 per sec.)
    deadlocks:               1      (0.00 per sec.)
    read/write requests:     2244095 (2412.97 per sec.)
    other operations:        249343 (268.11 per sec.)

Please try with a smaller innodb_thread_concurrency (I suggest 8 for your environment) and let us know if this solves the problem.
[8 Dec 2012 2:47]
wang huai
I'm very happy to receive your reply. This problem cannot be reproduced by the usual test method; there are several points to keep in mind:

1. The log volume is very large and the speed of log generation is very high.
2. There are several SQL threads, roughly as many as the number of buffer pool instances.
3. The SQL threads execute in parallel, like a multi-session workload.
4. You have to watch the server while it is running, not just the final result. At the start of the run there is little log, so the speed is high; but after a short while (several minutes) there is a lot of log, InnoDB has to do a checkpoint, and from that time on the QPS gets lower and lower.
5. Every SQL statement is autocommitted, and every statement modifies the tables; there are no read statements.

The QPS mentioned above is the number of transactions completed per second. The method of computing it is as follows:

use information_schema;
select variable_value into @num_com from global_status where variable_name = 'COM_COMMIT';
select variable_value into @num_roll from global_status where variable_name = 'COM_ROLLBACK';
select variable_value into @uptime from global_status where variable_name = 'uptime';
select @num_com, @num_roll, @uptime, (@num_com + @num_roll) / @uptime;

The QPS is the result of the last statement. If you run the test as above, it should show the problem. If it still does not, please check the functions log_free_check and log_check_margins, and check the doublewrite buffer's mutex: page flushing is multi-threaded but the mutex is single, so the other threads must wait, because many user threads can enter log_check_margins and log_checkpoint_margin at the same time. Thank you very much.
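Since dividing the totals by Uptime yields only a lifetime average, short stalls can be masked. A sketch of a per-interval measurement, assuming the same GLOBAL_STATUS counters (the @prev_*/@cur_* user variable names are illustrative, not part of the server):

use information_schema;
-- Take a first sample of the commit/rollback counters.
select variable_value into @prev_com  from global_status where variable_name = 'COM_COMMIT';
select variable_value into @prev_roll from global_status where variable_name = 'COM_ROLLBACK';
-- Wait one measurement interval (10 seconds here).
select sleep(10);
-- Take a second sample and report transactions per second over the interval.
select variable_value into @cur_com  from global_status where variable_name = 'COM_COMMIT';
select variable_value into @cur_roll from global_status where variable_name = 'COM_ROLLBACK';
select (@cur_com - @prev_com + @cur_roll - @prev_roll) / 10 as tps_last_10s;

Running this in a loop during the test would show the drop in throughput once the checkpoint starts, rather than averaging it away.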
[12 Dec 2012 18:03]
Sveta Smirnova
Thank you for the feedback. I still cannot repeat the described behavior. Actually, the sysbench OLTP test does something like what you described here. Could you please create a repeatable test case? It would be better if you take SysBench from Launchpad: https://launchpad.net/sysbench and write a test in Lua format (an example is oltp.lua in tests/db/oltp.lua).
[13 Dec 2012 4:26]
wang huai
I want to say that testing this problem with sysbench is not reasonable, because it issues SQL statements against only one table, so the concurrency inside MySQL cannot get very high. To increase the concurrency there must be 8 sessions, as in my setup, with every session working on one table, each different from the others, and every session executing insert statements; a sketch of the workload shape is shown below. Please switch to a computer with 8 CPU cores, with the configuration I gave you earlier, and check the QPS using the computation method I gave you last time. Once MySQL starts and you begin the test, make sure the Uptime status starts from 0; from that point, check the QPS every few minutes and you should find the problem. I hope you succeed. Thank you.
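A minimal sketch of the described workload, assuming hypothetical tables t1 through t8 (names invented here for illustration), each owned by exactly one session with autocommit left on:

-- Schema: one table per session (t1 through t8; only t1 shown).
create table t1 (
  id bigint not null auto_increment primary key,
  payload varchar(255) not null
) engine=InnoDB;

-- Each session runs statements like this in a tight loop against its own
-- table (session 1 -> t1, session 2 -> t2, ...). With autocommit=1 every
-- insert is its own transaction, so COM_COMMIT grows with each statement.
insert into t1 (payload) values (repeat('x', 255));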
[13 Dec 2012 13:33]
Sveta Smirnova
Thank you for the feedback. But SysBench from Launchpad is scriptable and allows as many tables as you want; I used 8 tables. This is why I asked you for a test case.
[13 Dec 2012 14:04]
wang huai
Thank you for your feedback. The test case is multi-threaded slave replication, distributed by table name: I configured 8 worker threads, and the SQL thread hands each event to a worker by a hash of the table name. The master produces a large amount of binlog, so the slave is under heavy load; the events are executed by the function apply_event. The environment around this problem is like this. I do not know why the problem cannot be found again; my classmate here can also reproduce it, with a method like sysbench. Did you set the buffer pool instances to 8?
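As a rough illustration of the dispatch rule described above (not the reporter's actual patch, whose code is not attached here), the worker for an event could be chosen by hashing the table name modulo the number of workers; expressed in SQL notation:

-- Hypothetical dispatch computation: map a table name to one of 8 workers.
-- With one table per worker, events for different tables never queue behind
-- the same applier thread, which is what produces the high concurrency.
select crc32('t3') % 8 as worker_id;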
[13 Dec 2012 14:29]
Sveta Smirnova
Thank you for the feedback. Let me explain first why I asked you for a test case. There are known performance impacts caused by the number of concurrent threads, the doublewrite buffer, and so on. Related material is described at:

http://dimitrik.free.fr/blog/archives/2012/10/mysql-performance-innodb-buffer-pool-instanc...
http://dimitrik.free.fr/blog/archives/2011/01/mysql-performance-innodb-double-write-buffer...
http://dimitrik.free.fr/blog/archives/2009/08/mysql-performance-innodb-doublewrite-buffer-...

In the past (prior to 5.5) doublewrite was expected to slow performance down a bit, but according to the last article it can even increase performance now. Also, the appropriate number of buffer pools and concurrent threads varies with the number of cores. For example:

> Did you set the buffer pool instances to 8?

If I set the number of buffer pool instances to 8 on my 4-core machine I would get a performance decrease, and this would not be a bug, but a hardware limitation; so I need to use 4 (or find a better machine). Generally, if I repeat the sequence of actions properly, the performance improvement/degradation should scale here. And this is yet another reason why I want a test case from you before trying to find a better machine: I want to be sure I am doing the right things. Actually, whether it is SysBench or not does not matter: just something we can compile and run.
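For reference, the settings under discussion can be checked from a client as below; note that innodb_buffer_pool_instances is not dynamic in 5.5, so changing it means editing my.cnf and restarting the server:

-- Inspect the current buffer pool sizing and instance count.
show global variables like 'innodb_buffer_pool%';
-- Inspect the doublewrite and thread-concurrency settings for comparison.
show global variables like 'innodb_doublewrite';
show global variables like 'innodb_thread_concurrency';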
[13 Dec 2012 14:45]
wang huai
Thank you for your feedback. In my last letter I described the case, but it is not directly usable for you, and now the only difference between us is the number of cores and the CPU speed, which results in a different level of concurrency and pressure on MySQL. This problem depends on exactly that point: high-speed log generation, so that checkpoints happen frequently; only then does the bottleneck show. I think 4 concurrent threads are not enough. Thank you for your review.
[13 Dec 2012 15:58]
Sveta Smirnova
Thank you for the feedback. OK, I will try to find a better machine and run the same test. I also see one more difference between what I tested and your last comment: I ignored the fact that it is a slave. Could you please confirm whether the slave receives all of this load from the master?
[13 Dec 2012 23:48]
wang huai
I think it is not important that it is a slave; it is only simple concurrency, and all the SQL comes from the master. But you cannot test with a stock slave, because it has only a single SQL thread. I can, because I have modified the source to support simple SQL concurrency, which satisfies our application, and it was while testing that project that I found the problem. In your test, make sure it runs for a long time, and watch how the QPS changes. You can also check %util in iostat -x 1; it is very high when InnoDB does the first checkpoint, around 90% the whole time. My disk is not SSD, it is SATA, so you can configure the buffer pool very large.
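Besides iostat, one way to watch checkpoint pressure from inside the server (a suggested technique, not something the reporter states they used) is the LOG section of SHOW ENGINE INNODB STATUS:

-- The LOG section of the output contains lines such as
-- "Log sequence number", "Log flushed up to" and "Last checkpoint at".
-- A large, growing gap between the log sequence number and the last
-- checkpoint means flushing is falling behind and log_free_check()
-- stalls become likely.
show engine innodb status\G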
[14 Dec 2012 11:31]
Sveta Smirnova
Thank you for the feedback. So you don't use our multi-threaded slave, but your own patch; yes, I see that slave_apply_threads=8 is not slave_parallel_workers=8. Is this problem repeatable with unmodified MySQL from Oracle?
[14 Dec 2012 11:45]
wang huai
It exists there as well. To verify this problem, my classmate also tested it with another method: he wrote a program with concurrency like the multi-threaded slave's, and the problem appeared there too. The multi-threaded slave is my own patch, distributing work by table name; only that can produce high concurrency. So I think you should find a better computer and set more threads inserting into different tables; like this, it should show the problem. Thank you for your feedback. My English is poor and some sentences may not be suitable; please forgive me. Thank you very much.
[19 Dec 2012 3:40]
Mark Callaghan
Your English and performance debugging efforts are good. I learned something by reading this. I think that storage HW will make it easy or hard to reproduce this. Writes are much slower without HW RAID or SSD storage. Are you using either?
[19 Dec 2012 4:23]
wang huai
Thank you for your praise. The disk I use is RAID, but I am not sure what you mean by storage HW. I want to know whether the problem has been found; I think it is very simple to reproduce.
[19 Dec 2012 6:26]
Justin Swanhart
Why can't you provide your classmate's test program?
[20 Dec 2012 6:37]
wang huai
The reproduction test case
Attachment: 1 (text/plain), 10.59 KiB.
[20 Dec 2012 17:57]
Sveta Smirnova
Thank you for the test case. You can attach your test_server program to the bug report or upload it to the Oracle FTP server as described in the "Files" tab.
[21 Dec 2012 4:19]
wang huai
OK, I have uploaded the program to the Oracle FTP server; the file name is test_server.tar.gz. I do not know whether you can find it. It can be run directly. Thank you, and good luck.
[25 Dec 2012 13:49]
Sveta Smirnova
Thank you for the test case. With the test case I can repeat a slight slowdown with innodb_doublewrite=1 and innodb_buffer_pool_instances=8, but I didn't get stalls like in your case:

69171
92478
3071
87945
3
96386
2
96027
4
92605
24
87716
91
86999
121
88956
137
81008
279
91037
447
95036
1786
8941

Nothing like this. In my case it looks like:

$ cat dw1t8.log | sort | head -n 50
18425
19132
20581
20981
21217
...
26442
26493

and with innodb_doublewrite=1 and innodb_buffer_pool_instances=1 the results are slightly better:

$ cat dw1t1.log | sort | head -n 20
19108
21302
23201
...
26210
26372
27051

Note, the first output is the lowest 50 values while the second is the lowest 20 only.
[26 Dec 2012 2:34]
wang huai
I'm very glad to receive your reply, but why couldn't you repeat the result? Did you follow my last letter step by step? Is it 8 threads of concurrency? Is it 8 tables? Is it 200000 rows per table? Does the computer have 8 cores? Did you use my test case completely? Did you use exactly the parameters I gave you? If so, I think you can repeat it easily. Note, the version I used is 5.5.27, but as far as I know, all 5.5.* versions have this problem. Thank you for your patience.
[26 Dec 2012 10:52]
Sveta Smirnova
Thank you for the feedback.

> I'm very glad to receive your reply, but why couldn't you repeat the result?

Because this is not repeatable on my machine with all the settings you provided.
[13 Jun 2013 18:01]
Inaam Rana
Wang Huai,

There are two separate issues; let's not mix them up.

We have one doublewrite buffer system-wide. This buffer has 128 pages, and all writes to the database pass through it. As we scale better with multiple buffer pool instances, it is possible that this doublewrite buffer becomes a bottleneck. We are aware of this and are working on it.

log_free_check() is not directly related to the doublewrite buffer. It is about the redo log and when and how we can reuse it. The pain point is when we have to flush the pages from the buffer pool, not the actual writing of the checkpoint record to the redo log. When we make all threads wait, there is a good reason for it: we cannot attempt to reuse the redo log until all the dirty pages belonging to that part of it have been flushed. We made all threads wait on purpose; it is working as designed.

If you see log_free_check() causing stalls, then the configuration workaround is to increase the redo log size and enable adaptive flushing. You have to set the relevant parameters to make adaptive flushing more aggressive, like innodb_io_capacity, innodb_io_capacity_max (in 5.6), etc.
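A sketch of that workaround (the values shown are illustrative, not recommendations for any particular machine):

-- Dynamic settings, applied from a client session:
set global innodb_adaptive_flushing = ON;  -- flush dirty pages proactively
set global innodb_io_capacity = 2000;      -- raise the background flushing rate
-- In 5.6 only: set global innodb_io_capacity_max = 4000;

-- Not dynamic: the redo log size must be raised in my.cnf, followed by a
-- clean shutdown and restart (in 5.5 the combined size of all redo log
-- files must stay below 4 GB), e.g.:
--   innodb_log_file_size = 1G
--   innodb_log_files_in_group = 3
-- A larger redo log gives flushing more headroom before log_free_check()
-- forces user threads to wait.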