Bug #93635 | MySQL high load with high system cpu used | ||
---|---|---|---|
Submitted: | 17 Dec 2018 8:11 | Modified: | 24 Apr 2019 12:45 |
Reporter: | victor zheng | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | mysql Ver 14.14 Distrib 5.7.20, for Lin | OS: | Debian (20) |
Assigned to: | | CPU Architecture: | Any
Tags: | system cpu high |
[17 Dec 2018 8:11]
victor zheng
mysql config:

[client]
port = 3306
socket = /tmp/mysql.sock

[mysql]
default-character-set = latin1
no-auto-rehash

[mysqld]
server_id=5934
max_connections=3000
datadir=/home/database/data
tmpdir=/home/database/tmp
default_storage_engine=MyISAM
default_tmp_storage_engine=MyISAM
core_file
general_log=OFF
slow_query_log=ON
long_query_time=2
slave-skip-errors = 1062

###replication###
log_bin=/home/database/binlog/localhost-bin
relay-log=/home/database/binlog/localhost-relay-bin
#change sync_binlog 0->1
sync_binlog=1
#change log_slave_updates OFF->ON
log_slave_updates=ON
slave_net_timeout=3600
#change binlog_format STATEMENT->ROW
binlog_format=ROW
#mysql5.7 add
relay_log_recovery=1
master_info_repository=TABLE
relay_log_info_repository=TABLE

###multi_threads_slave###
#mysql5.7 add
#slave_parallel_type=LOGICAL_CLOCK
#slave_parallel_workers=8

###semi_sync###
#rpl_semi_sync_master_enabled=1
#rpl_semi_sync_master_timeout=1000
#rpl_semi_sync_slave_enabled=1

query_cache_size=0
key_buffer_size=128M
myisam_mmap_size=12M
tmp_table_size=2M
sort_buffer_size=2M
max_heap_table_size=2M
join_buffer_size=2M
bulk_insert_buffer_size=2M
explicit_defaults_for_timestamp=ON

###innodb###
innodb_io_capacity=2000
innodb_io_capacity_max=20000
innodb_write_io_threads=16
innodb_data_home_dir=/home/database/innodb
innodb_file_format=Barracuda
innodb_file_format_max=Barracuda
innodb_file_per_table=ON
innodb_buffer_pool_size=128G
innodb_sort_buffer_size=32M
innodb_log_group_home_dir=/home/database/innodb
innodb_log_files_in_group=3
innodb_log_file_size=1024M
innodb_log_buffer_size=128M
innodb_flush_log_at_trx_commit=1
innodb_flush_method=O_DIRECT
innodb_print_all_deadlocks=ON

wait_timeout=604800
interactive_timeout=604800

###mysql5.7###
sql_mode=""
log_timestamps=system
secure_file_priv=''

# group replication
#gtid_mode=ON
#enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="13855fca-d2ab-11e6-8f37-005056b8286c"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address="xxx:13306"
loose-group_replication_group_seeds="xxxx"
loose-group_replication_bootstrap_group=off
loose-group_replication_single_primary_mode=off
loose-group_replication_enforce_update_everywhere_checks=on
group_replication_ip_whitelist="..."
slave_preserve_commit_order=ON
thread_cached_size = 1024
[17 Dec 2018 8:12]
victor zheng
The result:

1. perf top -z

   77.83%  [kernel]       [k] _raw_spin_lock
    1.06%  [kernel]       [k] try_to_wake_up
    0.70%  mysqld         [.] MYSQLparse(THD*)
    0.40%  [kernel]       [k] futex_wait_setup
    0.29%  mysqld         [.] my_uca_scanner_next_any
    0.27%  libc-2.19.so   [.] malloc
    0.21%  [kernel]       [k] ixgbe_poll

2. poor man's profiler samples (stack frames comma-joined; ?? and <optimized out> frames as captured)

   23  ??,??,??,free_root,Prepared_statement::~Prepared_statement,Prepared_statement::~Prepared_statement,my_hash_delete,Prepared_statement_map::erase,Prepared_statement::deallocate,mysqld_stmt_close,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
   20  ??,??,malloc,operator,mysqld_stmt_prepare,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
   16  pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,<Per_thread_connection_handler::LOCK_thread_cache>,,<Per_thread_connection_handler::LOCK_thread_cache>,,at,handle_connection,pfs_spawn_thread,start_thread,clone
    5  ??,??,??,mem_free,__in_chrg=<optimized,stmt_id=<optimized,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
   16  pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,<Per_thread_connection_handler::LOCK_thread_cache>,,<Per_thread_connection_handler::LOCK_thread_cache>,,at,handle_connection,pfs_spawn_thread,start_thread,clone
    7  ??,??,??,free_root,Prepared_statement::~Prepared_statement,Prepared_statement::~Prepared_statement,my_hash_delete,Prepared_statement_map::erase,Prepared_statement::deallocate,mysqld_stmt_close,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
    5  ??,??,malloc,operator,mysqld_stmt_prepare,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
    5  ??,??,malloc,my_raw_malloc,size=size@entry=8160,,alloc_root,Item::operator,MYSQLparse,parse_sql,Prepared_statement::prepare,mysqld_stmt_prepare,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
    4  ??,??,??,my_realloc,String::mem_realloc,String::replace,String::replace,Prepared_statement::insert_params,Prepared_statement::set_parameters,Prepared_statement::execute_loop,mysqld_stmt_execute,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
    4  ??,??,malloc,my_raw_malloc,size=size@entry=216,,my_realloc,String::mem_realloc,String::replace,String::replace,Prepared_statement::insert_params,Prepared_statement::set_parameters,Prepared_statement::execute_loop,mysqld_stmt_execute,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
    3  pthread_cond_wait@@GLIBC_2.3.2,wait,out>,,reset_sig_count=reset_sig_count@entry=0),srv_worker_thread,start_thread,clone

****** Another test, replacing malloc with tcmalloc ******

1. System CPU usage is almost zero and QPS is up.

2. poor man's profiler samples:

   38  pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,<Per_thread_connection_handler::LOCK_thread_cache>,,<Per_thread_connection_handler::LOCK_thread_cache>,,at,handle_connection,pfs_spawn_thread,start_thread,clone
    3  pthread_cond_wait@@GLIBC_2.3.2,wait,out>,,reset_sig_count=reset_sig_count@entry=0),srv_worker_thread,start_thread,clone
    3  pthread_cond_wait@@GLIBC_2.3.2,wait,out>,,reset_sig_count=reset_sig_count@entry=0),buf_flush_page_cleaner_worker,start_thread,clone
    1  ??,sigwaitinfo,timer_notify_thread_func,pfs_spawn_thread,start_thread,clone
    1  pthread_cond_wait@@GLIBC_2.3.2,wait,out>,,reset_sig_count=reset_sig_count@entry=6282),srv_purge_coordinator_suspend,out>),start_thread,clone
    1  pthread_cond_wait@@GLIBC_2.3.2,wait,out>,,reset_sig_count=reset_sig_count@entry=0),buf_resize_thread,start_thread,clone
    1  pthread_cond_wait@@GLIBC_2.3.2,wait,out>,,reset_sig_count=reset_sig_count@entry=0),buf_dump_thread,start_thread,clone
    1  pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,out>,,out>,,at,pfs_spawn_thread,start_thread,clone
    1  pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,cond=0x1a2f6ade0),"/home/zhangjy/package/mysql-5.7.20/sql/binlog.cc",,thd=thd@entry=0x1a2fa6000,,next_event,thd=0x1a2fa6000),at,pfs_spawn_thread,start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,srv_monitor_thread,start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,srv_error_monitor_thread,start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,pc_sleep_if_needed,out>),start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,lock_wait_timeout_thread,start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,dict_stats_thread,start_thread,clone
    1  pthread_cond_timedwait@@GLIBC_2.3.2,native_cond_timedwait,mp=0x1e258c0,"/home/zhangjy/package/mysql-5.7.20/sql/binlog.cc",,<mysql_bin_log>,,wait_with_heartbeat,this=0x7eeff06d4b50),this=0x7eeff06d4b50),log_cache=log_cache@entry=0x7eeff06d45f0,,Binlog_sender::run,mysql_binlog_send,com_binlog_dump,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone
    1  poll,vio_io_wait,vio_socket_io_wait,vio_read,vio_read_buff,net_read_raw_loop,net_read_packet_header,com
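For context, the "poor man's profiler" output above is produced by snapshotting all thread stacks with gdb and counting identical stacks. A minimal sketch of that aggregation pipeline, modeled on the well-known poormansprofiler.org one-liner; the gdb command in the comment and the canned sample input are illustrative, not taken from this report:

```shell
# Sketch of the aggregation step behind a "poor man's profiler".
# In real use the input would come from something like:
#   gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p $(pidof mysqld)
# Here a tiny canned sample stands in for gdb output so the pipeline
# itself is runnable; the frame names are illustrative only.
aggregate() {
  awk '
    BEGIN { s = "" }
    /^Thread/ { if (s != "") print s; s = "" }      # new thread: flush previous stack
    /^#/      { if (s != "") s = s ","; s = s $4 }  # frame line: field 4 is the function name
    END       { if (s != "") print s }
  ' | sort | uniq -c | sort -rn                     # count identical stacks, most frequent first
}

out=$(aggregate <<'EOF'
Thread 3 (Thread 0x1 (LWP 1)):
#0  0x0 in pthread_cond_wait ()
#1  0x1 in handle_connection ()
Thread 2 (Thread 0x2 (LWP 2)):
#0  0x0 in pthread_cond_wait ()
#1  0x1 in handle_connection ()
Thread 1 (Thread 0x3 (LWP 3)):
#0  0x0 in malloc ()
#1  0x1 in mysqld_stmt_prepare ()
EOF
)
printf '%s\n' "$out"
```

Each output line is a sample count followed by the comma-joined stack, which matches the shape of the listings in this report.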
[17 Dec 2018 8:15]
victor zheng
Two more questions:

1. thread_cached_size is the key setting affecting system CPU; system CPU stays high throughout the test, yet Threads_created does not increase.
2. About the general log: why does enabling the general log reduce QPS so much? I/O and CPU are not high while it is enabled, but I see many connections, and most of them are in the running state (are they busy writing the general log?)
[24 Dec 2018 2:01]
zhai weixiang
Looks like the problem is related to the memory allocator (we used to see a similar problem in production). I'd suggest giving jemalloc a try. :)
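For reference, one common way to switch allocators without rebuilding the server is mysqld_safe's documented malloc-lib option. A my.cnf sketch; the library path is an assumption and should be adjusted to wherever your distribution installs jemalloc (or tcmalloc):

```ini
[mysqld_safe]
# Preload an alternative allocator for mysqld (path is illustrative)
malloc-lib = /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
```

The same effect can be achieved by setting LD_PRELOAD in the environment that starts mysqld.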
[24 Dec 2018 9:18]
victor zheng
@zhaiweixiang Hi, thanks for your reply. Yes, it is probably about memory allocation; we also solved it by switching to tcmalloc. But I still have questions on my mind: 1. The concurrency is only 100-200, which is not very high. Why does this hit the limits of the memory allocator? 2. Which part of MySQL's memory allocation causes this problem? The prepared statement structures, or some part of the MYSQLparse(THD*) function?
[19 Mar 2019 14:03]
MySQL Verification Team
Hi,

Thank you for your report. I must say that this does not look like a bug report, but more like a request for help with the configuration of our server; we do not provide those services here. Only a couple of notes.

Turn the query cache off completely by adding:

    query_cache_type = 0
    query_cache_limit = 1M

Turn the thread cache off completely. Increase tmp_table_size as much as you can, but keep in mind that it will be used concurrently by all threads that need intrinsic temporary tables in order to resolve queries.

You have set the buffer pool to 128 GB. How much memory do you have?

Last, but not least, you have turned all ACID-related variables ON, so what your profiler shows is expected behaviour. Your disk fsyncing will cause high CPU utilisation by the kernel.
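Collected as a my.cnf fragment, the advice above might look like the following sketch; the first two values are taken from the comment, while thread_cache_size = 0 is an inferred addition, since setting it to zero is the standard way to disable the thread cache:

```ini
[mysqld]
# Disable the query cache entirely
query_cache_type  = 0
query_cache_limit = 1M
# Disable the thread cache (inferred from "turn thread cache totally off")
thread_cache_size = 0
```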
[20 Apr 2019 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".