Bug #60696 | invalidating query cache entries (table) hangs replication | |
---|---|---|---
Submitted: | 30 Mar 2011 8:33 | Modified: | 7 Jul 2017 9:08
Reporter: | Alan Hollis | Email Updates: |
Status: | Won't fix | Impact on me: |
Category: | MySQL Server: Query Cache | Severity: | S2 (Serious)
Version: | 5.5.8, 5.5.20 | OS: | Any (Windows server 2008r2, Linux)
Assigned to: | | CPU Architecture: | Any
Tags: | query cache, replication | |
[30 Mar 2011 8:33]
Alan Hollis
[30 Mar 2011 12:03]
Valeriy Kravchuk
Please, send the output of:

show variables like 'query_%';
show status like 'qcache%';
[30 Mar 2011 12:17]
Alan Hollis
As requested. The process list is still showing the invalidating query cache at this time too.

mysql> show variables like 'query_%';
+------------------------------+---------+
| Variable_name                | Value   |
+------------------------------+---------+
| query_alloc_block_size       | 8192    |
| query_cache_limit            | 1048576 |
| query_cache_min_res_unit     | 4096    |
| query_cache_size             | 0       |
| query_cache_type             | OFF     |
| query_cache_wlock_invalidate | OFF     |
| query_prealloc_size          | 8192    |
+------------------------------+---------+
7 rows in set (0.01 sec)

mysql> show status like 'qcache%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Qcache_free_blocks      | 0     |
| Qcache_free_memory      | 0     |
| Qcache_hits             | 0     |
| Qcache_inserts          | 0     |
| Qcache_lowmem_prunes    | 0     |
| Qcache_not_cached       | 0     |
| Qcache_queries_in_cache | 0     |
| Qcache_total_blocks     | 0     |
+-------------------------+-------+
8 rows in set (0.03 sec)
[30 Mar 2011 12:20]
Valeriy Kravchuk
Sorry, I missed the GLOBAL keyword in the statements. Are the results any different with SHOW GLOBAL...?
[30 Mar 2011 12:24]
Alan Hollis
No problem.

mysql> show global status like 'qcache%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Qcache_free_blocks      | 0     |
| Qcache_free_memory      | 0     |
| Qcache_hits             | 0     |
| Qcache_inserts          | 0     |
| Qcache_lowmem_prunes    | 0     |
| Qcache_not_cached       | 0     |
| Qcache_queries_in_cache | 0     |
| Qcache_total_blocks     | 0     |
+-------------------------+-------+
8 rows in set (0.01 sec)

mysql> show global variables like 'query_%';
+------------------------------+---------+
| Variable_name                | Value   |
+------------------------------+---------+
| query_alloc_block_size       | 8192    |
| query_cache_limit            | 1048576 |
| query_cache_min_res_unit     | 4096    |
| query_cache_size             | 0       |
| query_cache_type             | OFF     |
| query_cache_wlock_invalidate | OFF     |
| query_prealloc_size          | 8192    |
+------------------------------+---------+
7 rows in set (0.02 sec)
[30 Mar 2011 14:14]
Valeriy Kravchuk
Do you use ROW or MIXED mode replication? See bug #38551 (it should be fixed in 5.5.8, but check anyway). Also, check SHOW ENGINE INNODB STATUS for the thread that reports this state; see bug #59899. Finally, there is the not yet solved bug #53375.
[30 Mar 2011 14:27]
Alan Hollis
It was ROW-based replication when the problem was occurring. I've now set this to MIXED as a workaround, to see if that stops the problem. I'm discounting #38551, as this has happened more than once since turning the query cache off. Unfortunately the thread finally ended, so I can't check whether #59899 is related or not. #53375 seems to suggest this is a consequence of primary keys; every one of the tables in the database has a primary key. Before it died, the query had been running for a very long time. The last process list I took showed:

7929352 | system user | | NULL | Connect | 65039 | invalidating query cache entries (table)

Please let me know if I can be of any more help. I'm more than happy to change back to row-based replication and wait for this to happen again, as I've put other workarounds in place to make sure this isn't an issue.
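A minimal sketch of how such a hung thread can be watched for, assuming INFORMATION_SCHEMA.PROCESSLIST is available (MySQL 5.1 and later); the 600-second threshold is purely illustrative:

-- Illustrative monitoring query; adjust the time threshold as needed.
SELECT id, user, command, time, state
  FROM information_schema.processlist
 WHERE state = 'invalidating query cache entries (table)'
   AND time > 600;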
[30 Mar 2011 16:22]
Valeriy Kravchuk
If you ever see this again, please get SHOW ENGINE INNODB STATUS output while the thread is hanging in this state. Also, please send SHOW CREATE TABLE for all tables used in the query that caused the hang, if possible.
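A sketch of the diagnostics being requested, to be captured while the thread is stuck; db.tbl is a placeholder table name, not a table from this report:

SHOW FULL PROCESSLIST;
SHOW ENGINE INNODB STATUS\G
-- repeat for every table referenced by the statement that caused the hang
SHOW CREATE TABLE db.tbl\G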
[30 Apr 2011 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[10 Jun 2011 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[21 Jun 2011 8:45]
Alan Hollis
Okay, it's taken a long time to happen again. But it's happened again.

=====================================
110621 9:44:07 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 49 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 98256 1_second, 98253 sleeps, 8937 10_second, 38504 background, 38480 flush
srv_master_thread log flush and writes: 102851
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 99157, signal count 98161
Mutex spin waits 55615, rounds 1687416, OS waits 2257
RW-shared spins 48751, rounds 1513582, OS waits 47860
RW-excl spins 43794, rounds 1471200, OS waits 46372
Spin rounds per wait: 30.34 mutex, 31.05 RW-shared, 33.59 RW-excl
------------
TRANSACTIONS
------------
Trx id counter B3ACF9AD
Purge done for trx's n:o < B3ACF8CF undo n:o < 0
History list length 2694
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 1477, query id 66550116 localhost ::1 root
SHOW ENGINE INNODB STATUS
---TRANSACTION B3ACF8CD, not started
MySQL thread id 2, query id 66550116
invalidating query cache entries (table)
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (read thread)
I/O thread 4 state: wait Windows aio (read thread)
I/O thread 5 state: wait Windows aio (read thread)
I/O thread 6 state: wait Windows aio (write thread)
I/O thread 7 state: wait Windows aio (write thread)
I/O thread 8 state: wait Windows aio (write thread)
I/O thread 9 state: wait Windows aio (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
167027 OS file reads, 31194752 OS file writes, 294653 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 35765, seg size 35767, 104991 merges
merged operations:
 insert 4327531, delete mark 2559019, delete 1087032
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 10624987, node heap has 89579 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 1004414414280
Log flushed up to 1004414414280
Last checkpoint at 1004414414280
0 pending log writes, 0 pending chkp writes
25275000 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 5494538240; in additional pool allocated 0
Dictionary memory allocated 2076651
Buffer pool size 327680
Free buffers 0
Database pages 238101
Old database pages 87872
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 41414, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 167016, created 197710, written 5792316
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s
LRU len: 238101, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread id 1900, state: waiting for server activity
Number of rows inserted 5042373, updated 8790786, deleted 5577, read 26550813
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

I'm not sure how I find out which query is causing this? Help would be greatly appreciated!
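One way to locate the event the slave SQL thread is applying, sketched here with placeholder names rather than values from this report: take the relay log coordinates from SHOW SLAVE STATUS and decode the surrounding events with mysqlbinlog.

SHOW SLAVE STATUS\G
-- Note Relay_Log_File / Relay_Log_Pos (or Relay_Master_Log_File /
-- Exec_Master_Log_Pos), then decode the nearby events from the shell, e.g.:
--   mysqlbinlog --verbose --start-position=<Relay_Log_Pos> <Relay_Log_File>
-- (placeholders; --verbose reconstructs row events as commented pseudo-SQL)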
[21 Jun 2011 9:09]
Alan Hollis
Just to add, I've upgraded this server to 5.5.13-log.
[22 Jun 2011 9:37]
Andrei Elkin
It also hangs in the invalidating state on a gcov build on Linux: http://pb2.norway.sun.com/?template=mysql_show_test_failure&test_failure_id=3603326.
[23 Jun 2011 8:41]
Alan Hollis
So I've found something interesting. If I do a mysqldump of one database in particular, the "invalidating query cache entries (table)" thread gets put into a "waiting for table level lock" state. So I'm assuming it must be some query I'm running on a table in that database? I'll investigate further.
[18 Aug 2011 19:58]
Sveta Smirnova
Thank you for the feedback. Please update us with the results of your investigation or, at least, send us a dump (perhaps mysqldump --no-data) of the involved tables, so we can try to recreate this on our side too.
[26 Aug 2011 19:45]
Ryan Hendrickson
Hello, I'm getting the same error. When I run SHOW PROCESSLIST, the slave SQL thread is always in the "invalidating query cache entries (table)" state. I've actually turned off the inserts to the master so that the database should be static, yet the master-bin log file continues to grow. The slave continues to move forward on the position, but not fast enough to catch up to the master. My assumption is that each time it reads one or more positions from the binlog, it has to go and invalidate the query cache entries for the table because of the updates, and that process is extremely expensive. Is there a way to just clear the cache entries for all updates from a binlog file if it's behind in time?

Slave info: OS: Red Hat Linux, MySQL 5.5.15-log, binlog format = mixed, file: master-bin.000426, pos: 353,906,762

id | user | host | db | Command | Time | State | Info
18796 | system user | | NULL | Connect | 802 | Waiting for master to send event | NULL
18797 | system user | | NULL | Connect | 6265 | invalidating query cache entries (table) | NULL

Master info: OS: Red Hat Linux, MySQL 5.5.15-log, binlog format = mixed, file: master-bin.000426, pos: 610,803,498

id | user | host | db | Command | Time | State | Info
5566 | repl | localhost | NULL | Binlog dump | 1336 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL
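As a hedged aside: MySQL does provide statements for emptying or defragmenting the query cache, though whether they would help here is uncertain, since the expensive invalidation happens per replicated event rather than because the cache is over-full.

-- Remove all query results from the query cache (a no-op if the cache is
-- already disabled or empty):
RESET QUERY CACHE;
-- Defragment the cache without removing cached results:
FLUSH QUERY CACHE;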
[8 Feb 2012 19:15]
Valeriy Kravchuk
Please, check if the same problem still happens with a newer version, 5.5.20.
[6 Mar 2012 19:21]
Dima Rogozin
Yes, I see the same behavior on my newly upgraded 5.5.20.
[28 Apr 2012 16:56]
Valeriy Kravchuk
Bug #64266 was marked as a duplicate of this one.
[13 Jun 2012 7:07]
Chris Schneider
I have the same problem on my system and have the following pt-pmp output:

pt-pmp -p $(pidof mysqld); mysql -uroot -e "show processlist;"
Wed Jun 13 00:04:07 PDT 2012
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff915fd000
      9 pthread_cond_wait,one_thread_per_connection_end,handle_one_connection,start_thread,clone
      4 pthread_cond_wait,os_event_wait_low,os_aio_simulated_handle,fil_aio_wait,::??,start_thread,clone
      1 select,os_thread_sleep,srv_monitor_thread,start_thread,clone
      1 select,os_thread_sleep,srv_master_thread,start_thread,clone
      1 select,os_thread_sleep,srv_lock_timeout_thread,start_thread,clone
      1 select,os_thread_sleep,srv_error_monitor_thread,start_thread,clone
      1 select,handle_connections_sockets,main
      1 read,vio_read_buff,::??,my_net_read,cli_safe_read,handle_slave_io,start_thread,clone
      1 read,::??,my_net_read,do_command,handle_one_connection,start_thread,clone
      1 pread64,::??,os_file_read,fil_io,::??,buf_read_page,buf_page_get_gen,btr_cur_search_to_nth_level,::??,row_search_for_mysql,ha_innobase::general_fetch,Rows_log_event::find_row,Delete_rows_log_event::do_exec_row,Rows_log_event::do_apply_event,apply_event_and_update_pos,handle_slave_sql,start_thread,clone
      1 do_sigwait,sigwait,signal_hand,start_thread,clone

+-----+--------------+---------------+------------+---------+--------+------------------------------------------+------------------+
| Id  | User         | Host          | db         | Command | Time   | State                                    | Info             |
+-----+--------------+---------------+------------+---------+--------+------------------------------------------+------------------+
| 1   | system user  |               | NULL       | Connect | 600    | Waiting for master to send event         | NULL             |
| 2   | system user  |               | NULL       | Connect | 106673 | invalidating query cache entries (table) | NULL             |
| 78  | somedatabase | x.x.x.x:33934 | sa1reports | Sleep   | 306    |                                          | NULL             |
| 183 | root         | localhost     | NULL       | Query   | 0      | NULL                                     | show processlist |
+-----+--------------+---------------+------------+---------+--------+------------------------------------------+------------------+

Not sure if this will help but I am on version 5.1.60-log x86_64.
[26 Jul 2012 19:43]
Sveta Smirnova
Thank you for the feedback. The "invalidating query cache entries" status can only be a separate bug if query_cache_size and query_cache_type are both 0; if the cache is ON and queries really are in the cache, it has to be cleaned. I tend to think this bug has the same source as bug #65336: I could only repeat the call that leads to the described status when I started the slave with --query_cache_type=1 and then changed it to OFF online. In other cases the slave did not enter this code path and the bug could not occur. Please check whether you have query_cache_type set to ON or 1. If SHOW GLOBAL VARIABLES shows query_cache_type is OFF, please also ensure the configuration file contains query_cache_type=0. Please inform us about the results of your check.
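A minimal sketch of this check; the my.cnf lines quoted in the comments are illustrative, not taken from any server in this report.

-- On the slave, verify the runtime values:
SHOW GLOBAL VARIABLES LIKE 'query_cache_type';
SHOW GLOBAL VARIABLES LIKE 'query_cache_size';
-- ...and make sure the configuration file pins them across restarts,
-- e.g. in the [mysqld] section:
--   query_cache_type = 0
--   query_cache_size = 0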
[26 Jul 2012 19:49]
Sveta Smirnova
To test this I created an MTR test case like the one below:

--source include/master-slave.inc
--source include/have_innodb.inc

connection master;
set binlog_format='row';
create table t1(f1 text) engine=innodb;
insert into t1 values(md5(1));
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
sleep 1;

connection slave;
select * from t1;

connection master;
insert into t1 select * from t1;
sleep 1;

connection slave;
show variables like 'query_cache%';

Then I attached a debugger to the slave and added the following breakpoints:

Rows_log_event::do_apply_event
Query_cache::invalidate_locked_for_write
Query_cache::invalidate_table
sql_cache.cc:1892

With statement-based format, invalidate_locked_for_write, which sets the problem status, was not called.
[26 Jul 2012 19:56]
Sveta Smirnova
After some thought I have decided to verify it now, because:

1. The problem is only repeatable with row-based replication.

2. According to http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_query_cache_wlo...:

"Normally, when one client acquires a WRITE lock on a MyISAM table, other clients are not blocked from issuing statements that read from the table if the query results are present in the query cache. Setting this variable to 1 causes acquisition of a WRITE lock for a table to invalidate any queries in the query cache that refer to the table. This forces other clients that attempt to access the table to wait while the lock is in effect."

This lock should happen only for MyISAM tables and only if query_cache_wlock_invalidate is set to ON, but never for InnoDB tables or when query_cache_wlock_invalidate is OFF (the default). A quick check is sketched below.

3. Why the query cache is involved at all is the same issue as bug #65336.
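A sanity check suggested by point 2, with an illustrative schema filter:

SHOW GLOBAL VARIABLES LIKE 'query_cache_wlock_invalidate';
-- Look for MyISAM tables that replicated events might touch:
SELECT table_schema, table_name, engine
  FROM information_schema.tables
 WHERE engine = 'MyISAM'
   AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');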
[12 Nov 2012 11:09]
liu hickey
Today we hit this issue again, and it caused a large lag on the slave. I looked into the source again, and I am no longer puzzled by "invalidating query cache entries (table)". The current implementation of the query cache has flaws; for details see:

http://bugs.mysql.com/bug.php?id=65336
https://bugs.launchpad.net/percona-server/5.5/+bug/1021131

On heavily loaded slaves it is much easier to hit this issue.
[7 Jul 2017 9:08]
Erlend Dahl
MySQL will no longer invest in the query cache, see: http://mysqlserverteam.com/mysql-8-0-retiring-support-for-the-query-cache/