| Bug #106537 | mysql innodb deadlock memory leak | | |
|---|---|---|---|
| Submitted: | 22 Feb 2022 13:56 | Modified: | 18 Oct 2022 11:44 |
| Reporter: | pavel rekun | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Group Replication | Severity: | S1 (Critical) |
| Version: | 5.7.40 | OS: | Windows |
| Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[22 Feb 2022 13:56]
pavel rekun
[23 Feb 2022 7:32]
pavel rekun
A few other things I forgot to mention: all disks are SSDs. When the issue happens, all queries to the master get locked, even a simple UPDATE/SELECT. The slaves perform normally.
[23 Feb 2022 20:17]
MySQL Verification Team
Hi, I cannot reproduce this; there is no data about the problem from which we could even start to figure out what the problem might be. You need to set up some monitoring tools (MySQL Enterprise Monitor is a great monitoring tool) to monitor the status of your MySQL server and your OS (disk, RAM, CPU...). Usually running MySQL Enterprise Monitor will help you solve this kind of issue.
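For example, one quick check of per-consumer memory via the sys schema (a sketch; on 5.7 the memory instruments are off by default, and enabling them at runtime only counts allocations made from that point on):

```sql
-- Enable the performance_schema memory instruments (off by default in 5.7)
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'memory/%';

-- Top instrumented memory consumers right now
SELECT event_name, current_alloc, high_alloc
  FROM sys.memory_global_by_current_bytes
 WHERE current_count > 0
 LIMIT 20;
```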
[24 May 2022 8:26]
pavel rekun
OK, I'm now monitoring the memory usage. The master node keeps growing in memory; currently memory/innodb/std keeps growing (I don't see this on the slaves). Is that normal? See the output of:

```sql
SELECT event_name, current_alloc, high_alloc
  FROM sys.memory_global_by_current_bytes
 WHERE current_count > 0;
```

```
'memory/innodb/std', '163.48 MiB', '193.50 MiB'
'memory/sql/String::value', '90.67 MiB', '222.66 MiB'
'memory/performance_schema/events_statements_summary_by_thread_by_event_name', '43.35 MiB', '43.35 MiB'
'memory/performance_schema/table_handles', '36.25 MiB', '36.25 MiB'
'memory/performance_schema/memory_summary_by_thread_by_event_name', '28.12 MiB', '28.12 MiB'
'memory/mysys/IO_CACHE', '24.02 MiB', '311.56 MiB'
'memory/performance_schema/events_statements_history', '17.48 MiB', '17.48 MiB'
'memory/performance_schema/events_statements_current', '17.48 MiB', '17.48 MiB'
'memory/performance_schema/events_waits_summary_by_thread_by_event_name', '16.95 MiB', '16.95 MiB'
'memory/performance_schema/events_statements_history_long', '13.66 MiB', '13.66 MiB'
'memory/performance_schema/events_statements_history.tokens', '12.50 MiB', '12.50 MiB'
'memory/performance_schema/events_statements_history.sqltext', '12.50 MiB', '12.50 MiB'
'memory/performance_schema/events_statements_current.tokens', '12.50 MiB', '12.50 MiB'
'memory/performance_schema/events_statements_current.sqltext', '12.50 MiB', '12.50 MiB'
'memory/performance_schema/events_statements_summary_by_digest.tokens', '9.77 MiB', '9.77 MiB'
'memory/performance_schema/events_statements_history_long.tokens', '9.77 MiB', '9.77 MiB'
'memory/performance_schema/events_statements_history_long.sqltext', '9.77 MiB', '9.77 MiB'
'memory/innodb/row0sel', '8.96 MiB', '2.56 GiB'
'memory/sql/Gtid_set::Interval_chunk', '6.53 MiB', '22.57 MiB'
'memory/performance_schema/events_stages_summary_by_thread_by_event_name', '5.86 MiB', '5.86 MiB'
```
[24 May 2022 12:56]
pavel rekun
OK, going over the "SHOW ENGINE INNODB STATUS;" output I attached before, I noticed that we have deadlocks happening. These deadlocks happen only on the master (this explains why the memory leak happens only on the primary node and never on the slaves). So I think that as time goes by and more deadlocks happen on the master, the memory leaks. As of now the server has been running for 7 days, and after 613 deadlocks (SELECT `count` FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE NAME="lock_deadlocks";) the server has leaked about 1.2 GB compared to the slaves. Is this making sense to you? Are you aware of such a scenario?
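A sketch of how each deadlock could be captured for correlation with the memory growth, alongside the counter query above (innodb_print_all_deadlocks is a standard dynamic variable in 5.7, so no restart is needed):

```sql
-- Write every InnoDB deadlock to the error log, not only the latest one
SET GLOBAL innodb_print_all_deadlocks = ON;

-- The running deadlock counter compared above
SELECT `count`
  FROM INFORMATION_SCHEMA.INNODB_METRICS
 WHERE NAME = 'lock_deadlocks';
```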
[24 May 2022 13:36]
MySQL Verification Team
Hi Pavel, I still cannot reproduce this. Even forcing deadlocks (a lot of them) I do not see the leaks, nor do I see the slowdown (I cannot get "select * from users limit 1000" to be slow whatever I do). Three things:
1. I think opening a ticket with MySQL Support would be beneficial to you.
2. If you can test this with 8.0, that would be interesting.
3. Gather more data - namely, profile the MySQL query ( https://dev.mysql.com/doc/refman/5.7/en/show-profile.html ), for example as sketched below. This might give us an idea of what is going on, since even with the memory leak you say you have more than 40% free RAM, so it does not look like the memory leak itself is the problem.
thanks
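Roughly, a profiling session would look like this (a sketch; SHOW PROFILE is session-scoped and deprecated but still present in 5.7, and the query here is just the example statement from above):

```sql
SET profiling = 1;                 -- enable the profiler for this session
SELECT * FROM users LIMIT 1000;    -- the statement that becomes slow on your master
SHOW PROFILES;                     -- list profiled statements with their query IDs
SHOW PROFILE FOR QUERY 1;          -- stage-by-stage timing for the first profiled statement
```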
[30 May 2022 9:57]
pavel rekun
OK, a few updates: all deadlocks were basically removed (a code-side solution) - we'll see if it affects the reported issue.

As to the memory leak, I'll rephrase: it's not an actual memory leak but looks like resource exhaustion within the allowed allocation of the MySQL master instance. In our case each instance is set to a 21 GB InnoDB buffer pool; over time (it takes about a week) the memory usage on the master node climbs to about 27.5 GB (commit size), and then we start to notice the slowdowns on every query, while on the slaves it stays at 25 GB. Both slaves and master have about 700 connections at any given time. The only difference load-wise is that on the master we run about 5k queries/sec while on the slaves around 700 queries/sec. Any obvious setting (settings included in the initial post) that could cause the master to run out of resources? The machine still has plenty of memory (around 20 GB free). I also went through the memory statistics of MySQL and didn't see any memory leak.

Other things we changed - these per-connection limits were changed from 2 GB to 210 MB on 25/5/22 (should be 1% of MySQL memory):
tmp_table_size=210M
max_heap_table_size=210M

I also noticed strange behavior where several of our apps were constantly closing and reopening connections - also fixed (might be related?)
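Whether the smaller per-connection limits are now forcing implicit temporary tables to disk can be checked with standard status counters (a sketch):

```sql
-- Current per-connection limits
SELECT @@global.tmp_table_size, @@global.max_heap_table_size;

-- How many implicit temp tables were created, and how many spilled to disk
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
```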
[30 May 2022 10:04]
pavel rekun
As to your previous points: testing with MySQL 8 is a bit of an issue. Since we can't reproduce the case on demand, it would require us to upgrade to MySQL 8.0 (not sure if the issue would be solved, or even made worse), so we'll hold off on this until we've tried all other options (though we'll need to upgrade at some point). As to the profiler, I'll run it when the issue happens (it can take weeks, so let's see). Appreciate the help.
[30 May 2022 11:51]
MySQL Verification Team
Hi, thanks for the updates. As I mentioned, you would really benefit from the MySQL Support team, as they can help you track down issues in your code and in your configuration and get the maximum out of your system. I know MySQL Support is not free, but IMHO it is the best value for the money in the entire RDBMS ecosystem. Through the bugs system we can only track and fix bugs in the MySQL code. I cannot get a simple query like that to be slow no matter what I do, so it is important to find out what is making it slow. The query profiler will help; also, check all the server stats (IO and CPU load, RAM status, etc.) at the moment of the slow query. kind regards
[1 Jun 2022 6:56]
pavel rekun
SUCCESS :) Managed to reproduce the memory leak (yes, it's indeed a memory leak). Tested with MySQL 5.7.36 x64, JDK 1.7. Basically I set innodb_lock_wait_timeout=1 to speed up the deadlocks, and this is the code I ran. With innodb_buffer_pool_size=8M, after about 10 hours the mysqld instance went from 150 MB to 3 GB.

```java
// Requires: java.sql.*, java.util.concurrent.CountDownLatch,
// java.util.concurrent.atomic.AtomicInteger.
// closeResultSet/closeStatement/closeConnection are my null-safe close helpers (not shown).
private static void testMysqlLeak() {
    int threads = 50;
    // Note: the latch only synchronizes the first round; later rounds rely on lock contention alone.
    final CountDownLatch latch = new CountDownLatch(threads);
    final AtomicInteger activeThreads = new AtomicInteger();
    while (true) {
        for (int i = 0; i < threads; i++) {
            final int threadId = i;
            activeThreads.incrementAndGet();
            new Thread(new Runnable() {
                public void run() {
                    try {
                        ResultSet res = null;
                        Statement stat = null;
                        Connection con = null;
                        try {
                            String dbURL = "jdbc:mysql://localhost:3307/test";
                            String username = "root";
                            String password = "123";
                            Class.forName("com.mysql.jdbc.Driver");
                            con = DriverManager.getConnection(dbURL, username, password);
                            con.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
                            int count = 0;
                            con.setAutoCommit(false);
                            stat = con.createStatement();
                            System.out.println("thread:" + threadId);
                            latch.countDown();
                            latch.await();
                            // Threads 0-4 update the row id=1 directly; the others update id=2
                            // and then id=1, so the two groups deadlock against each other.
                            if (threadId < 5) {
                                stat.execute("update admin_users set last_login=now() where id=1");
                            } else {
                                stat.execute("update admin_users set last_login=now() where id=2");
                                stat.execute("update admin_users set password=now() where id=1");
                            }
                            Thread.sleep(5000);
                            con.commit();
                            System.out.println("done");
                            count++;
                            if (count % 1000 == 0)
                                System.out.println("count:" + count);
                        } catch (Exception e) {
                            e.printStackTrace();
                        } finally {
                            closeResultSet(res);
                            closeStatement(stat);
                            closeConnection(con);
                            activeThreads.decrementAndGet();
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
        // Wait for all workers of this round to finish before starting the next round.
        while (activeThreads.get() != 0) {
            try {
                System.out.println("active threads:" + activeThreads.get());
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
```

```sql
DROP TABLE IF EXISTS `test`.`admin_users`;
CREATE TABLE `test`.`admin_users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_name` varchar(255) DEFAULT NULL,
  `magic` varchar(255) DEFAULT NULL,
  `password` varchar(255) DEFAULT NULL,
  `last_login` datetime DEFAULT NULL,
  `session_id` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniqe` (`user_name`),
  KEY `user_name` (`user_name`),
  KEY `magic` (`magic`),
  KEY `password` (`password`),
  KEY `last_login` (`last_login`),
  KEY `session_id` (`session_id`)
) ENGINE=InnoDB AUTO_INCREMENT=456339 DEFAULT CHARSET=utf8;
```

Let me know if you need anything else on your end to run it.
P.S. - it has nothing to do with InnoDB Cluster...
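While the loop runs, the growth can also be watched from inside the server (a sketch, assuming the performance_schema memory instruments are enabled as discussed above):

```sql
-- Total memory currently tracked by all instrumented allocators
SELECT * FROM sys.memory_global_total;

-- Top consumers, to see where the growth accumulates
SELECT event_name, current_alloc
  FROM sys.memory_global_by_current_bytes
 LIMIT 10;
```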
[1 Jun 2022 12:01]
pavel rekun
Tested on 8.0.29 - NOT reproducible.
[11 Jul 2022 8:55]
MySQL Verification Team
I'm closing this. I cannot reproduce it with the latest source/binaries. Since you cannot reproduce it with 8.0.29 either, I'm considering this closed. thanks
[18 Oct 2022 11:44]
pavel rekun
Hi. This is still happening with the latest 5.7.40. I just noticed I forgot to add the rows you need in the table to create the actual deadlock. Also, let me remind you that I'm testing this on Windows - not sure if it happens on Linux...

```sql
insert into test.admin_users(id,user_name) values(1,'user1');
insert into test.admin_users(id,user_name) values(2,'user2');
```

Please test it again and let me know - this is a very simple test that is easily reproducible every time and exposes a huge issue. If anything further is needed, please let me know.