Bug #117003 | Memory usage increases and reaches the limit, causing the pod to restart | ||
---|---|---|---|
Submitted: | 17 Dec 2024 19:25 | Modified: | 13 Feb 20:33 |
Reporter: | serhii hladchenko | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Operator | Severity: | S3 (Non-critical) |
Version: | 8.4.3 | OS: | Ubuntu (ubuntu-2204-jammy-v20230919) |
Assigned to: | CPU Architecture: | x86 | |
Tags: | mysql-operator memory usage restart |
[17 Dec 2024 19:25]
serhii hladchenko
[27 Jan 13:00]
MySQL Verification Team
Hi Serhii, I have issues reproducing this with 8.4.4. I do not see anything between 8.4.3 an 8.4.4 that could affect this but can you please try if you can reproduce this with 8.4.4 too ?
[28 Jan 9:15]
serhii hladchenko
Hi, We have just updated the MySQL server 8.4.4. Keep an eye on it. I will text in 2-3 days
[31 Jan 9:56]
serhii hladchenko
I have installed community-server 8.4.4. The pod starts with about 800 MB of memory usage, reaches the 1.5 GB limit within two days, and reboots. It seems that this issue has not been handled in the new version. State: Running Started: Thu, 30 Jan 2025 01:23:04 +0200 Last State: Terminated Reason: OOMKilled Exit Code: 137 Started: Tue, 28 Jan 2025 11:30:18 +0200 Finished: Thu, 30 Jan 2025 01:22:55 +0200 Our configuration: Kubernetes cluster is k3s k3s-master1 Ready control-plane,master 128d v1.29.9+k3s1 Pod Images container-registry.oracle.com/mysql/community-operator:9.1.0-2.2.2 container-registry.oracle.com/mysql/community-router:8.4.4 container-registry.oracle.com/mysql/community-server:8.4.4 Additionally, I installed MySQL v8.4.4 today in my sandbox without any applications connected to it. In this configuration, MySQL 8.4.3 consumes memory. I will monitor version 8.4.4.
[31 Jan 18:23]
MySQL Verification Team
This does not look like a bug. How many connections do you have on your MySQL.
[31 Jan 18:58]
serhii hladchenko
The MySQL server has 40 connections. I have all MySQL dashboards in Kibana. It is a development environment, and we don't have a high load. Memory usage is increasing every hour, and when it reaches the pod limit, the pod crashes. Screenshots are here https://drive.google.com/drive/folders/17ax-4hUh2XSlKmwHfKb2m4EIkWgletpO?usp=sharing
[31 Jan 19:46]
MySQL Verification Team
Please check SYS schema to monitor RAM usage to see why/where the RAM is used up. 1G is pretty small limit. https://dev.mysql.com/doc/refman/8.4/en/monitor-mysql-memory-use.html Also use https://www.mysqlcalculator.com/ to calculate what amount of memory your MySQL server can allocate according to your config.
[31 Jan 20:24]
serhii hladchenko
I set limit 1500 Mb. I have loaded some screenshots from MySQL Workbench. https://drive.google.com/file/d/1bgncYBrcbebjZyaU6Ce1NESiCUkQSMAN/view?usp=drive_link
[31 Jan 20:29]
serhii hladchenko
We have a small amount of connection about 13-15. Max_connection is 40. We run very load instance Mysql in Google Cloud SQL Service that consumes less 1Gb.
[31 Jan 20:44]
serhii hladchenko
and We run our application on a bare metal server with huge memory, and Mysql usage memory is increase too. And It is reaching the limit of 4Gb. Seems our MySQL server can reach any limit.
[4 Feb 9:48]
MySQL Verification Team
Hi, MySQL can easily use hundreds of gigabytes of RAM that is not a problem. It all depends on how you configure it. As I wrote earlier, please use https://www.mysqlcalculator.com/ to calculate your MySQL memory usage. Or let me know this: select @@key_buffer_size + @@tmp_table_size + @@innodb_buffer_pool_size + @@innodb_log_buffer_size; select @@sort_buffer_size + @@read_buffer_size + @@read_rnd_buffer_size + @@join_buffer_size + @@thread_stack + @@binlog_cache_size ; Thanks
[4 Feb 10:23]
serhii hladchenko
+---------------------------------------------------------------------------------------------+ | @@key_buffer_size + @@tmp_table_size + @@innodb_buffer_pool_size + @@innodb_log_buffer_size | +---------------------------------------------------------------------------------------------+ | 629145600 | +---------------------------------------------------------------------------------------------+ +------------------------------------------------------------------------------------------------------------------------------+ | @@sort_buffer_size + @@read_buffer_size + @@read_rnd_buffer_size + @@join_buffer_size + @@thread_stack + @@binlog_cache_size | +------------------------------------------------------------------------------------------------------------------------------+ | 1998848 | +------------------------------------------------------------------------------------------------------------------------------+
[4 Feb 10:31]
MySQL Verification Team
Hi, This is showing some 700MB of use, not more. Not sure where is it leaking memory. I cannot reproduce this, for me this works normally. You will have to monitor memory usage as described in https://dev.mysql.com/doc/refman/8.4/en/monitor-mysql-memory-use.html so we can try to find what is going on. logging through time and especially before it crashes something like this: SELECT SUBSTRING_INDEX(event_name,'/',2) AS code_area, FORMAT_BYTES(SUM(current_alloc)) AS current_alloc FROM sys.x$memory_global_by_current_bytes GROUP BY SUBSTRING_INDEX(event_name,'/',2) ORDER BY SUM(current_alloc) DESC; This should show where the memory is going.
[4 Feb 10:35]
serhii hladchenko
We have deployed MySQL Server 8.4.4 on an empty K3s cluster without any load. It started using 509 MB and reached 515 MB in 3 days. We have also deployed MySQL Server on a bare-metal server with a large amount of memory and application load. It reached 2 GB in 3.5 days. Additionally, we have deployed MySQL 8.4.4 in a development environment with a pod limit of 1.5 GB. Since we don’t have much memory in development, it hit the limit and crashed due to an OOM kill in 2 days.
[4 Feb 10:37]
serhii hladchenko
-> ORDER BY SUM(current_alloc) DESC; +---------------------------+---------------+ | code_area | current_alloc | +---------------------------+---------------+ | memory/innodb | 673.72 MiB | | memory/performance_schema | 235.76 MiB | | memory/group_rpl | 125.64 MiB | | memory/sql | 28.17 MiB | | memory/temptable | 12.00 MiB | | memory/mysys | 9.48 MiB | | memory/mysqld_openssl | 1.41 MiB | | memory/refcache | 37.95 KiB | | memory/mysqlx | 23.70 KiB | | memory/vio | 2.25 KiB | | memory/myisam | 728 bytes | | memory/csv | 120 bytes | | memory/blackhole | 120 bytes | +---------------------------+---------------+
[4 Feb 10:42]
serhii hladchenko
You said ~700 MB, but the kubectl command shows more.sgladc@C07ZF2EMJYW0:~$ kubectl top pod mysql-0 --containers POD NAME CPU(cores) MEMORY(bytes) mysql-0 mysql 47m 963Mi mysql-0 sidecar 0m 130Mi
[4 Feb 18:24]
serhii hladchenko
Eight hours have passed. The pod consumes more than 100 MB of memory, and memory/group_rpl has increased. +---------------------------------------------------------------------------------------------+ | @@key_buffer_size + @@tmp_table_size + @@innodb_buffer_pool_size + @@innodb_log_buffer_size | +---------------------------------------------------------------------------------------------+ | 629145600 | +---------------------------------------------------------------------------------------------+ +------------------------------------------------------------------------------------------------------------------------------+ | @@sort_buffer_size + @@read_buffer_size + @@read_rnd_buffer_size + @@join_buffer_size + @@thread_stack + @@binlog_cache_size | +------------------------------------------------------------------------------------------------------------------------------+ | 1998848 | +------------------------------------------------------------------------------------------------------------------------------+ +---------------------------+---------------+ | code_area | current_alloc | +---------------------------+---------------+ | memory/innodb | 683.86 MiB | | memory/performance_schema | 253.89 MiB | | memory/group_rpl | 235.00 MiB | | memory/sql | 37.04 MiB | | memory/mysys | 9.42 MiB | | memory/temptable | 7.00 MiB | | memory/mysqld_openssl | 1.04 MiB | | memory/refcache | 36.30 KiB | | memory/mysqlx | 23.70 KiB | | memory/vio | 1.16 KiB | | memory/myisam | 728 bytes | | memory/csv | 120 bytes | | memory/blackhole | 120 bytes | +---------------------------+---------------+ sgladc@C07ZF2EMJYW0:~$ kubectl top pod mysql-0 --containers POD NAME CPU(cores) MEMORY(bytes) mysql-0 mysql 40m 1127Mi mysql-0 sidecar 0m 145Mi
[4 Feb 18:34]
serhii hladchenko
Total Memory Used in Group Replication MySQL localhost:3306 ssl SQL > SELECT * FROM ( -> SELECT -> (CASE -> WHEN EVENT_NAME LIKE 'memory/group_rpl/%' -> THEN 'memory/group_rpl/memory_gr' -> ELSE 'memory_gr_rest' -> END) AS EVENT_NAME, -> SUM(COUNT_ALLOC), SUM(COUNT_FREE), -> SUM(SUM_NUMBER_OF_BYTES_ALLOC), -> SUM(SUM_NUMBER_OF_BYTES_FREE), SUM(LOW_COUNT_USED), -> SUM(CURRENT_COUNT_USED), SUM(HIGH_COUNT_USED), -> SUM(LOW_NUMBER_OF_BYTES_USED), SUM(CURRENT_NUMBER_OF_BYTES_USED), -> SUM(HIGH_NUMBER_OF_BYTES_USED) -> FROM performance_schema.memory_summary_global_by_event_name -> GROUP BY (CASE -> WHEN EVENT_NAME LIKE 'memory/group_rpl/%' -> THEN 'memory/group_rpl/memory_gr' -> ELSE 'memory_gr_rest' -> END) -> ) f -> WHERE f.EVENT_NAME != 'memory_gr_rest'\G *************************** 1. row *************************** EVENT_NAME: memory/group_rpl/memory_gr SUM(COUNT_ALLOC): 957109 SUM(COUNT_FREE): 858876 SUM(SUM_NUMBER_OF_BYTES_ALLOC): 1094666442 SUM(SUM_NUMBER_OF_BYTES_FREE): 845341985 SUM(LOW_COUNT_USED): 0 SUM(CURRENT_COUNT_USED): 98233 SUM(HIGH_COUNT_USED): 98423 SUM(LOW_NUMBER_OF_BYTES_USED): 0 SUM(CURRENT_NUMBER_OF_BYTES_USED): 249324457 SUM(HIGH_NUMBER_OF_BYTES_USED): 249403622 1 row in set (0.0198 sec)
[4 Feb 18:36]
serhii hladchenko
Could you explain how to identify the memory consumer? I found the following query examples in the documentation: Memory Used to Capture Transactions Memory Used to Broadcast Transactions Total Memory Used in Group Replication Memory Used in Certification Memory Used in Certification Memory Used in Replication Pipeline Memory Used in Consistency Memory Used in Delivery Message Service Memory Used to Broadcast and Receive Transactions
[4 Feb 19:01]
serhii hladchenko
We have executed a SELECT query on the production server. The application is running but without any load. I see that memory/group_rpl is significantly high. code_area,current_alloc memory/group_rpl,"1023.99 MiB" memory/innodb,"280.93 MiB" memory/performance_schema,"235.83 MiB" memory/sql,"32.02 MiB" memory/mysys,"9.43 MiB" memory/temptable,"4.00 MiB" memory/mysqld_openssl,"869.23 KiB" memory/refcache,"41.24 KiB" memory/mysqlx,"23.70 KiB" memory/vio,"1.16 KiB" memory/myisam," 728 bytes" memory/csv," 120 bytes" memory/blackhole," 120 bytes"
[4 Feb 19:04]
serhii hladchenko
We have only one server without any replication in both dev and prod. Can I disable the entire replication group, or do I need to investigate the issue further? memory/group_rpl/write_set_encoded | YES | | memory/group_rpl/certification_data | YES | | memory/group_rpl/certification_data_gc | YES | | memory/group_rpl/certification_info | YES | | memory/group_rpl/transaction_data | YES | | memory/group_rpl/sql_service_command_data | YES | | memory/group_rpl/mysql_thread_queued_task | YES | | memory/group_rpl/message_service_queue | YES | | memory/group_rpl/message_service_received_message | YES | | memory/group_rpl/group_member_info | YES | | memory/group_rpl/consistent_members_that_must_prepare_transaction | YES | | memory/group_rpl/consistent_transactions | YES | | memory/group_rpl/consistent_transactions_prepared | YES | | memory/group_rpl/consistent_transactions_waiting | YES | | memory/group_rpl/consistent_transactions_delayed_view_change | YES | | memory/group_rpl/GCS_XCom::xcom_cache | YES | | memory/group_rpl/Gcs_message_data::m_buffer | YES |
[5 Feb 8:03]
serhii hladchenko
It is a development environment. memory/group_rpl has increased by 200 MB in 12 hours. +---------------------------+---------------+ | code_area | current_alloc | +---------------------------+---------------+ | memory/innodb | 689.24 MiB | | memory/group_rpl | 434.45 MiB | | memory/performance_schema | 254.58 MiB | | memory/sql | 43.02 MiB | | memory/mysys | 9.46 MiB | | memory/temptable | 7.00 MiB | | memory/mysqld_openssl | 1019.70 KiB | | memory/refcache | 37.95 KiB | | memory/mysqlx | 23.70 KiB | | memory/vio | 2.25 KiB | | memory/myisam | 728 bytes | | memory/csv | 120 bytes | | memory/blackhole | 120 bytes | +---------------------------+---------------+
[5 Feb 8:14]
serhii hladchenko
Could you explain how to identify the memory consumer within memory/group_rpl? If I am using only one server, is it possible to disable Group Replication Instrumentation?
[5 Feb 15:19]
MySQL Verification Team
Hi, I am not sure if this is a bug. You are running mysql-innodbcluster-2.2.2 image but there is no innodb cluster configured. I will verify this as I think it should not behave like this even if you do use mysql-innodbcluster-2.2.2 but if you are not using group replication / innodb cluster you should use regular mysql:9.2
[5 Feb 19:54]
serhii hladchenko
Hi, yes I am using mysql-innodbcluster-2.2.2.tgz. Here is values.yaml credentials: root: user: password: host: "%" serverVersion: 8.4.4 tls: useSelfSigned: true serverInstances: 1 router: instances: 1 podSpec: containers: - name: mysql resources: requests: memory: "1Gi" limits: memory: "1536Mi" datadirVolumeClaimTemplate: storageClassName: standard resources: requests: storage: 10Gi backupProfiles: - name: dump-instance dumpInstance: dumpOptions: excludeTables: - outofschool.statisticreportdata storage: persistentVolumeClaim: claimName: mysql-backup-pvc backupSchedules: - name: backup-schedule schedule: "0 2 * * *" deleteBackupData: true backupProfileName: dump-instance enabled: true serverConfig: mycnf: | [mysqld] default-time-zone = 'Europe/Kiev' innodb_buffer_pool_size=512M innodb_buffer_pool_instances=1 innodb_log_file_size=50M max_connections=150 innodb_log_buffer_size=64M max_allowed_packet=256M
[5 Feb 20:05]
serhii hladchenko
Can I disable the replication variable group_replication_start_on_boot=OFF if I don't use it?
[13 Feb 20:33]
serhii hladchenko
Are you checking something? Any results? I asked if we can disable replication.