Bug #99315 MySQL hangs during purge binary logs operation
Submitted: 21 Apr 2020 4:28 Modified: 6 May 2020 13:50
Reporter: sampath choudary kamineni Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Logging Severity:S2 (Serious)
Version:5.7.29 OS:CentOS (2.6.32-754.28.1.el6.x86_64)
Assigned to: CPU Architecture:x86

[21 Apr 2020 4:28] sampath choudary kamineni
Description:
While running MySQL 5.7.29 on one of our replicas, replicating from a 5.6 Master . We observed occasional freezes in mysqld. On troubleshooting we found that this is happening when we issue during purge binary logs , so I set expire_logs_days to 0 and the occasional hangs do not happen. 
When I manually issue `purge binary logs to '';`  command , the purge takes around 10 minutes even to purge one binary log file . and mysqld freezes for that period, and goes back to normal once the purge is done .

Server configuration:
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| binlog_gtid_simple_recovery      | OFF   |
| enforce_gtid_consistency         | OFF   |
| gtid_executed                    |       |
| gtid_executed_compression_period | 1000  |
| gtid_mode                        | OFF   |
| gtid_owned                       |       |
| gtid_purged                      |       |
| session_track_gtids              | OFF   |
+----------------------------------+-------+mysql> show global variables like '%bin%';
+--------------------------------------------+--------------------------------+
| Variable_name                              | Value                          |
+--------------------------------------------+--------------------------------+
| bind_address                               | *                              |
| binlog_cache_size                          | 32768                          |
| binlog_checksum                            | CRC32                          |
| binlog_direct_non_transactional_updates    | OFF                            |
| binlog_error_action                        | IGNORE_ERROR                   |
| binlog_format                              | MIXED                          |
| binlog_group_commit_sync_delay             | 0                              |
| binlog_group_commit_sync_no_delay_count    | 0                              |
| binlog_gtid_simple_recovery                | OFF                            |
| binlog_max_flush_queue_time                | 0                              |
| binlog_order_commits                       | OFF                            |
| binlog_row_image                           | FULL                           |
| binlog_rows_query_log_events               | OFF                            |
| binlog_stmt_cache_size                     | 32768                          |
| binlog_transaction_dependency_history_size | 25000                          |
| binlog_transaction_dependency_tracking     | COMMIT_ORDER                   |
| innodb_api_enable_binlog                   | OFF                            |
| innodb_locks_unsafe_for_binlog             | OFF                            |
| log_bin                                    | ON                             |
| log_bin_basename                           | /mnt/ddisk/mysql/bin-log       |
| log_bin_index                              | /mnt/ddisk/mysql/bin-log.index |
| log_bin_trust_function_creators            | OFF                            |
| log_bin_use_v1_row_events                  | OFF                            |
| log_statements_unsafe_for_binlog           | ON                             |
| max_binlog_cache_size                      | 18446744073709547520           |
| max_binlog_size                            | 1073741824                     |
| max_binlog_stmt_cache_size                 | 18446744073709547520           |
| sql_log_bin                                | ON                             |
| sync_binlog                                | 0                              |
+--------------------------------------------+--------------------------------+

Stack Trace using pt-pmp will be attached.

How to repeat:
manually issue a  `purge binary logs to '';`  command .
[21 Apr 2020 4:31] sampath choudary kamineni
trace using pt-pmp

Attachment: st.txt (text/plain), 9.87 KiB.

[21 Apr 2020 4:32] sampath choudary kamineni
full pstack

Attachment: pstack_3605_11 (application/octet-stream, text), 346.69 KiB.

[21 Apr 2020 16:25] MySQL Verification Team
Thank you for the bug report. Duplicate of https://bugs.mysql.com/bug.php?id=83526, https://bugs.mysql.com/bug.php?id=71901,https://bugs.mysql.com/bug.php?id=44150.
[21 Apr 2020 20:40] sampath choudary kamineni
Hi Miguel, Thanks for checking on this. 
I feel like this is not a duplicate of above mentioned bug as in my case, mysql freezes even when I try to purge 1 binary log file manually 
Purge of one binary log file of size 300MB takes around 13 minutes and all connects to mysql freezes for that period.
I have attached the pt-pmp trace for purging one log file here .

Could anyone please help me in figuring out this part from my pt-pmp trace that I attached , I am confused on the check for gtid state as I dont have it enabled.

```
1 read(libpthread.so.0),my_read,_my_b_read,Log_event::read_log_event,read_gtids_from_binlog,MYSQL_BIN_LOG::init_gtid_sets,MYSQL_BIN_LOG::purge_logs,purge_master_logs,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_rwlock_wrlock(libpthread.so.0),Sys_var_gtid_purged::global_value_ptr,sys_var::value_ptr,Item_func_get_system_var::val_str,get_system_var,PTI_variable_aux_3d::itemize,PTI_expr_with_alias::itemize,PT_select_item_list::contextualize,PT_select_options_and_item_list::contextualize,PT_select_part2::contextualize,PT_select_init2::contextualize,PT_select::contextualize,MYSQLparse,parse_sql,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_rwlock_wrlock(libpthread.so.0),show_master_status,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_rwlock_rdlock(libpthread.so.0),Gtid_state::update_gtids_impl,MYSQL_BIN_LOG::finish_commit,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit,Xid_log_event::do_commit,Xid_apply_log_event::do_apply_event,Log_event::apply_event,apply_event_and_update_pos,exec_relay_log_event,handle_slave_sql,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6)
```

Thanks in advance . attachment to follow.
[21 Apr 2020 20:42] sampath choudary kamineni
Pt-pmp trace for purging one logfile.

Attachment: pt-pmp-purge-1-log.txt (text/plain), 11.97 KiB.

[22 Apr 2020 16:19] MySQL Verification Team
Hi,

I am not sure this is the duplicate of the mentioned bugs. 

Dev team is suggesting that this is likely the result of using binlog_gtid_simple_recovery=OFF and gtid_mode=OFF.

On 5.7.29, it is safe to switch to binlog_gtid_simple_recovery=ON, if the oldest binlog file uses 5.7.5 or newer (you can check with binlog). That should solve the problem.

I'm verifying the behavior. Please try the workaround.

all best
Bogdan
[23 Apr 2020 18:57] sampath choudary kamineni
Hi Bogdan , 
Thank you for looking into this. 
I shall update binlog_gtid_simple_recovery --> TRUE and post results here. 

Best,
Sampath Kamineni
[28 Apr 2020 21:01] sampath choudary kamineni
Thank you Bogdan , updating binlog_gtid_simple_recovery to TRUE fixed the issue . 

I Just wanted to confirm , if it is  expected to have this check inspite of having gtid turned off?

Thanks again for the help. Everything works fine now.
[29 Apr 2020 15:12] MySQL Verification Team
Hi,

Thanks for confirming the workaround works. I still think this is a bug, that is just a workaround so that system will work. Now how/when/if this will be fixed I can't say but since there is a workaround I don't think the priority will be very high.

in good health
Bogdan
[6 May 2020 13:50] MySQL Verification Team
Hi,

Final decision is that this is not a bug.

https://dev.mysql.com/doc/refman/5.7/en/replication-options-gtids.html#sysvar_binlog_gtid_... 

- binlog_gtid_simple_recovery=OFF is for compatibility with 5.7.5 and older.
It is documented that this is slow when there are many binary logs and
gtid_mode=OFF. (It is impossible to fix the slowness, since 5.7.5 and older
don't have necessary information in the binary log to make the procedure
fast.)
- binlog_gtid_simple_recovery=ON is the default and does not have this
problem.

all best
Bogdan