Bug #19157 | Replication stops with Solaris machine | ||
---|---|---|---|
Submitted: | 18 Apr 2006 2:51 | Modified: | 2 Jul 2006 16:03 |
Reporter: | Haroon Anwar | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 4.1.18 | OS: | Linux (RHEL(Mast) Solaris(Slave)) |
Assigned to: | | CPU Architecture: | Any |
[18 Apr 2006 2:51]
Haroon Anwar
[18 Apr 2006 12:02]
Valeriy Kravchuk
Thank you for the problem report. Please send the my.cnf files from the master and both slaves.
[19 Apr 2006 0:16]
Haroon Anwar
Master my.cnf (RHEL 4.0 Master)
-----------------------------------------------
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
back_log = 50
max_connections = 100
max_connect_errors = 10
table_cache = 2048
max_allowed_packet = 16M
binlog_cache_size = 1M
max_heap_table_size = 64M
sort_buffer_size = 8M
join_buffer_size = 8M
thread_cache = 8
thread_concurrency = 40
query_cache_size = 64M
query_cache_limit = 2M
ft_min_word_len = 4
default_table_type = InnoDB
thread_stack = 192K
transaction_isolation = REPEATABLE-READ
tmp_table_size = 64M
log_bin
log_slow_queries
long_query_time = 2
log_long_format
tmpdir = /var/tmp/
server-id = 1
key_buffer_size = 32M
read_buffer_size = 2M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_max_extra_sort_file_size = 10G
myisam_repair_threads = 1
myisam_recover
skip-bdb
innodb_additional_mem_pool_size = 16M
innodb_buffer_pool_size = 2000M
innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend
innodb_data_home_dir = /var/lib/mysql/ibdata/
innodb_file_io_threads = 4
innodb_thread_concurrency = 16
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 1500M
innodb_log_files_in_group = 2
innodb_log_group_home_dir = /var/lib/mysql/iblogs/
innodb_max_dirty_pages_pct = 90
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout = 120
datadir = /var/lib/mysql
lower_case_table_names = 1
innodb_file_per_table = 1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[myisamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqlhotcopy]
interactive-timeout

[mysqld_safe]
open-files-limit = 8192

Slave1 my.cnf (RHEL 4.0 Slave)
----------------------------------------------
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
skip-locking
key_buffer = 16M
max_allowed_packet = 16M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
server-id = 2
innodb_data_home_dir = /var/lib/mysql/ibdata
innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/iblogs
innodb_buffer_pool_size = 16M
innodb_additional_mem_pool_size = 2M
innodb_log_file_size = 1500M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
innodb_log_files_in_group = 2
datadir = /var/lib/mysql
lower_case_table_names = 1
innodb_file_per_table = 1
tmpdir = /var/spool/mysql

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

Slave2 my.cnf (Running Solaris 10)
--------------------------------------------------
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
skip-locking
key_buffer = 16M
max_allowed_packet = 16M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
server-id = 3
relay-log=bentleigh-replica-relay-bin
log-warnings
innodb_data_home_dir = /var/lib/mysql/ibdata
innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/iblogs
innodb_buffer_pool_size = 16M
innodb_additional_mem_pool_size = 2M
innodb_log_file_size = 1500M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
innodb_log_files_in_group = 2
datadir = /var/lib/mysql
lower_case_table_names = 1
innodb_file_per_table = 1
tmpdir = /var/spool/mysql

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout
[19 Apr 2006 6:03]
Valeriy Kravchuk
Thank you for the additional information. I found a difference between my.cnf on the master and on (both) slaves.

On the master you have:

read_buffer_size = 2M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M

On both slaves you have:

read_buffer_size = 256K
read_rnd_buffer_size = 512K

and no bulk_insert_buffer_size.

I am not sure this can lead to the problem you described, but please send the results of SHOW VARIABLES LIKE 'bulk%'; from both of your slaves anyway, just to check for (possibly different) defaults.

Is there anything in the Solaris slave's error log for the relevant period?

Your version is noted as 4.1.18 in the "header" and 4.0.18 in your description. Please clarify which version is really used.
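The comparison above can also be done mechanically. The following is only a minimal sketch, not a tool used in this report: `parse_mysqld_section` and `diff_settings` are hypothetical helper names, the two sample strings are short excerpts of the posted files, and the parser assumes simple `name = value` lines with no includes or quoting.

```python
def parse_mysqld_section(text):
    """Return {option: value} for the [mysqld] section of my.cnf-style text."""
    settings = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section != "mysqld":
            continue
        name, sep, value = line.partition("=")
        # '-' and '_' are interchangeable in MySQL option names.
        key = name.strip().replace("-", "_")
        # Flag-style options such as log_bin have no value.
        settings[key] = value.strip() if sep else None
    return settings


def diff_settings(a, b, missing="<unset>"):
    """Return {option: (value_a, value_b)} for options that differ."""
    result = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key, missing), b.get(key, missing)
        if left != right:
            result[key] = (left, right)
    return result


# Excerpts of the posted configuration files (not the full files).
MASTER = """[mysqld]
read_buffer_size = 2M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M
"""
SLAVE = """[mysqld]
read_buffer_size = 256K
read_rnd_buffer_size = 512K
"""

differences = diff_settings(parse_mysqld_section(MASTER),
                            parse_mysqld_section(SLAVE))
for option, (master_value, slave_value) in differences.items():
    print(f"{option}: master={master_value} slave={slave_value}")
```

An option reported as `<unset>` on one side only means it is absent from that file; the server then uses its compiled-in default, which is why the SHOW VARIABLES output is still needed.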
[19 Apr 2006 8:02]
Haroon Anwar
Hi,

The version that I am using is 4.1.18. My mistake, sorry about that.

To help further, I upgraded MySQL on the Solaris machine to 5.0.20 to rule out a bug in older releases, but the situation is still the same.

To rule out the possibility that this problem is caused by the WAN connection, I installed version 4.1.18 on a Solaris machine on the LAN and tested replication with that machine. It was able to go through and was in sync with the master later on. So I believe this problem is related to the WAN connection.

Maybe the problem is that when we send a file as an attachment to the slaves over a slow connection, and the slave cannot execute an event for, say, x amount of time, it restarts the event and ends up in an endless loop. I am just guessing.

Further, transferring a 2M file over my connection takes around 1.5 minutes, but on the LAN it takes around 0.5 seconds. Maybe this information will help you sort out the problem.

Please find the requested information below.

Slave 1 (RHEL 4.0)
-----------------------------
mysql> SHOW VARIABLES LIKE 'bulk%';
+-------------------------+---------+
| Variable_name           | Value   |
+-------------------------+---------+
| bulk_insert_buffer_size | 8388608 |
+-------------------------+---------+
1 row in set (0.00 sec)

Slave 2 (Solaris 10)
-----------------------------
mysql> SHOW VARIABLES LIKE 'bulk%';
+-------------------------+---------+
| Variable_name           | Value   |
+-------------------------+---------+
| bulk_insert_buffer_size | 8388608 |
+-------------------------+---------+
1 row in set (0.01 sec)
[19 Apr 2006 8:53]
Valeriy Kravchuk
Thank you for the additional information and tests. I have not yet received an answer to the following question: is there anything in the Solaris slave's error log for the relevant period? Please also send the results of show variables like 'slave%'; from the problematic slave.
[19 Apr 2006 9:04]
Haroon Anwar
Hi Valeriy,

Many thanks for the quick reply. I have checked the error log many times, but to my surprise there is nothing in there. No error message at all. The last few lines say connected to master on xxx position. With no error message and everything behaving normally, I don't know where to start troubleshooting.

Anyway, please find the requested additional information below.

mysql> show variables like 'slave%';
+---------------------------+-------------------+
| Variable_name             | Value             |
+---------------------------+-------------------+
| slave_compressed_protocol | OFF               |
| slave_load_tmpdir         | /var/spool/mysql/ |
| slave_net_timeout         | 3600              |
| slave_skip_errors         | OFF               |
| slave_transaction_retries | 10                |
+---------------------------+-------------------+
5 rows in set (0.00 sec)

Cheers
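To put the figures quoted in this thread side by side, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not confirmed anywhere in the report: "2M" is read as 2 MiB, the largest possible event is taken to be max_allowed_packet (16M), and the link rate is assumed constant. `parse_size` and `transfer_seconds` are hypothetical helper names.

```python
def parse_size(text):
    """Convert a my.cnf-style size such as '16M' or '256K' to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def transfer_seconds(size_bytes, rate_bytes_per_second):
    """Time to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_second


# Rates derived from the reporter's measurements (assumed steady).
WAN_RATE = parse_size("2M") / 90.0   # 2M in about 1.5 minutes
LAN_RATE = parse_size("2M") / 0.5    # 2M in about 0.5 seconds

# Values taken from the posted my.cnf and SHOW VARIABLES output.
MAX_ALLOWED_PACKET = parse_size("16M")
SLAVE_NET_TIMEOUT = 3600             # seconds

wan_seconds = transfer_seconds(MAX_ALLOWED_PACKET, WAN_RATE)
lan_seconds = transfer_seconds(MAX_ALLOWED_PACKET, LAN_RATE)

print(f"WAN: {wan_seconds:.0f} s, LAN: {lan_seconds:.0f} s, "
      f"slave_net_timeout: {SLAVE_NET_TIMEOUT} s")
```

Under these assumptions a maximum-size event would need roughly 12 minutes over the WAN and about 4 seconds on the LAN, both well under slave_net_timeout. Note that slave_net_timeout measures how long the slave waits without receiving data, so a slow but steady transfer would not necessarily trigger it; the numbers are context, not a diagnosis.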
[24 Apr 2006 1:40]
Haroon Anwar
Hi, I was just checking on the status of this report. Is it a bug? Can you please let me know? Thanks
[2 Jun 2006 16:03]
Valeriy Kravchuk
I am not sure about the status. I really have no idea how to repeat the problem or what is causing it. Please send the results of show variables like 'slave%'; from the local RH slave, and the output of uname -a from your master and both slaves.
[2 Jul 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".