Some more observations from October 14th on Debian2: Lots of checking table & waiting... State Info Checking table INSERT INTO `aboTrackEmails200510` (id, email, subject, datumZeit, headers, body) VALUES ( Waiting for table INSERT INTO `aboTrackEmails200510` (id, email, subject, datumZeit, headers, body) VALUES ( Waiting for table INSERT INTO `aboTrackEmails200510` (id, email, subject, datumZeit, headers, body) VALUES ( Waiting for table INSERT INTO `aboTrackEmails200510` (id, email, subject, datumZeit, headers, body) VALUES ( Waiting for table INSERT INTO `aboTrackEmails200510` (id, email, subject, datumZeit, headers, body) VALUES ( Checking table DELETE FROM pfrd WHERE anbieter = '0' /* Link_ID Resource id No.12 | Host_Master pz-server1.pf Waiting for table SELECT * FROM pfrd WHERE id = '35439' /* Link_ID Resource id #13 | Host_Slave localhost | SERVE statistics SELECT idRasse FROM rassen WHERE bezeichnung = 'Zuchtverbände' AND kurz > '' /* Link_ID Resou Checking table SELECT idNews, titelNews FROM news WHERE idNews = '8067' /* Link_ID Resource id #13 | Host_Slave Updating UPDATE page_counter SET counter = counter + 1 WHERE pagename = 'google13051' /* Link_ID Reso Locked UPDATE page_counter SET counter = counter + 1 WHERE pagename = 'google13050' /* Link_ID Reso Locked SELECT counter FROM page_counter  WHERE pagename = 'pfrd7-2005-10-14' /*firstElement*/ /* Link_ Updating UPDATE anbieter_log_0510 SET log468 = log468 + 1 WHERE idAnbieter = '22' /* Link_ID Resource Locked UPDATE anbieter_log_0510 SET logKennenSieSchon = logKennenSieSchon + 1 WHERE idAnbieter = '2 statistics SELECT count(*) num FROM sessData WHERE datum > DATE_SUB(NOW(), INTERVAL 30 MINUTE) /* Link_ID Resou statistics SELECT varsVal FROM sessData WHERE sid = '6edb0371f228675b2655817f0ff8d0b31129309483' /* Link_ID  statistics SELECT varsVal FROM sessData WHERE sid = 'dc9941e2f9738d045e538402110df1291126690855' /* Link_ID     NULL      NULL      NULL   Reading from net    NULL      NULL   SHOW PROCESSLIST Has sent all binlog to slave; waiting for binlog to be updated    NULL   Has sent all binlog to slave; waiting for binlog to be updated    NULL   Has sent all binlog to slave; waiting for binlog to be updated    NULL   Has sent all binlog to slave; waiting for binlog to be updated    NULL   This looks as if the tables had problems and were checked automatically. Hence all statements have to wait - no problem. But here we have a Select Locked without apparent reason ... State Info Has sent all binlog to slave; waiting for binlog to be updated    NULL   Has sent all binlog to slave; waiting for binlog to be updated    NULL   Has sent all binlog to slave; waiting for binlog to be updated    NULL   Locked SELECT counter FROM page_counter WHERE pagename = 'AnzahlReitbeteiligung' /*firstElement*/ /* Li Locked REPLACE INTO sessData (sid, varsVal) VALUES ('6edb0371f228675b2655817f0ff8d0b31129309483', ' Checking table (SELECT idEdSc, keySc  FROM secPage WHERE 1 AND (( MATCH (itemSc) AGAINST ('haflinger') ))  ORDE Updating UPDATE page_counter SET counter = counter + 1 WHERE pagename = 'google13051' /* Link_ID Reso statistics SELECT  SQL_CALC_FOUND_ROWS   * FROM comments WHERE url = 'rubrikenFormOptionenSql5' ORDER BY statistics SELECT idPedigree FROM pedigree WHERE idReference = '34167' AND typReference = 'Pfrdanzeige' statistics SELECT varsVal FROM sessData WHERE sid = '6424105a527761ae2abef69854a99f0c1129308620' /* Link_ID     NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL   Writing to net    NULL      NULL   SHOW PROCESSLIST The statements giving rise to a lock are really trivial from an explain point of view. There should not be a lock at all as it should resolve in no time. MySQL master>explain select * FROM pfrd WHERE anbieter = '0' ; +-------+------+---------------+----------+---------+-------+------+-------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+----------+---------+-------+------+-------------+ | pfrd | ref | anbieter | anbieter | 2 | const | 1 | Using where | +-------+------+---------------+----------+---------+-------+------+-------------+ 1 row in set (0.00 sec) MySQL master>explain select * from anbieter_log_0510 WHERE idAnbieter = '22' -> ; +-------------------+-------+---------------+---------+---------+-------+------+-------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------------------+-------+---------------+---------+---------+-------+------+-------+ | anbieter_log_0510 | const | PRIMARY | PRIMARY | 3 | const | 1 | | +-------------------+-------+---------------+---------+---------+-------+------+-------+ 1 row in set (0.00 sec) Now new stuff from October 16th: The addresses are ***.***.116.25 and ***.***.119.56 Linux 2.4.31 It looks like we have a stable test situation. Another test: ALTER TABLE `robinson` CHANGE `email` `email` VARCHAR(99) NOT NULL; Before: MySQL Debian>desc robinson; +--------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------+------------------+------+-----+---------+----------------+ | id_r | int(10) unsigned | | PRI | NULL | auto_increment | | email | varchar(100) | | UNI | | | | tmstmp | timestamp(14) | YES | | NULL | | +--------+------------------+------+-----+---------+----------------+ 3 rows in set (0.04 sec) After: MySQL Debian>desc robinson; +--------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------+------------------+------+-----+---------+----------------+ | id_r | int(10) unsigned | | PRI | NULL | auto_increment | | email | varchar(100) | | UNI | | | | tmstmp | timestamp(14) | YES | | NULL | | +--------+------------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec) Before: Debian2>desc robinson; +--------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------+------------------+------+-----+---------+----------------+ | id_r | int(10) unsigned | | PRI | NULL | auto_increment | | email | varchar(100) | | UNI | | | | tmstmp | timestamp(14) | YES | | NULL | | +--------+------------------+------+-----+---------+----------------+ 3 rows in set (0.02 sec) After: Debian2>desc robinson; +--------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------+------------------+------+-----+---------+----------------+ | id_r | int(10) unsigned | | PRI | NULL | auto_increment | | email | varchar(99) | | UNI | | | | tmstmp | timestamp(14) | YES | | NULL | | +--------+------------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec) I could not capture anything significant in processlist - of course, the ALTER TABLE statement is processed in no time. User Host db Command Time State *** localhost *** Query 0 Writing to net *** localhost *** Query 19 Sorting result *** localhost *** Query 8 Sending data *** localhost *** Sleep 0 *** localhost *** Sleep 1 *** localhost *** Sleep 26 *** localhost *** Sleep 28780 *** pz-server1.***.com:9132 *** Sleep 1 *** pz-server2.***.com:19430 *** Sleep 1297 *** pz-server2.***.com:19841 *** Sleep 1063 *** pz-server2.***.com:20575 *** Sleep 756 *** pz-server2.***.com:20730 *** Sleep 671 *** pz-server2.***.com:22022 *** Sleep 1 *** pz-server2.***.com:22060 *** Sleep 0 *** pz-server2.***.com:22071 *** Sleep 1 *** pz-server2.***.com:22074 *** Sleep 0 *** pz-server2.***.com:22077 *** Sleep 5 *** pz-server2.***.com:22080 *** Query 24 Sorting result *** pz-server2.***.com:22083 *** Sleep 10 *** pz-server2.***.com:22087 *** Query 21 Sorting result *** pz-server2.***.com:22099 *** Sleep 1 *** pz-server2.***.com:22116 *** Sleep 1 *** pz-server2.***.com:22120 *** Sleep 1 *** pz-server2.***.com:22129 *** Sleep 1 *** pz-server2.***.com:22143 *** Sleep 1 *** pz-server2.***.com:22147 *** Sleep 1 *** pz-server2.***.com:22150 *** Sleep 1 *** pz-server2.***.com:22154 *** Sleep 1 *** server******11625.***.info:42639 *** Query 9 Sorting result DELAYED *** Delayed_insert 11 Waiting for INSERT DELAYED *** Delayed_insert 12 Waiting for INSERT repl ***.***.119.56:47196     NULL   Binlog Dump 155937 Has sent all binlog to slave; waiting for binlog to be updated repl p548774B8.dip.t-dialin.net:3053     NULL   Binlog Dump 59821 Has sent all binlog to slave; waiting for binlog to be updated repl pz-server2.***.com:17154     NULL   Binlog Dump 6436 Has sent all binlog to slave; waiting for binlog to be updated repl server******11625.***.info:42768 NULL   Binlog Dump 0 Has sent all binlog to slave; waiting for binlog to be updated root localhost mysql Query 0     NULL   unauthenticated user server******11625.***.info:42769    NULL   Connect    NULL   Reading from net As you see, I have introduced DELAYED for counter processing to resolve the locking issues. Also, we have lots of connections from pz- server2, some of them sorting results, i.e. they are SELECTs which should go to the slave and not to the master... funny Okay, found the bug in my load balancing scheme. I tested on the position of the string select, and these were UNIONs, so the string started with '(SELECT' whence I sent it to the master... Right now a duplicate key error on REPLACE INTO ... on SuSE2, detected by my monitoring system. Master_Host: pz-server1.***.com Master_User: repl Master_Port: 3306 Connect_retry: 60 Master_Log_File: pz-server1-bin.731 Read_Master_Log_Pos: 121227671 Relay_Log_File: pz-server2-relay-bin.005 Relay_Log_Pos: 36949112 Relay_Master_Log_File: pz-server1-bin.731 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_do_db: Replicate_ignore_db: Last_errno: 1062 Last_error: Error 'Duplicate entry '2756445' for key 1' on query. Default database: '***'. Query: 'REPLACE INTO staticStrings SET ruri = 'rubrikenFormOptionenSql-updateCounter', staticString = '34732,35476,35300,35475,29972,29957,29974,34233,33335,35384,33983,32332,31760,31761,31565,35474,33548,33549,33550,35472,29066,35473,35468,32084,29965,29509,29508,28259,27544,24353,35033,34516,34241,20439,20275,25274,33663,33872,34547,21417,21712,35237,28128,35238,30971,31534,34927,35471,26539,34630' /* Link_ID Resource id #13 | connType Host_Slave localhost | SERVER_ADDR ***.***.83.74*/' Skip_counter: Exec_master_log_pos: 121186559 Relay_log_space: 36990224 Hypothesis: table corrupt. SuSE 22>check table staticStrings; +-----------------------------+-------+----------+----------+ | Table | Op | Msg_type | Msg_text | +-----------------------------+-------+----------+----------+ | *************.staticStrings | check | status | OK | +-----------------------------+-------+----------+----------+ 1 row in set (0.15 sec) Well, how can this be? Duplicate entry on REPLACE INTO and the table is healthy??? Next alert: MySQL Debian>REPLACE DELAYED INTO page_counter (pagename, counter) VALUES('AnzahlMesseseiten', '68'); ERROR 1031: Table handler for 'page_counter' doesn't have this option Huh? Debian2>REPLACE DELAYED INTO page_counter (pagename, counter) VALUES('AnzahlMesseseiten', '68'); Query OK, 1 row affected (0.00 sec) SuSE 22>REPLACE DELAYED INTO page_counter (pagename, counter) VALUES('AnzahlMesseseiten', '68'); Query OK, 1 row affected (0.00 sec) MySQL XP>REPLACE DELAYED INTO page_counter (pagename, counter) VALUES('AnzahlMesseseiten', '68'); Query OK, 1 row affected (0.00 sec) Ok, table must be corrupt! MySQL Debian>check table page_counter ; +----------------------------+-------+----------+----------+ | Table | Op | Msg_type | Msg_text | +----------------------------+-------+----------+----------+ | *************.page_counter | check | status | OK | +----------------------------+-------+----------+----------+ 1 row in set (0.08 sec) MySQL Debian>repair table page_counter; +----------------------------+--------+----------+--------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +----------------------------+--------+----------+--------------------------------------------------+ | *************.page_counter | repair | error | The handler for the table doesn't support repair | +----------------------------+--------+----------+--------------------------------------------------+ 1 row in set (0.00 sec) MySQL Debian>show create table page_counter; +--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | page_counter | CREATE TABLE `page_counter` ( `pagename` varchar(127) NOT NULL default '', `counter` int(11) NOT NULL default '0', PRIMARY KEY (`pagename`) ) TYPE=InnoDB PACK_KEYS=1 | +--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.01 sec) Oh, I see, I have changed type here. Hm. MySQL Debian>alter table page_counter type=myisam; Query OK, 16325 rows affected (0.38 sec) Records: 16325 Duplicates: 0 Warnings: 0 MySQL Debian>REPLACE DELAYED INTO page_counter (pagename, counter) VALUES('AnzahlMesseseiten', '68'); Query OK, 1 row affected (0.00 sec) According to http://dev.mysql.com/doc/refman/5.0/en/replace.html, there should be no problem with REPLACE DELAYED INTO and InnoDB. Nevertheless, I changed my code. http://dev.mysql.com/doc/refman/5.0/en/insert-delayed.html INSERT DELAYED works only with MyISAM, MEMORY, and ARCHIVE tables. Well, this indicates that there is a problem here, but then either the manual is not correct or we have a different problem. I introduced that syntax days ago, and used it numerous times, so it should work on InnoDB tables. MySQL Debian>alter table robinson type=innodb; Query OK, 4 rows affected (0.05 sec) Records: 4 Duplicates: 0 Warnings: 0 MySQL Debian>REPLACE DELAYED INTO robinson (email) VALUES('test@test.com'); ERROR 1031: Table handler for 'robinson' doesn't have this option Hm, looks like I'm wrong. The manual should be updated. Well, have to return to my urgent work now. Good luck!