Bug #2518 Replication
Submitted: 26 Jan 2004 11:10 Modified: 21 Mar 2005 22:18
Reporter: Joao Neves Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:4.0.17 OS:Linux (Linux)
Assigned to: Guilhem Bichot CPU Architecture:Any

[26 Jan 2004 11:10] Joao Neves
Description:
I have replication set up for backup purposes on a very busy server, and my slave keeps stopping with errors like this one:

040126 15:54:42  Slave: Error 'Duplicate entry '223538' for key 1' on query 'INSERT INTO cookie set cod_browser_info=374, http_remote_addr="200.186.80.19", cod_cliente=58, cod_ip_info=644'. Default database: 'empauta', Error_code: 1062

The table definition for table 'cookie' is:
CREATE TABLE cookie (
  cod_cookie int(11) NOT NULL auto_increment,
  data timestamp(14) NOT NULL,
  http_remote_addr varchar(20) NOT NULL default '',
  cod_cliente int(11) default NULL,
  cod_ip_info int(11) NOT NULL default '0',
  cod_browser_info int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (cod_cookie),
  KEY cod_cliente (cod_cliente),
  KEY cod_ip_info (cod_ip_info)
) TYPE=MyISAM;

The strange thing is that in the original INSERT, cod_cookie was not specified, therefore it should auto-increment on both the master and slave servers. Every time the slave stops I check the slave for the duplicate key, but it doesn't exist. In the above example, if you issue "select cod_cookie, cod_cliente where cod_cookie=223538" on the slave it returns an empty result set. Apart from this erratic behaviour, which occurs randomly on several tables, my replication works great.

How to repeat:
The strange behaviour occurs randomly.
[26 Jan 2004 12:40] Guilhem Bichot
Hi,
This all looks promising, though it seems hard to repeat for us.
First:
- could you please check your tables on slave: CHECK TABLE ... EXTENDED;
you needn't stop replication; CHECK will lock the table, so replication may be blocked for some time but will resume when CHECK finishes.
- could you please show me the portion of the slave's relay log which is around:
040126 15:54:42  Slave: Error 'Duplicate entry '223538' for key 1' on
query 'INSERT INTO cookie set cod_browser_info=374,
http_remote_addr="200.186.80.19", cod_cliente=58, cod_ip_info=644'.
(I mean around the query above); you can read the relay log with mysqlbinlog;
I would like to see the 20 lines before and 20 lines after the INSERT query.
If the relay log is not available, I would like to see the corresponding part of the master's binary log. This is to see if the reported duplicate value for the key is the one displayed by mysqlbinlog in the "SET INSERT_ID=" command which should be just before the INSERT.
- are you using INSERT DELAYED ?
- are some clients updating the data on your slave ?
- when you notice replication stopped because of the error, did you try to type START SLAVE? did replication restart fine with no error?
- which Linux kernel? are you using our MySQL official binaries or compiled by yourself?
- are tables MyISAM on the master and on the slave? If only MyISAM, I guess you use no BEGIN/COMMIT/ROLLBACK commands, and you run with SET AUTOCOMMIT=1?
- a note: you said "it should auto-increment on both the master and slave
servers": no that's not how it works; it auto-increments on the master, but the master writes into the binary log the value which it used, so that the slave uses the same value.

Thanks,
Guilhem
[26 Jan 2004 13:16] Joao Neves
Hi,
Going down the list....
- I issued 'check table cookie extended' and the result was 'OK';
- The problem just happened again with the following error:
ERROR: 1062  Duplicate entry '223557' for key 1
040126 18:50:23  Slave: Error 'Duplicate entry '223557' for key 1' on query 'INSERT INTO cookie set cod_browser_info=83, http_remote_addr="200.142.58.19", cod_cliente=100, cod_ip_info=545'. Default database: 'empauta', Error_code: 1062

Here is the relay log around the INSERT:

#040126 18:49:05 server id 1  log_pos 173351371         Query   thread_id=66314 exec_time=0     e
rror_code=0
SET TIMESTAMP=1075150145;
delete from noticia_info_hoje where cod_noticia=815541520;
# at 1834784
#040126 18:49:05 server id 1  log_pos 173351466         Query   thread_id=66314 exec_time=0     e
rror_code=0
SET TIMESTAMP=1075150145;
delete from noticia_50 where cod_noticia=815541520;
# at 1834872
#040126 18:49:05 server id 1  log_pos 173351554         Query   thread_id=66314 exec_time=0     e
rror_code=0
use empauta_completo;
SET TIMESTAMP=1075150145;
delete from noticia_completo_200401 where cod_noticia=815541520;
# at 1834982
#040126 18:49:05 server id 1  log_pos 173351664         Query   thread_id=66314 exec_time=0     e
rror_code=0
use empauta;
SET TIMESTAMP=1075150145;
delete from noticia_completo_semana_1 where cod_noticia=815541520;
# at 1835085
#040126 18:49:06 server id 1  log_pos 173351767         Intvar
SET INSERT_ID=223557;
# at 1835113
#040126 18:49:06 server id 1  log_pos 173351795         Query   thread_id=66327 exec_time=0     e
rror_code=0
SET TIMESTAMP=1075150146;
INSERT INTO cookie set
                                                        cod_browser_info=83,
                                                        http_remote_addr="200.142.58.19",
                                                        cod_cliente=100,
                                                        cod_ip_info=545;
# at 1835290
#040126 18:49:06 server id 1  log_pos 173351972         Query   thread_id=66327 exec_time=0     e
rror_code=0
use empauta_log;
SET TIMESTAMP=1075150146;
insert into log_login_2004 set
                                                        cod_usuario=1720,
                                                        cod_cliente=100,
                                                        cod_cookie=223557,
                                                        cod_log_login_msg=5;
# at 1835464
#040126 18:49:06 server id 1  log_pos 173352146         Query   thread_id=66327 exec_time=0     e
rror_code=0
use empauta;
SET TIMESTAMP=1075150146;
insert into cookiexusuario (cod_usuario, cod_cookie) values (1720,223557);
# at 1835575
#040126 18:49:06 server id 1  log_pos 173352257         Query   thread_id=66327 exec_time=0     e
rror_code=0
SET TIMESTAMP=1075150146;

- We DON'T use INSERT DELAYED.
- We DON'T update data on the slave.
- After the error I tried START SLAVE, but got:

040126 19:12:52  Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'replication.009' position 173351767

- I use Linux kernel 2.4.24, gcc 3.2.2, and compile my own binaries using:
CFLAGS="-O6 -march=athlon-xp -fomit-frame-pointer" \
        CXXFLAGS="-O6 -march=athlon-xp -fomit-frame-pointer" \
        CXXFLAGS=$CXXFLAGS" -felide-constructors -fno-exceptions -fno-rtti" \
        ./configure --enable-assembler \
        --without-debug \
        --prefix=/usr/local/mysql \
        --with-client-ldflags=-all-static \
        --with-mysqld-ldflags=-all-static \
        --enable-thread-safe-client \
        --with-charset=latin1 \
        --with-extra-charsets=none

- I only use MyISAM tables.

I hope this is enough information.

Thanks and congrats on the great work!
[26 Jan 2004 13:35] Guilhem Bichot
Thanks for all this info. Looks like you did nothing wrong :)
First, I'd like to see if this could be compilation-related.
There are two tests to do:
1) remove the "-O6 -march=athlon-xp"
from CFLAGS and CXXFLAGS
(note that I am surprised about -O6: gcc features -O0 to -O3, but not -O6 (at least from 'man gcc' I have gcc 3.3.1; and I tested that on my machine, -O6 is accepted but mapped to -O3).
And recompile mysqld with this on the slave, and see if the problem continues.
2) if yes, download our binaries from www.mysql.com and see again.
I know this is some work for you; but I need to narrow the problem the most possible. We have had lots of cases where compilation was the cause (buggy compiler for example).

Thank you in advance!
[28 Jan 2004 10:17] Joao Neves
I owe you a HUGE apology!!! After lots of searching we found an old script that was sending updates to our slave, therefore there is no bug.

Thanks a lot and sorry for the bogus bug.  Keep up the great work!
[28 Jan 2004 10:28] MySQL Verification Team
As user himself recognized, this turned out not to be a bug.
[12 Feb 2004 15:13] Guilhem Bichot
Note that MySQL has an option
--read-only
which will prevent any updating statement (INSERT/UPDATE/DELETE, DROP/CREATE etc) on the server except if this is the slave SQL thread or a user with the SUPER privilege.
To enable it, just add
read-only
to your my.cnf
[21 Mar 2005 21:33] Martin Rautenberg
I have the same problem.
I do exact the same like discribed at http://dev.mysql.com/doc/mysql/en/replication-faq.html
1) binary copy the MYSQL Master files
2) extract the db files at the slave
3) start the slave

When i start the MYSQL Service at the SLAVE (mysqld_safe)
in my mysqld.log i found:
Slave: Error 'Duplicate entry 'user4568' for key 1' on query.
If i delete this entry at the slave i get an egal error in an other table:
Slave: Error 'Duplicate entry 'session15' for key 1' on query.
I look at the Slave, but this entrys are not "Duplicate"
I think the binary log file ist not correct. I search at this file, but i found only one entry for user4568 nothing is double.

SLAVE MYSQL VERSION: 4.0.24-log / Kernel: 2.6.9-041221
MASTER MYSQL VERSION: 4.0.18-log / Kernel: 2.4.21-192-smp4G

at MASTER:
SHOW MASTER STATUS;
+----------------+-----------+--------------+------------------+
| File           | Position  | Binlog_do_db | Binlog_ignore_db |
+----------------+-----------+--------------+------------------+
| master-bin.001 | 215173219 |              |                  |
+----------------+-----------+--------------+------------------+
1 row in set (0.00 sec)

at SLAVE:
SHOW MASTER STATUS;
+----------------+----------+--------------+------------------+
| File           | Position | Binlog_do_db | Binlog_ignore_db |
+----------------+----------+--------------+------------------+
| slave-bin.003 | 4        |              |                  |
+----------------+----------+--------------+------------------+
1 row in set (0.00 sec)

SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
          Master_Host: XXXXXX
          Master_User: root
          Master_Port: 3306
        Connect_retry: 60
      Master_Log_File: master-bin.001
  Read_Master_Log_Pos: 215330142
       Relay_Log_File: slave-relay-bin.001
        Relay_Log_Pos: 1214664
Relay_Master_Log_File: master-bin.001
     Slave_IO_Running: Yes
    Slave_SQL_Running: No
      Replicate_do_db:
  Replicate_ignore_db:
           Last_errno: 1062
           Last_error: Error 'Duplicate entry 'session15' for key 1' on query. D
efault database: 'pretime'. Query: 'INSERT INTO sess ( account , uid , sess) VALUES ('session15', '9886', '7656748768')'
         Skip_counter: 0
  Exec_master_log_pos: 1214623
      Relay_log_space: 215330232
1 row in set (0.00 sec)

Please Help
[21 Mar 2005 22:18] Guilhem Bichot
"3) start the slave" did you start it from the appropriate coordinates ?
In any case, 4.0.18 is a bit old. Please upgrade to latest 4.0 (and we also have 4.1 of production grade now) to make sure this is not an already fixed bug.
[22 Mar 2005 20:03] Martin Rautenberg
Thx
I update both (master / slave) to 4.1.10a-log 
then
1) stop the master
2) stop the slave
2) binary copy the master tarfile (dbs) to slave
3) del the masters binlog and innodb files
4) start the master
5) install binary dbs at slave (extract in the db dir)
6) start the slave
works fine :)
[28 Jul 2007 11:01] ali alkhalidi
have the same error on MySQL-server-community-5.0.45-0.rhel4

070728 10:26:12 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000040' at position 98, relay log '/var/lib/mysql/db1-relay-bin.000032' position: 235
070728 10:26:12 [ERROR] Slave: Error 'Duplicate entry '6363681' for key 1' on query. Default database: 'ads'. Query: 'INSERT INTO impressions2  (aff_id, zone_id , type, ip_address, date )
                VALUES
                ('1243', '56', 'none', '81.10.53.73', '2007-07-21 08:33:06')', Error_code: 1062
070728 10:26:12 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000040' position 98

checked the impressions2 table, and got back with OK

mysql> describe impressions2;
+------------+-----------------------+------+-----+-------------------+----------------+
| Field      | Type                  | Null | Key | Default           | Extra          |
+------------+-----------------------+------+-----+-------------------+----------------+
| id         | bigint(20) unsigned   | NO   | PRI | NULL              | auto_increment |
| aff_id     | bigint(20)            | NO   | MUL | 0                 |                |
| zone_id    | bigint(20)            | NO   |     | 0                 |                |
| type       | set('keyword','none') | NO   |     | none              |                |
| ip_address | varchar(255)          | NO   | MUL |                   |                |
| date       | timestamp             | NO   |     | CURRENT_TIMESTAMP |                |
+------------+-----------------------+------+-----+-------------------+----------------+

impressions2 is innoDB

master sent all logs to salve, so no binlogs on master.

slave, pile all binlogs, and first is:
db1-relay-bin.000032

this relay log starts here:

mysqlbinlog db1-relay-bin.000032 | head -60
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#070721  8:36:14 server id 30  end_log_pos 98   Start: binlog v 4, server v 5.0.45-community created 070721  8:36:14
# at 98
#700101  0:00:00 server id 1  end_log_pos 0     Rotate to mysql-bin.000040  pos: 4
# at 141
#070721  8:35:31 server id 1  end_log_pos 98    Start: binlog v 4, server v 5.0.45-community-log created 070721  8:35:31 at startup
ROLLBACK/*!*/;
# at 235
#070721  8:35:31 server id 1  end_log_pos 28    Intvar
SET INSERT_ID=6363681/*!*/;
# at 263
#070721  8:35:31 server id 1  end_log_pos 241   Query   thread_id=17    exec_time=0     error_code=0
use ads/*!*/;
SET TIMESTAMP=1185006931/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1/*!*/;
SET @@session.sql_mode=0/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.time_zone='SYSTEM'/*!*/;
INSERT INTO impressions2        (aff_id, zone_id , type, ip_address, date )
                VALUES
                ('1243', '56', 'none', '81.10.53.73', '2007-07-21 08:33:06')/*!*/;
# at 476
#070721  8:35:31 server id 1  end_log_pos 366   Xid = 34
COMMIT/*!*/;
# at 503
#070721  8:35:31 server id 1  end_log_pos 28    Intvar
SET INSERT_ID=6363682/*!*/;
# at 531
#070721  8:35:31 server id 1  end_log_pos 244   Query   thread_id=31    exec_time=0     error_code=0
SET TIMESTAMP=1185006931/*!*/;
INSERT INTO impressions2        (aff_id, zone_id , type, ip_address, date )
                VALUES
                ('1243', '56', 'none', '213.212.194.86', '2007-07-21 08:33:06')/*!*/;
# at 747
#070721  8:35:31 server id 1  end_log_pos 637   Xid = 128
COMMIT/*!*/;
# at 774
#070721  8:35:31 server id 1  end_log_pos 28    Intvar
SET INSERT_ID=6363683/*!*/;
# at 802
#070721  8:35:31 server id 1  end_log_pos 244   Query   thread_id=7     exec_time=0     error_code=0
SET TIMESTAMP=1185006931/*!*/;
INSERT INTO impressions2        (aff_id, zone_id , type, ip_address, date )
                VALUES
                ('2357', '59', 'none', '195.229.242.54', '2007-07-21 08:33:06')/*!*/;
# at 1018
#070721  8:35:31 server id 1  end_log_pos 908   Xid = 178
COMMIT/*!*/;
# at 1045
#070721  8:35:31 server id 1  end_log_pos 28    Intvar
SET INSERT_ID=6363684/*!*/;
# at 1073
#070721  8:35:31 server id 1  end_log_pos 242   Query   thread_id=24    exec_time=0     error_code=0
SET TIMESTAMP=1185006931/*!*/;
INSERT INTO impressions2        (aff_id, zone_id , type, ip_address, date )
                VALUES

checked that there are no crons/users are writing to slave; after all the INSERT statement is there following the id of the master, which is 1 (slave is 30)

Also, duplicate entry does exists

appreciate your help,

ali