Bug #26758 SLAVE Replication SQL Thread Crashes DataNode(s)
Submitted: 1 Mar 2007 17:16 Modified: 4 Jun 2007 8:37
Reporter: Steve Edwards
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine    Severity: S2 (Serious)
Version: 5.1.16    OS: Linux (Debian Linux 64bit)
Assigned to: Guangbao Ni    CPU Architecture: Any
Tags: cluster, Error: 2341 datamemory index, replication, sql thread

[1 Mar 2007 17:16] Steve Edwards
Description:
The replication SQL thread on the SLAVE cluster crashes data node(s) when applying updates replicated from the MASTER cluster.

# SLAVE REPLICATION MYSQLD ERROR LOG

070301 16:48:25 [ERROR] Slave: Error 'Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER' on query. Default database: ''. Query: 'COMMIT', Error_code: 1297
070301 16:48:25 [ERROR] /var/mysql/bin/mysqld: Incorrect information in file: './spp/spp.frm'
070301 16:48:25 [ERROR] Slave: Error 'Got error 4009 'Cluster Failure' from NDBCLUSTER' on opening table `spp`.`spp`, Error_code: 1296
070301 16:48:25 [Warning] Slave: Got error 4009 'Cluster Failure' from NDB Error_code: 1296
070301 16:48:25 [Warning] Slave: Got error 4009 'Cluster Failure' from NDBCLUSTER Error_code: 1296
070301 16:48:25 [Warning] Slave: Incorrect information in file: './spp/spp.frm' Error_code: 1033
070301 16:48:25 [Warning] Slave: Unknown error Error_code: 1105
070301 16:48:25 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'replication01-bin.000001' position 76277

# CLUSTER [SLAVE] ERROR LOG

2007-03-01 16:48:24 [MgmSrvr] ALERT    -- Node 1: Node 3 Disconnected
2007-03-01 16:48:25 [MgmSrvr] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

# DATANODE #3 [SLAVE] ERROR LOG

Time: Thursday 1 Mars 2007 - 16:48:24
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dbtup/DbtupRoutines.cpp
Error object: DBTUP (Line: 477) 0x0000000a
Program: ndbd
Pid: 10882
Trace: /SPP/data/ndb_3_trace.log.4
Version: Version 5.1.15 (beta)
***EOM***

Time: Thursday 1 Mars 2007 - 17:27:11
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: backup/Backup.cpp
Error object: BACKUP (Line: 3658) 0x0000000a
Program: ndbd
Pid: 14617
Trace: /SPP/data/ndb_3_trace.log.5
Version: Version 5.1.15 (beta)
***EOM***

# MASTER CLUSTER configuration (4 data nodes / 2 x replica) -

Connected to Management Server at: 10.XXX.XX.XXX:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=2    @10.XXX.XX.XXX  (Version: 5.1.15, Nodegroup: 0, Master)
id=3    @10.XXX.XX.XXX  (Version: 5.1.15, Nodegroup: 0)
id=4    @10.XXX.XX.XXX  (Version: 5.1.15, Nodegroup: 1)
id=5    @10.XXX.XX.XXX  (Version: 5.1.15, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.XXX.XX.XXX  (Version: 5.1.15)

[mysqld(API)]   24 node(s)
id=6    @10.XXX.XX.XXX  (Version: 5.1.15)
id=7    @10.XXX.XX.XXX  (Version: 5.1.15)
id=8    @10.XXX.XX.XXX  (Version: 5.1.15)
...

# SLAVE CLUSTER configuration (2 data nodes / 1 x replica) -

Connected to Management Server at: 10.XXX.XX.XXX:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @10.XXX.XX.XXX  (Version: 5.1.15, Nodegroup: 0, Master)
id=3    @10.XXX.XX.XXX  (Version: 5.1.15, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.XXX.XX.XXX  (Version: 5.1.15)

[mysqld(API)]   2 node(s)
id=4    @10.XXX.XX.XXX  (Version: 5.1.15)
id=5    @10.XXX.XX.XXX  (Version: 5.1.15)

# Table schema -
+-------+---------------------+------+-----+-------------------+----------------+
| Field | Type                | Null | Key | Default           | Extra          |
+-------+---------------------+------+-----+-------------------+----------------+
| id    | bigint(20) unsigned | NO   | PRI | NULL              | auto_increment |
| id2   | varchar(54)         | NO   | UNI |                   |                |
| s01   | int(11)             | NO   |     | 0                 |                |
| s02   | int(11)             | NO   |     | 0                 |                |
| s03   | int(11)             | NO   |     | 0                 |                |
| s04   | int(11)             | NO   |     | 0                 |                |
| s05   | int(11)             | NO   |     | 0                 |                |
| s06   | int(11)             | NO   |     | 0                 |                |
| s07   | int(11)             | NO   |     | 0                 |                |
| s08   | int(11)             | NO   |     | 0                 |                |
| s09   | int(11)             | NO   |     | 0                 |                |
| s10   | int(11)             | NO   |     | 0                 |                |
| t1    | int(11)             | NO   |     | 0                 |                |
| t2    | tinyint(1)          | NO   |     | 0                 |                |
| ts    | timestamp           | NO   |     | CURRENT_TIMESTAMP |                |
+-------+---------------------+------+-----+-------------------+----------------+

How to repeat:
1.  Backup MASTER CLUSTER (4xdata node / 2 x replica)
2.  Restore backup to SLAVE CLUSTER (2xdata node / 1 x replica)
3.  START SLAVE MYSQLD replication
4.  Start updates on Master cluster
[2 Mar 2007 15:52] Steve Edwards
Further investigation suggests this fault is a function of replicating from a master cluster with 2 replicas / 4 nodes to a slave cluster with 1 replica / 2 nodes.  Having reduced the size of the database to enable 2 replicas on the SLAVE, I have been unable to reproduce this error.

Have downgraded the severity to non-critical and will investigate further.
[2 Mar 2007 17:14] Steve Edwards
Testing replication MASTER -> SLAVE (2 replicas), we encountered the same error as before.  Replication caused the non-master data node in the slave cluster to crash after replicating for ~15 minutes.

This appears to be the same issue as bug id=25598

[SLAVE CLUSTER - ERROR LOG]

2007-03-02 18:04:06 [MgmSrvr] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

[NON-MASTER DATA NODE ERROR LOG]

Time: Friday 2 Mars 2007 - 18:04:05
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dbtup/DbtupRoutines.cpp
Error object: DBTUP (Line: 477) 0x0000000e
Program: ndbd
Pid: 11166
Trace: /SPP/data/ndb_3_trace.log.2
Version: Version 5.1.15 (beta)
***EOM***
[5 Mar 2007 12:39] Steve Edwards
Upgraded slave cluster to 5.1.16 - so far unable to reproduce this issue.
[6 Mar 2007 8:57] Steve Edwards
Master -> Slave replication between clusters in 5.1.16 seems more functional than in 5.1.15, but it was still extremely unstable during our tests.  Slave cluster data nodes crashed repeatedly, especially when replicating > 200 reqs/sec (see errors and trace files below), and during replication DataMemory usage on the slave data nodes increased until exhausted, indicating what looks like some sort of memory leak.

SLAVE CLUSTER DATA NODE ERRORS -

Time: Monday 5 Mars 2007 - 16:51:11
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: backup/Backup.cpp
Error object: BACKUP (Line: 3670) 0x0000000e
Program: ndbd
Pid: 17054
Trace: /SPP/data/ndb_3_trace.log.9
Version: Version 5.1.16 (beta)
***EOM***

Time: Monday 5 Mars 2007 - 17:18:24
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 6909) 0x0000000a
Program: ndbd
Pid: 18233
Trace: /SPP/data/ndb_3_trace.log.10
Version: Version 5.1.16 (beta)
***EOM***

Time: Tuesday 6 Mars 2007 - 01:28:07
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dbtup/DbtupRoutines.cpp
Error object: DBTUP (Line: 477) 0x0000000e
Program: ndbd
Pid: 31488
Trace: /SPP/data/ndb_3_trace.log.11
Version: Version 5.1.16 (beta)
***EOM***
[6 Mar 2007 9:05] Steve Edwards
Trace files from crashed Slave cluster data node

Attachment: ndb_error_trace.tar.gz (application/x-gzip, text), 391.85 KiB.

[6 Mar 2007 9:16] Steve Edwards
Slave cluster increase in Datamemory during replication.

Attachment: slave_cluster_datamemory.jpg (image/jpeg, text), 151.03 KiB.

[8 Mar 2007 14:27] Steve Edwards
The memory leak (and probably the data node crashes) during replication seems to be related to a bug in the slave's update-handling function replace_record() -

According to the replication mysqld trace file, replicated UPDATE statements always appear to be executed as a delete plus insert operation.  This would explain why DataMemory use on the slave increases.

I think the problem occurs because the table has two unique indexes.  The last_uniq_key() condition wrapped around the call to ha_update_row() never evaluates to true, so the update is always executed as a delete / insert.

./sql/log_event.cc

replace_record()

6664   DESCRIPTION
6665       Similar to how it is done in mysql_insert(), we first try to do
6666       a ha_write_row() and of that fails due to duplicated keys (or
6667       indices), we do an ha_update_row() or a ha_delete_row() instead.
6668  */

6743     /*
6744        REPLACE is defined as either INSERT or DELETE + INSERT.  If
6745        possible, we can replace it with an UPDATE, but that will not
6746        work on InnoDB if FOREIGN KEY checks are necessary.
6747
6748        I (Matz) am not sure of the reason for the last_uniq_key()
6749        check as, but I'm guessing that it's something along the
6750        following lines.
6751
6752        Suppose that we got the duplicate key to be a key that is not
6753        the last unique key for the table and we perform an update:
6754        then there might be another key for which the unique check will
6755        fail, so we're better off just deleting the row and inserting
6756        the correct row.
6757      */
6758     if (last_uniq_key(table, keynum) &&
6759         !table->file->referenced_by_foreign_key())
6760     {
6761       error=table->file->ha_update_row(table->record[1],
6762                                        table->record[0]);
6763       DBUG_RETURN(error);
6764     }
6765     else
6766     {
6767       if ((error= table->file->ha_delete_row(table->record[1])))
6768         DBUG_RETURN(error);
6769       /* Will retry ha_write_row() with the offending row removed. */
6770     }
6771   }

./sql/sql_insert.cc

1042  /* Check if there is more uniq keys after field */
1043
1044 static int last_uniq_key(TABLE *table,uint keynr)
1045 {
1046   while (++keynr < table->s->keys)
1047     if (table->key_info[keynr].flags & HA_NOSAME)
1048       return 0;
1049   return 1;
1050 }
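
For reference, here is a minimal standalone sketch (my own simplified structs and flag value, not the MySQL definitions) showing why last_uniq_key() can never return 1 for the PK of a table like spp above, which carries a second unique index:

#include <cstdio>

static const unsigned HA_NOSAME = 1;   /* "key must be unique" flag (simplified assumption) */

struct KeyInfo  { unsigned flags; };
struct TableSim {
    unsigned keys;
    KeyInfo  key_info[2];
};

/* Same loop as sql/sql_insert.cc:last_uniq_key(), on the simplified structs above */
static int last_uniq_key(const TableSim *table, unsigned keynr)
{
    while (++keynr < table->keys)
        if (table->key_info[keynr].flags & HA_NOSAME)
            return 0;
    return 1;
}

int main()
{
    /* key 0 = PRIMARY KEY (id), key 1 = UNIQUE (id2), as in the spp table */
    TableSim spp = { 2, { { HA_NOSAME }, { HA_NOSAME } } };

    /* Row events are applied by PK, so the duplicate is found on keynum 0.
       Because key 1 is also unique, last_uniq_key() returns 0 and
       replace_record() takes the ha_delete_row() + ha_write_row() branch
       instead of ha_update_row(). */
    printf("last_uniq_key(PK)      = %d\n", last_uniq_key(&spp, 0));  /* prints 0 */
    printf("last_uniq_key(id2 key) = %d\n", last_uniq_key(&spp, 1));  /* prints 1 */
    return 0;
}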
[8 Mar 2007 14:28] Steve Edwards
replication mysqld debug trace update row

Attachment: cluster_repli_update_row.trace (application/octet-stream, text), 34.25 KiB.

[16 Mar 2007 15:27] Steve Edwards
Removing the call to last_uniq_key() causes the replicated update to be correctly performed as an update (NdbOperation::updateTuple()) rather than a delete / write.

Cluster ensures that all tables have a unique PK, and replication events appear to be written to / replayed from the binlog by PK.

If a cluster table has additional unique indexes, last_uniq_key() on the PK fails and the update is performed as a delete / write.  Memory allocated for the deleted row or the write buffer does not appear to be freed properly (until the data nodes are restarted).

vi ./sql/log_event.cc
6758     //if (last_uniq_key(table, keynum) &&
6759     if (!table->file->referenced_by_foreign_key())
6760     {
6761       error=table->file->ha_update_row(table->record[1],
6762                                        table->record[0]);
6763       DBUG_RETURN(error);
6764     }
6765     else
6766     {
6767       if ((error= table->file->ha_delete_row(table->record[1])))

I'm not fluent enough with the codebase to propose this as a patch, but I thought the additional info might be useful to the programmer who looks into this.
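
For illustration, this is roughly what the two code paths look like from the NDB API side (a sketch only - the connectstring, values and error handling are made up, and I am assuming the standard NdbApi.hpp calls): the patched path issues a single updateTuple() keyed by PK, while the unpatched path deletes the old row and then re-inserts the whole row.

#include <NdbApi.hpp>
#include <cstdlib>

int main()
{
    ndb_init();

    Ndb_cluster_connection conn("10.0.0.1:1186");   /* hypothetical connectstring */
    if (conn.connect(4, 5, 1) != 0 || conn.wait_until_ready(30, 0) < 0)
        return EXIT_FAILURE;

    Ndb ndb(&conn, "spp");
    ndb.init();

    /* Patched path: one update of the changed column, keyed by PK.
       (Error checking omitted throughout.) */
    {
        NdbTransaction *trans = ndb.startTransaction();
        NdbOperation   *op    = trans->getNdbOperation("spp");
        op->updateTuple();
        op->equal("id", (Uint64)1);          /* primary key */
        op->setValue("s01", 42);             /* only the changed column */
        trans->execute(NdbTransaction::Commit);
        ndb.closeTransaction(trans);
    }

    /* Unpatched path: delete the old row, then re-insert the full row. */
    {
        NdbTransaction *trans = ndb.startTransaction();

        NdbOperation *del = trans->getNdbOperation("spp");
        del->deleteTuple();
        del->equal("id", (Uint64)1);

        NdbOperation *ins = trans->getNdbOperation("spp");
        ins->insertTuple();
        ins->equal("id", (Uint64)1);
        ins->setValue("s01", 42);
        /* ... every remaining column has to be supplied again here
           (id2 would need NDB's length-prefixed varchar format) ... */

        trans->execute(NdbTransaction::Commit);
        ndb.closeTransaction(trans);
    }

    ndb_end(0);
    return EXIT_SUCCESS;
}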
[16 Mar 2007 15:29] Steve Edwards
mysqld slave replication trace file - call to last_uniq_key removed - update correctly performed as replace

Attachment: update_row_postfix.trace (application/octet-stream, text), 16.16 KiB.

[8 May 2007 14:41] Jonas Oreland
Hi,

Can you also upload your config.ini?

Could you also possibly upload a binlog which reproduces this problem
(and the schemas)?

/Jonas
[4 Jun 2007 8:37] Guangbao Ni
With the 5.1.18 source code, I can't reproduce it.