Bug #76727 Slave assertion in unpack_row with ROLLBACK TO SAVEPOINT in error handler
Submitted: 16 Apr 2015 20:42 Modified: 7 Sep 2015 17:40
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.23,5.6.24, 5.6.25 OS:Linux
Assigned to: CPU Architecture:Any
Tags: handler, replication, rollback, savepoint

[16 Apr 2015 20:42] Philip Stoev
Description:
When issuing a ROLLBACK TO SAVEPOINT within an error handler, the slave asserts as follows:

```
mysqld: /home/philips/git/mysql-wsrep/sql/rpl_record.cc:246: int unpack_row(const Relay_log_info*, TABLE*, uint, const uchar*, const MY_BITMAP*, const uchar**, ulong*, const uchar*): Assertion `table_found' failed.

/lib64/libc.so.6[0x3e9042da82]
/home/philips/git/mysql-wsrep/sql/mysqld(unpack_row(Relay_log_info const*, TABLE*, unsigned int, unsigned char const*, st_bitmap const*, unsigned char const**, unsigned long*, unsigned char const*)+0x283)[0xa55448]
/home/philips/git/mysql-wsrep/sql/mysqld(Rows_log_event::unpack_current_row(Relay_log_info const*, st_bitmap const*)+0xde)[0xa30c2e]
/home/philips/git/mysql-wsrep/sql/mysqld(Write_rows_log_event::write_row(Relay_log_info const*, bool)+0x10e)[0xa2b388]
/home/philips/git/mysql-wsrep/sql/mysqld(Write_rows_log_event::do_exec_row(Relay_log_info const*)+0x115)[0xa2c125]
/home/philips/git/mysql-wsrep/sql/mysqld(Rows_log_event::do_apply_row(Relay_log_info const*)+0x94)[0xa2555a]
/home/philips/git/mysql-wsrep/sql/mysqld(Rows_log_event::do_apply_event(Relay_log_info const*)+0xe4a)[0xa2836a]
/home/philips/git/mysql-wsrep/sql/mysqld(Log_event::apply_event(Relay_log_info*)+0x3bb)[0xa137ad]
/home/philips/git/mysql-wsrep/sql/mysqld(apply_event_and_update_pos(Log_event**, THD*, Relay_log_info*)+0xb79)[0xa6d9bb]
/home/philips/git/mysql-wsrep/sql/mysqld[0xa6eedc]
/home/philips/git/mysql-wsrep/sql/mysqld(handle_slave_sql+0xb1c)[0xa7420b]
/home/philips/git/mysql-wsrep/sql/mysqld(pfs_spawn_thread+0x159)[0xafa5c0]
/lib64/libpthread.so.0[0x3e90c0753a]
/lib64/libc.so.6(clone+0x6d)[0x3e905045fd]
```

How to repeat:
MTR test case:

```
--source include/have_innodb.inc
--source include/have_binlog_format_row.inc
--source include/master-slave.inc

CREATE TABLE t1 (f1 INTEGER PRIMARY KEY);
CREATE TABLE t2 (f1 INTEGER PRIMARY KEY);
CREATE TABLE t3 (f1 INTEGER PRIMARY KEY);

CREATE TRIGGER tr1 AFTER INSERT ON t1 FOR EACH ROW CALL p1(NEW.f1);

DELIMITER |;

CREATE PROCEDURE p1 (IN x INT)
BEGIN
        DECLARE EXIT HANDLER FOR SQLEXCEPTION
                BEGIN
                        ROLLBACK TO event_logging;
                        INSERT t3 VALUES (x);
                END;
        SAVEPOINT event_logging;

        INSERT INTO t2 VALUES (x);

        RELEASE SAVEPOINT event_logging;
END|
DELIMITER ;|

INSERT INTO t2 VALUES (1);
INSERT INTO t1 VALUES (1);
```
[17 Apr 2015 5:00] Shane Bester
Assertion failed: table_found, file ..\..\mysqlcom-pro-5.6.24\sql\rpl_record.cc, line 246

mysqld-debug.exe!my_sigabrt_handler()[my_thr_init.c:458]
mysqld-debug.exe!raise()[winsig.c:593]
mysqld-debug.exe!abort()[abort.c:81]
mysqld-debug.exe!_wassert()[assert.c:153]
mysqld-debug.exe!unpack_row()[rpl_record.cc:246]
mysqld-debug.exe!Rows_log_event::unpack_current_row()[log_event.h:4282]
mysqld-debug.exe!Write_rows_log_event::write_row()[log_event.cc:12659]
mysqld-debug.exe!Write_rows_log_event::do_exec_row()[log_event.cc:12877]
mysqld-debug.exe!Rows_log_event::do_apply_row()[log_event.cc:10311]
mysqld-debug.exe!Rows_log_event::do_apply_event()[log_event.cc:11418]
mysqld-debug.exe!Log_event::do_apply_event_worker()[log_event.cc:853]
mysqld-debug.exe!slave_worker_exec_job()[rpl_rli_pdb.cc:1973]
mysqld-debug.exe!handle_slave_worker()[rpl_slave.cc:4991]
mysqld-debug.exe!pfs_spawn_thread()[pfs.cc:1862]
mysqld-debug.exe!pthread_start()[my_winthread.c:62]
mysqld-debug.exe!_callthreadstartex()[threadex.c:314]
mysqld-debug.exe!_threadstartex()[threadex.c:297]
[17 Apr 2015 7:51] Umesh Shastry
Hello Philip Stoev,

Thank you for the report and test case.
Observed that only debug builds (5.6.23, 5.6.24 and 5.6.25) asserts where as release builds breaks replication with "[ERROR] Slave SQL: Could not execute Write_rows event on table test.t3; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log master-bin.000001, end_log_pos 1553, Error_code: 1105".

Thanks,
Umesh
[17 Apr 2015 7:55] Umesh Shastry
test results

Attachment: 76727.txt (text/plain), 136.58 KiB.

[17 Apr 2015 23:34] Marcos Albe
Hello Umesh,

Is the latter behaviour (error on slave) considered normal/expected behaviour?  The bug also affects Percona XtraDB Cluster and effectively the issue is with applying the binary log event, and not with crashes.
[11 May 2015 12:38] Vlad Lesin
The following explanation is for RBR.

1. Why does the crash take place?

The crash happens in Write_rows_log_event::write_row()  when row field is unpacked. The stack in the following

Relay_log_info::get_table_data()
unpack_row()
unpack_current_row()
Write_rows_log_event::write_row()

Relay_log_info::get_table_data() contains the following code:

  bool get_table_data(TABLE *table_arg, table_def tabledef_var, TABLE conv_table_var) const
  {
     ...
    for (TABLE_LIST *ptr= tables_to_lock ; ptr != NULL ; ptr= ptr->next_global)
      if (ptr->table == table_arg)
      {
         ...
         return true;
      }
     return false;
   }

Relay_log_info::tables_to_lock is NULL, Relay_log_info::get_table_data() returns false and unpack_row() is aborted by assert.

2. Why Relay_log_info::tables_to_lock is NULL.?

Because it was cleared in Query_log_event::do_apply_event()->Relay_log_info::slave_close_thread_tables()->Relay_log_info::clear_tables_to_lock() when log event for "SAVEPOINT" is executed"

3. How table map is written for trigger?

It is written once for the whole trigger body:

a) Table is added to thd->lex->query_tables in open_table_entry_fini() <- the function for opening tables for triggers which are specified for the tables which take part in query

b) thd->lock is filled from lock_tables() with thd->lex->query_tables.lock_tables .

c) binlog_log_row()->write_locked_table_maps() writes table maps to binlog and invoke THD::binlog_write_table_map() which increases THD::binlog_table_maps. 

d) when write_locked_table_maps() is invoked for the next statement in trigger it writes map to binlog only if THD::binlog_table_maps is zero, but it is not zero as it was changed in (c).

Trigger is parsed and executed before DML execution and all tables which take part in query are added into list of open tables.

4. How to fix?

SAVEPOINT, ROLLBACK TO SAVEPOINT(as well as BEGIN, COMMIT, ROLLBACK, but they are forbidden for triggers) statements are logged into binary log as Query_log_event. Query_log_event::do_apply_event() wipes out table map.  And we can not recognize on slave if we are inside of trigger or not as only modification events are written to binary log and there is not event for trigger boddy call. But we can recognize it in master. The idea is to write tables map into binary log after each SAVEPOINT, ROLLBACK TO SAVEPOINT execution on master if it happens inside of trigger.

The patch can be found here:
https://github.com/percona/percona-server/commit/43da58f37f7e6df91b9df5e9f5f75ea475a69546
https://github.com/percona/percona-server/commit/a53b3853cfce2abc22ebc765bf505065f2ce3103
[15 May 2015 11:41] Vlad Lesin
Bug fix for lp:1438990

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: mysql-5.6-bug-1438990.patch (text/x-patch), 21.38 KiB.

[27 May 2015 5:39] Shane Bester
also: http://bugs.mysql.com/bug.php?id=77163
[18 Jul 2015 14:13] Laurynas Biveinis
A simpler testcase. The error handler is not required, SAVEPOINT / ROLLBACK TO SAVEPOINT in the trigger body is enough.

--source include/have_binlog_format_row.inc
--source include/master-slave.inc

CREATE TABLE `t` (
`f1` int(10) unsigned NOT NULL,
PRIMARY KEY (`f1`)
) ENGINE=InnoDB;

--delimiter |
CREATE TRIGGER t_insert_trig AFTER INSERT ON t
FOR EACH ROW
BEGIN

SAVEPOINT savepoint_1;
ROLLBACK TO savepoint_1;

END |
--delimiter ;

INSERT INTO `t` VALUES (2);
INSERT INTO `t` VALUES (3);

--source include/show_binlog_events.inc

SELECT * FROM t;

--source include/sync_slave_sql_with_master.inc

--connection slave
SELECT * FROM t;

--connection master
DROP TABLE t;

--source include/rpl_end.inc
[18 Jul 2015 15:16] Laurynas Biveinis
In the context of the fix Vlad submitted by OCA, this and and the original testcase are not duplicates even though the slave crash stacktrace matches. Both must be handled.
[7 Sep 2015 17:40] Jon Stephens
Documented fix in the MySQL 5.6.27 and 5.7.9 changelogs as follows:

    SAVEPOINT and ROLLBACK TO SAVEPOINT within a trigger led to an
    assertion.  

Closed.