Bug #35817 read_set not updated correctly by trigger for some queries
Submitted: 3 Apr 2008 22:48 Modified: 25 Jun 13:02
Reporter: Tim Clark
Status: In progress
Category:Server: SP Severity:S2 (Serious)
Version:5.1.23 OS:Any
Assigned to: Target Version:
Tags: primary key, trigger, read_set
Triage: Triaged: D3 (Medium) / R2 (Low) / E2 (Low)

[3 Apr 2008 22:48] Tim Clark
Description:
The table->read_set bitmap is not updated correctly when optimization based on a primary
key is done and a trigger is invoked. This is a problem for storage engines which rely on
read_set to optimize record fetches, because they may not know to read the fields which
are required by the trigger. Instead, they only fetch the fields that are needed to
satisfy the original query.

Please see "How to repeat:" for the detailed logic involved.

How to repeat:
This was discovered when running against a storage engine that is still under
development. However, the following queries can be executed against MyISAM, and a
debugger can be used to see the incorrect flow.

Incorrect:
Set a breakpoint in ha_myisam::index_read_idx_map(). When t1.id is a PRIMARY KEY, and the
DELETE statement below is executed, (after two spurious halts for the "proc" table), it
can be seen that table->read_set->bitmap is equal to 0x0001 indicating only the first
field (t1.id) is requested. Storage engines which honor read_set will not retrieve the
value of t1.operation and thus will not have the data needed to satisfy the trigger.

Correct:
Set a breakpoint in ha_myisam::index_read_map(). When t1.id is not a primary key (but
covered by an index), and the DELETE statement is executed, it can be seen that
table->read_set->bitmap is equal to 0x0003 indicating both fields are requested.

This can be further confirmed by executing the DELETE statement with a breakpoint also in
Table_triggers_list::mark_fields_used(). It can be seen that when t1.id is a primary key,
mark_fields_used() is called *after* index_read_idx_map() has been called. When t1.id is
a normal index, mark_fields_used() is called *before* index_read_map(). Clearly, the
latter behavior is correct.

# Incorrect logic
create table t1 (id int primary key , operation varchar(255));
# -- OR --
# Correct logic
create table t1 (id int , operation varchar(255), index(id));

create table t2 (id int primary key);
create table t1_op_log(operation varchar(255));
create trigger trg_bd before delete on t1
for each row insert into t1_op_log (operation)
  values (concat("Before DELETE, old=", old.operation));
create trigger trg_ad after delete on t1
for each row insert into t1_op_log (operation)
  values (concat("After DELETE, old=", old.operation));

insert into t1 (id, operation) values (1, "INSERT for multi-DELETE");
insert into t1
values(2,""),(3,""),(4,""),(5,""),(6,""),(7,""),(8,""),(9,""),(10,""),(11,""),(12,""),(13,""),(14,"");
insert into t2 (id) values (1);

delete t1.*, t2.* from t1, t2 where t1.id=1;

Suggested fix:
Table_triggers_list::mark_fields_used() should always be called before doing any reads
from a table participating in a trigger.
[14 Apr 2008 17:02] Susanne Ebrecht
Verified as described.
[25 Jun 14:53] Jon Olav Hauglid
Simplified test case:

--disable_warnings
drop table if exists t1;
drop table if exists t1_op_log;
drop trigger if exists trg_bd;
--enable_warnings

create table t1 (id int primary key , operation varchar(255));
#create table t1 (id int, operation varchar(255), index(id));
create table t1_op_log(operation varchar(255));

create trigger trg_bd before delete on t1
for each row insert into t1_op_log (operation)
  values (concat("Before DELETE, old=", old.operation));

insert into t1 values (1, "INSERT for multi-DELETE"), (2, "");

delete t1.* from t1 where t1.id=1;

drop trigger trg_bd;
drop table t1_op_log;
drop table t1;
[2 Jul 10:53] Konstantin Osipov
Timour, I don't know how to fix it. Either join_read_const shouldn't be run
in presence of triggers, or it should read all involved columns.
Jon Olav will provide background information.
Let's discuss on IRC after that.
[2 Jul 11:52] Jon Olav Hauglid
This problem concerns the optimizer, more specifically the optimization of queries using
constant tables.

In the initial phase of optimization, constant tables are evaluated to find the constant
value in order to put that value into the query. This happens in make_join_statistics()
called from sql_select.cc:1613

At this point in time, the read_set bitmap has not been completely set up to cover
triggers. This happens in mark_fields_used() which is called from initialize_tables()
which is again called from sql_select.cc:1623 (i.e. after 
make_join_statistics())