MySQL Bugs: #54560: Docs for REPLACE are not accurate

Bug #54560	Docs for REPLACE are not accurate
Submitted:	16 Jun 2010 18:58	Modified:	22 Jun 2010 18:28
Reporter:	Mark Callaghan	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	5.1, 4.1, 5.0, 5.6.99	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	innodb, REPLACE

Description:
The docs for REPLACE at http://dev.mysql.com/doc/refman/5.1/en/replace.html are not accurate. The docs state that:

 MySQL uses the following algorithm for REPLACE (and LOAD DATA ... REPLACE):

   1. Try to insert the new row into the table
   2. While the insertion fails because a duplicate-key error occurs for a primary key or unique index:
     1. Delete from the table the conflicting row that has the duplicate key value
     2. Try again to insert the new row into the table

But from sql/sql_insert.cc there are optimizations to do update rather than delete on duplicate keys:

      else /* DUP_REPLACE */
      {
        /*
          The manual defines the REPLACE semantics that it is either
          an INSERT or DELETE(s) + INSERT; FOREIGN KEY checks in
          InnoDB do not function in the defined way if we allow MySQL
          to convert the latter operation internally to an UPDATE.
          We also should not perform this conversion if we have
          timestamp field with ON UPDATE which is different from DEFAULT.
          Another case when conversion should not be performed is when
          we have ON DELETE trigger on table so user may notice that
          we cheat here. Note that it is ok to do such conversion for
          tables which have ON UPDATE but have no ON DELETE triggers,
          we just should not expose this fact to users by invoking
          ON UPDATE triggers.
        */
        if (last_uniq_key(table,key_nr) &&
            !table->file->referenced_by_foreign_key() &&
            (table->timestamp_field_type == TIMESTAMP_NO_AUTO_SET ||
             table->timestamp_field_type == TIMESTAMP_AUTO_SET_ON_BOTH) &&
            (!table->triggers || !table->triggers->has_delete_triggers()))
        {
          if ((error=table->file->ha_update_row(table->record[1],
                                                table->record[0])) &&
              error != HA_ERR_RECORD_IS_THE_SAME)
            goto err;
          if (error != HA_ERR_RECORD_IS_THE_SAME)
            info->deleted++;
          else
            error= 0;
          thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
          /*
            Since we pretend that we have done insert we should call
            its after triggers.
          */
          goto after_trg_n_copied_inc;
        }
        else
        {
          if (table->triggers &&
              table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
                                                TRG_ACTION_BEFORE, TRUE))
            goto before_trg_err;

How to repeat:
read the code, ask someone form the runtime team to explain it

Thank you for the report.

Verified as described.

Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

After discussing with Runtime, added this to REPLACE section:

It is possible that in the case of a duplicate-key error, a storage
engine may perform the REPLACE as an update rather than a delete plus
insert, but the semantics are the same. There are no user-visible
effects other than a possible difference in how the storage engine
increments Handler_xxx status variables.