Description:
My goal was to find corresponding rows from 2 access_logs of 2 sites one of which redirects to the other. The access_logs were loaded into tables and processed, and in order to find corresponding rows I tried to use a multi-table update which worked correctly but left some rows unchanged while reporting that it did change them.
How to repeat:
In order to present a test case I made the tables as simple and small as possible. There are just 2 tables log1 and log2 with similar columns. The query is supposed to update the column log2.PrevLogId to log1.Id for the rows it finds matching. The tables are:
CREATE TABLE `log1` (
`Id` int(10) unsigned NOT NULL auto_increment,
`IP` varchar(16) NOT NULL default '',
PRIMARY KEY (`Id`)
) ENGINE=MyISAM;
and
CREATE TABLE `log2` (
`Id` int(10) unsigned NOT NULL auto_increment,
`IP` varchar(16) NOT NULL default '',
`PrevLogId` int(11) NOT NULL default '0',
PRIMARY KEY (`Id`),
KEY `IP` (`IP`)
);
The values are:
INSERT INTO `log1` VALUES (154,'66.230.191.90'),(155,'66.230.191.90')
and
INSERT INTO `log2` VALUES (877,'66.230.191.90',-877),(213,'66.230.191.90',-213);
that's just 2 rows per table to make it simple. The rows should match, but the point is they all match to each other (see select in the end). The update query is:
update log2 set PrevLogId=-id;
(this is just to reset the log2 table) and then:
Update Ignore log1 as PrevLog, log2 as Log
Set Log.PrevLogId=PrevLog.Id
Where Log.IP=PrevLog.IP And Log.PrevLogId<0;
This produces:
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2 Changed: 2 Warnings: 0
but:
select * from log2;
+-----+---------------+-----------+
| Id | IP | PrevLogId |
+-----+---------------+-----------+
| 877 | 66.230.191.90 | 154 |
| 213 | 66.230.191.90 | -213 |
+-----+---------------+-----------+
that's one row is left unchanged.
The query matches rows by IP, and as you can see they are the same in all 4 rows. In order to prevent MySql from updating 2 rows in log2 for a row in log1 I do the following:
1. unique index on PrevLogId (that is why it is negative first)
2. use Log.PrevLogId<0 to exclude changed rows from matching once again
The ignore lets MySql continue without warnings, but the problem is the second row is never updated, no matter how many times I repeat the query. It then says that just 1 row is updated, but nothing changes again. If I issue a select with the same conditions:
update log2 set PrevLogId=-id;
(reset)
Select *
From log1 as PrevLog, log2 as Log
Where Log.IP=PrevLog.IP And Log.PrevLogId<0;
I get
+-----+---------------+-----+---------------+-----------+
| Id | IP | Id | IP | PrevLogId |
+-----+---------------+-----+---------------+-----------+
| 154 | 66.230.191.90 | 877 | 66.230.191.90 | -877 |
| 154 | 66.230.191.90 | 213 | 66.230.191.90 | -213 |
| 155 | 66.230.191.90 | 877 | 66.230.191.90 | -877 |
| 155 | 66.230.191.90 | 213 | 66.230.191.90 | -213 |
+-----+---------------+-----+---------------+-----------+
it show the right data.
Suggested fix:
Also, are there any plans for extending multi-table update syntax to allow for limiting the number of updated rows in the second (any) table to just one without having to use a unique index?