| Bug #1721 | False duplicate key warnings on FULLTEXT columns | ||
|---|---|---|---|
| Submitted: | 31 Oct 2003 0:17 | Modified: | 23 Feb 2004 10:45 |
| Reporter: | Tony Butcher | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Server: MyISAM storage engine | Severity: | S2 (Serious) |
| Version: | 4.0.16 | OS: | Linux (Linux SuSE 8.2) |
| Assigned to: | Assigned Account | CPU Architecture: | Any |
[31 Oct 2003 0:20]
Tony Butcher
Corrected version to 4.0.16
[31 Oct 2003 11:15]
Dean Ellis
I am unable to duplicate this behaviour using 4.0.16 or 4.0.17. I am able to duplicate as many binary values for `body` as I like, as one would expect with FULLTEXT indexes not enforcing uniqueness. If you can duplicate this with a new table and a specific 'binary' created for this test, please submit a test case with complete CREATE TABLE and INSERT statements for further testing. Thank you
[12 Nov 2003 1:13]
Sergei Golubchik
Looking at the code I can see that it is could be possible to get "duplicate entry" error for fulltext key. My guess is that it is indeed related to binary data and character set issues - which could result in some weird comparison problems if done incorrectly. In any case, I need a repeatable test case to fix the bug. Next time you'll get this error, please, try to save the row data that caused the error - I think that even inserting this row into an *empty* table (with identical structure) can cause "duplicate entry" error.
[12 Nov 2003 2:35]
Tony Butcher
This error has not recurred (although binaries are still being stored), and I've not had the opportunity to find the offending spam/virus. However, I think it's important to note that I am still getting table corruptions (about 1 per day) in my 'emails' and 'mail_cache' table, which have single FULLTEXT columns. (We even switched machines and this still happens). I will try to find a specific offender if I can, although when the corrupting INSERT happens it succeeds (with a corruption), while the duplicate key error actually fails (so is easier to trap).
[23 Feb 2004 10:45]
Sergei Golubchik
No feedback was provided. The bug is being suspended because we assume that you are no longer experiencing the problem. If this is not the case and you are able to provide the information that was requested earlier, please do so and change the status of the bug back to "Open". Thank you.

Description: We have a webmail system in which emails are stored in a MyISAM table. The table has FULLTEXT indexes to assist with people searching their mail archive. However, emails containing spam are also placed in the 'emails' table with a label of 'trash'. It appears that where there is spam or a virus in the body of an email, it can cause an error along the lines of: Warning: duplicate entry for key 6. 'Key 6' would be the 'body' column, which has a FULLTEXT index. We have also been encountering table corruptions with this system (certainly on 4.0.14) and have just yesterday upgraded to 4.0.16. No more table corruptions yet (CHECK TABLE gives OK), but these INSERT errors persist. How to repeat: emails CREATE TABLE `emails` ( `email_id` int(12) NOT NULL auto_increment, `user_id` int(12) NOT NULL default '0', `mail_from` varchar(255) NOT NULL default '', `mail_to` varchar(255) NOT NULL default '', `mail_cc` varchar(255) NOT NULL default '', `mail_bcc` varchar(255) NOT NULL default '', `priority` enum('2 (High)','3 (Normal)','4 (Low)') default NULL, `subject` varchar(150) NOT NULL default '', `header` text NOT NULL, `mime_type` varchar(50) NOT NULL default '', `body` text, `date` datetime NOT NULL default '0000-00-00 00:00:00', `delete_date` datetime default NULL, `email_size` float NOT NULL default '0', `outgoing` enum('0','1') NOT NULL default '0', `spam_rating` float NOT NULL default '0', `mode` enum('received','sent','saved_sent','trash','saved_trash') NOT NULL default 'received', PRIMARY KEY (`email_id`), KEY `user_id` (`user_id`), FULLTEXT KEY `mail_from` (`mail_from`), FULLTEXT KEY `mail_to` (`mail_to`), FULLTEXT KEY `subject` (`subject`), FULLTEXT KEY `body` (`body`) ) TYPE=MyISAM Then do an 'INSERT INTO emails SET...', where 'body' contains something binary (such as a virus or spam - which may be because the email attachment boundary was not detected). Suggested fix: Currently we are getting around this by dropping all FULLTEXT indexes, and this allows the INSERT to occur without a warning. (We're also considering coding around this, so we make sure we don't store binary data in FULLTEXT columns.)