| Bug #80347 | mysqldump backup restore fails due to invalid FTS_DOC_ID (Error 182 and 1030) | | |
|---|---|---|---|
| Submitted: | 12 Feb 2016 5:02 | Modified: | 8 Nov 2016 13:39 |
| Reporter: | Emanuel Calvo (OCA) | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: FULLTEXT search | Severity: | S2 (Serious) |
| Version: | 5.7.9, 5.7.11 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
| Tags: | mysqldump | | |
[12 Feb 2016 7:28]
MySQL Verification Team
Hello Emanuel, Thank you for the report and test case. Observed this with 5.7.11 build. Thanks, Umesh
[13 Feb 2016 22:35]
Emanuel Calvo
Hi Umesh, I dug into this a little and traced what's causing the error. The error is raised in innobase/row/row0mysql.cc, when checking the difference between Doc IDs. The problem appears to be that mysqldump does not split the inserts into smaller chunks when backing up tables with an explicit FTS_DOC_ID column. 2016-02-13T22:11:53.125300Z 19 [ERROR] InnoDB: Doc ID 10002 is too big. Its difference with largest used Doc ID 1 cannot exceed or equal to 10000 Regards,
[13 Feb 2016 22:37]
Emanuel Calvo
The code at innobase/row/row0mysql.cc makes this clear:
/* Difference between Doc IDs are restricted within
4 bytes integer. See fts_get_encoded_len() */
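The check that comment describes can be sketched as follows. This is an illustration in Python, not the actual C++ from row0mysql.cc, and the step constant of 10000 is an assumption taken from the wording of the error message above, not a value read from the source tree:

```python
# Illustration only: InnoDB rejects a new Doc ID whose difference from the
# largest Doc ID already used meets or exceeds a fixed step. The constant
# mirrors the "cannot exceed or equal to 10000" error text (assumption).
FTS_DOC_ID_MAX_STEP = 10000

def doc_id_allowed(doc_id, largest_used_doc_id, max_step=FTS_DOC_ID_MAX_STEP):
    """Return True if the insert would be accepted: per the error message,
    the difference "cannot exceed or equal to" the step."""
    return doc_id - largest_used_doc_id < max_step

# The case from the log above: Doc ID 10002 against largest used Doc ID 1.
print(doc_id_allowed(10002, 1))  # rejected: difference 10001 >= 10000
print(doc_id_allowed(10000, 1))  # accepted: difference 9999 < 10000
```

This is why a single extended INSERT covering tens of thousands of sequential IDs fails while the same rows loaded in smaller transactions can pass.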
[12 Oct 2016 16:43]
Emanuel Calvo
Just to confirm: this is actually a serious issue, as it prevents restoring backups of tables that use FTS_DOC_ID. Backups are effectively useless for tables with an explicit FTS_DOC_ID once the Doc ID difference exceeds the encoded limit (rows > 9999 if the ID is assigned sequentially). mysqldump should detect whether FTS_DOC_ID is explicit and dump in smaller batches so as not to overflow the difference limit imposed by fts_get_encoded_len() (innobase/row/row0mysql.cc).
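One conceivable backup-side workaround, sketched under assumptions rather than as a tested procedure: drive mysqldump with its `--where` option so that each invocation covers an FTS_DOC_ID range narrower than the permitted step. The chunk size of 9999 follows the 10000 limit in the error above; the helper only computes the ranges and prints the commands (connection options would need adjusting for a real environment, and this does not help if the stored IDs themselves contain a gap wider than the limit):

```python
def chunk_ranges(min_id, max_id, step=9999):
    """Yield inclusive (lo, hi) FTS_DOC_ID ranges covering at most `step` IDs."""
    lo = min_id
    while lo <= max_id:
        hi = min(lo + step - 1, max_id)
        yield (lo, hi)
        lo = hi + 1

# Print one mysqldump invocation per chunk (illustrative commands only).
for lo, hi in chunk_ranges(1, 25000):
    print(f'mysqldump -t --where="FTS_DOC_ID BETWEEN {lo} AND {hi}" '
          f"test bookContentByLine >> dumps/bcl_chunked.dump")
```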
[1 Nov 2016 2:44]
Brett Gardner
Smaller batches may not necessarily improve things. For example:
1. A record is inserted and receives FTS_DOC_ID = 100.
2. 100,000 further inserts are recorded.
3. Records with IDs 101 -> 80000 are then deleted.
4. Attempting to load a backup from mysqldump will fail, as the gap between the record with ID 100 and the one with ID 80001 is more than the allowed gap.
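That scenario can be sketched as a quick simulation (Python, illustrative only; the 65,535 step limit and the rule that the baseline advances only at commit are assumptions for the purpose of the sketch):

```python
MAX_STEP = 65535  # assumed per-transaction Doc ID step limit

def simulate_restore(doc_ids, batch_size):
    """Replay inserts in commit batches; return the first Doc ID that would
    be rejected, or None if the whole restore would succeed. The baseline
    advances only at commit, mimicking a per-transaction check."""
    largest_committed = 0
    for start in range(0, len(doc_ids), batch_size):
        batch = doc_ids[start:start + batch_size]
        for doc_id in batch:
            if doc_id - largest_committed >= MAX_STEP:
                return doc_id
        largest_committed = max(batch)
    return None

# IDs 101..80000 were deleted, so the dump holds 100 followed by 80001...:
surviving = [100] + list(range(80001, 80101))
print(simulate_restore(surviving, batch_size=1))  # 80001 rejected even row-by-row
```

Even with a batch size of one, the gap between the surviving IDs exceeds the step, so batching alone cannot save such a dump.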
[1 Nov 2016 13:01]
Emanuel Calvo
Brett, this error is expected when using mysqldump, as the split needs to be done at backup time. You can try mydumper/myloader with queries-per-transaction <= 999. It is only a partial workaround, but it should work.
[8 Nov 2016 4:37]
Brett Gardner
I'm confused by your last statement. Isn't it illegal to load the following data,
FTS_DOC_ID, TextContent
1, "Some content"
100000, "Some other content"
as there would be a gap of more than 65,535, regardless of how you attempt to load it? Or do you need to ensure that on reload the FTS_DOC_IDs are reset, in which case they cannot be used as foreign keys in any other table? This situation is allowed to occur during the initial incremental load / deletion because the rows with the interim values still exist; they are merely marked as deleted, so they do not appear in any query results.
[8 Nov 2016 13:39]
Emanuel Calvo
Hi Brett, as specified, the difference between Doc IDs is restricted to a 4-byte integer PER TRANSACTION (the uppercase is mine; the source comment does not make this explicit). You don't need to change the FTS_DOC_ID, although this is a limitation when inserting FTS rows. Regards,
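The per-transaction reading can be illustrated with a small simulation (Python, under two assumptions: the baseline Doc ID advances only on commit, and the step limit is 10000 as in the error message earlier in this report):

```python
MAX_STEP = 10000  # assumed limit, per the "cannot exceed or equal to 10000" error

def restore_succeeds(n_rows, batch_size):
    """Insert sequential Doc IDs 1..n_rows, committing every batch_size rows;
    the allowed gap is measured from the last committed Doc ID."""
    largest_committed = 0
    for start in range(1, n_rows + 1, batch_size):
        batch = range(start, min(start + batch_size, n_rows + 1))
        if max(batch) - largest_committed >= MAX_STEP:
            return False  # this transaction strays too far from the baseline
        largest_committed = max(batch)
    return True

print(restore_succeeds(100732, batch_size=100732))  # one big transaction: False
print(restore_succeeds(100732, batch_size=9999))    # smaller commits: True
```

Under these assumptions, sequential IDs restored in sufficiently small transactions always pass, which is consistent with the mydumper/myloader workaround above; a pre-existing gap wider than the step, as in Brett's example, still fails.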

Description: mysqldump/mysql client restore fails when using FTS_DOC_ID in certain cases. The table structure, row count and sample data are the following:

```
master [localhost] {msandbox} (test) > show create table bookContentByLine\G
*************************** 1. row ***************************
       Table: bookContentByLine
Create Table: CREATE TABLE `bookContentByLine` (
  `FTS_DOC_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `bookid` bigint(20) DEFAULT NULL,
  `content` text,
  PRIMARY KEY (`FTS_DOC_ID`),
  KEY `ftsbookid` (`bookid`),
  FULLTEXT KEY `ftscontent` (`content`)
) ENGINE=InnoDB AUTO_INCREMENT=145296 DEFAULT CHARSET=latin1
1 row in set (0,00 sec)

master [localhost] {msandbox} (test) > select count(*) from bookContentByLine;
+----------+
| count(*) |
+----------+
|   100732 |
+----------+
1 row in set (0,05 sec)

master [localhost] {msandbox} (test) > select FTS_DOC_ID, bookid FROM bookContentByLine WHERE FTS_DOC_ID < 3;
+------------+--------+
| FTS_DOC_ID | bookid |
+------------+--------+
|          1 |  12242 |
|          2 |  12242 |
+------------+--------+
2 rows in set (0,00 sec)
```

Errors:

```
ERROR 182 (HY000) at line 41: Invalid InnoDB FTS Doc ID
Error (Code 182): Invalid InnoDB FTS Doc ID
Error (Code 1030): Got error 182 from storage engine
```

How to repeat:

```
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ master/my sqldump -c -t -e test bookContentByLine > dumps/bcl_onlydata.dump
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m bug < dumps/bcl_onlydata.dump
ERROR 182 (HY000) at line 42: Invalid InnoDB FTS Doc ID
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ master/my sqldump test bookContentByLine > dumps/bcl.dump
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m bug < dumps/bcl.dump
ERROR 182 (HY000) at line 41: Invalid InnoDB FTS Doc ID
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ master/my sqldump --no-data test bookContentByLine | ./m bug
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ master/my sqldump --single-transaction test bookContentByLine | ./m bug
ERROR 182 (HY000) at line 41: Invalid InnoDB FTS Doc ID
mysqldump: Got errno 32 on write
```

Adding the insert (only a few docs are used here for the sake of the length of this post):

```
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m bug
master [localhost] {msandbox} (bug) > INSERT INTO `bookContentByLine` (`FTS_DOC_ID`, `bookid`, `content`) VALUES
(1,12242,'\"Project Gutenberg\'s Poems: Three Series, Complete, by Emily Dickinson\"'),
(2,12242,'\"This eBook is for the use of anyone anywhere at no cost and with\"'),
(3,12242,'\"almost no restrictions whatsoever. You may copy it, give it away or\"'),
(4,12242,'\"re-use it under the terms of the Project Gutenberg License included\"'),
(5,12242,'\"with this eBook or online at www.gutenberg.net\"'),
(6,12242,'\"Title: Poems: Three Series, Complete\"'),
(7,12242,'\"Author: Emily Dickinson\"'),
(8,12242,'\"Release Date: May 3, 2004 [EBook #12242]\"'),
(9,12242,'\"Language: English\"'),
(10,12242,'\"*** START OF THIS PROJECT GUTENBERG EBOOK POEMS: THREE SERIES, COMPLETE ***\"'),
(11,12242,'\"Produced by Jim Tinsley <jtinsley@pobox.com>\"'),
(12,12242,'\"POEMS\"'),
(13,12242,'\"by EMILY DICKINSON\"');
Query OK, 13 rows affected (0,02 sec)
Records: 13  Duplicates: 0  Warnings: 0
```

Please use https://github.com/3manuek/FTS_bug to see the dumps generated. Also, if you want to load the data from the original sources, use the `load.R` module at https://github.com/3manuek/fts_article/blob/master/load.R .

Suggested fix: Doing partial inserts or doing manual inserts works (from the mysql client).
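Until mysqldump batches these inserts itself, an existing dump can also be post-processed to split each extended INSERT into smaller statements before loading. A naive sketch of such a splitter (assumption: the sequence `),(` never appears inside a string literal, which is not guaranteed for arbitrary content, so this is illustrative rather than production-safe):

```python
import re

def split_extended_insert(statement, batch_size):
    """Split one extended INSERT ... VALUES (...),(...); into several
    smaller INSERTs of at most batch_size row tuples each."""
    m = re.match(r"(?s)(INSERT INTO .+? VALUES\s*)\((.*)\);\s*$", statement)
    if m is None:
        return [statement]        # not an extended INSERT; pass it through
    head, body = m.group(1), m.group(2)
    tuples = body.split("),(")    # naive: breaks if '),(' occurs inside a value
    return [
        head + "(" + "),(".join(tuples[i:i + batch_size]) + ");"
        for i in range(0, len(tuples), batch_size)
    ]

stmt = "INSERT INTO `bookContentByLine` VALUES (1,12242,'a'),(2,12242,'b'),(3,12242,'c');"
for piece in split_extended_insert(stmt, 2):
    print(piece)
```

A real fix belongs in mysqldump itself; a robust external splitter would need a proper SQL tokenizer to handle quoting and escapes in the `content` column.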