Bug #12166 Unable to index very large tables
Submitted: 25 Jul 2005 23:03 Modified: 31 Oct 2005 19:08
Reporter: Serhiy Polyakov Email Updates:
Status: Closed Impact on me:
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.15-BK, 5.0.9-beta-Max 64-bit OS:Linux (Linux Fedora Core 4)
Assigned to: Ingo Strüwing

[25 Jul 2005 23:03] Serhiy Polyakov
Bug report: unable to index very large tables.

I have this problem on the tables of 26GB (996,010,215 rows) and larger. No problems on the table of 2.7GB (56,177,383 rows).

System: CPU - Opteron 64-bit, RAM - 4GB, OS - Fedora Core 4, file system for /var/lib/mysql - xfs, MySQL 5.0.9-beta-Max 64-bit.

I have this problem bouth on MySQL 5.0.9-beta-Max and "not Max" server.

I will have 2 users and they will need SELECT only (no update or delete). At this moment I do not have any users.

I increased all buffers to max but anyway computer never uses more than 3GB of RAM... mostly 1.5GB.

my.cnf file:
max_allowed_packet = 128M
binlog_cache_size = 1M
sort_buffer_size = 2048M
join_buffer_size = 128M
thread_concurrency = 2
query_cache_size = 640M
query_cache_limit = 32M
thread_stack = 512K
tmp_table_size = 640M
key_buffer_size = 1024M
read_buffer_size = 320M
read_rnd_buffer_size = 512M
bulk_insert_buffer_size = 640M
myisam_sort_buffer_size = 3072M
myisam_max_sort_file_size = 300G
myisam_max_extra_sort_file_size = 100G

Tables are not compressed and do not have any indexes. I need only regular indexes (no primary no fulltext, etc.) Data types are CHAR, VARCHAR and variations of INT.

No problems when I run indexing on 2.7GB table:
ALTER TABLE ‘TableName’ ADD INDEX (ColumnName);
it creates copies of files like:
#sql-7ec-21.MYI (1024 bytes originally)
does processing in memory, write index into #sql-7ec-21.MYI, copies those files back to original files:
NO problems here.

When I run indexing on 26GB table (996,010,215 rows):
ALTER TABLE ‘TableName’ ADD INDEX (ColumnName);
mysql performs the same steps except after copying TableName.MYD into #sql-7ec-21.MYD it starts building  #sql-7ec-21.TMD file and builds it in parallel with building #sql-7ec-21.MYI file

After 20 hours or so:
ERROR 1034 (HY000): Number of rows changed from 522360787 to 996010215
The size of #sql-7ec-21.TMD was almost equal (or equal) to the size of #sql-7ec-21.MYD file at this moment. MySQL deletes all temporary files at this moment and stops the task.

The second way I tried to index the big table: I created empty table which has the same structure as original table I wanted to index and added indexes to the empty table.
INSERT INTO Table_Indexed (Column1, Column2,...) SELECT (Column1, Column2,...) FROM Table_NotIndexed;

MySQL builds Table_Indexed.MYD file, then after its size equals the size of Table_NotIndexed.MYD it starts building Table_Indexed.MYI and Table_Indexed.TMD files. The same error after 84 hours of work:
ERROR 1034 (HY000): Number of rows changed from 522360787 to 996010215

Table_Indexed.MYI files became 75GB in size and has not been deleted when error happened. MySQL shows that index exists but cardinality is NULL (phpmyadmin show cardinality none for the indexes). I guess this index is invalid (not completed)

Please let me know if I need to add more details.

How to repeat:
Have a table of ~> 20GB and try to index at least one column.
[26 Jul 2005 5:11] Serhiy Polyakov
I forgot to add that I am using MyISAM tables. I will try to convert to InnoDB.

[28 Jul 2005 18:54] Jorge del Conde

I was unable to reproduce this problem.  Can you please send me a reproducible test case that shows this behaviour ?


BTW, I tested it under FC4 - 64bits - AMD64
[28 Jul 2005 20:08] Serhiy Polyakov

When you say "case" do you mean send you the table I could not index or more detailed description of the error?

Also, I can send you the structure of the table and first 10-20 records, you may see the nature of data, the rest are very similar.

BTW I reloaded MyISAM tables to InnoDB and indexing works OK.

[31 Jul 2005 21:07] Jorge del Conde

By 'test-case' I meant if you could give us something such as files, set of instructions, or anything that might help us reproduce this bug.

Giving us the table with the first few records  would definitely be a great start :)

[2 Aug 2005 4:00] Serhiy Polyakov

I attached (as a file) two SQL statements to create the table and insert records. I have 75 records in this example but real number in my table is 1,062,844,989. Spaces, including trailing, have a meaning in my database that is why I am using varchar. My table size is 21.5 GB. 

Problem description

I need to add index on all columns. I tried three different ways. Eventually I found one way around this problem (way 3 below). First and second give me an error.

(Way 1)
I start indexing with one smallest column. When I run:
ALTER TABLE Control_008 ADD INDEX (PositionGroupCounter);
After some 15-20 hours or so:
ERROR 1034 (HY000): Number of rows changed from 1006632866 to 1062844989.

I could see (before error happened) that MySQL creates complete copies of all table files to temporary files and starts building index file against all data set stored in temporary file.

(Way 2)
I create table of the same structure with indexes and insert data from non-indexed table to this new table:

CREATE TABLE `Control_008_Indexed` (
  ControlNumber varchar(20), INDEX(ControlNumber),
  PositionGroupCounter tinyint, INDEX(PositionGroupCounter),
  PositionDataValue varchar(20), INDEX(PositionDataValue)

INSERT INTO `Control_008_Indexed` SELECT * from `Control_008`;

The same error...

I could see that MySQL creates new Control_008_Indexed.MYD (size = size of original) file and only than starts building MYI file. Also it builds TMD file in parallel with MYI file.

(Way 3)
Create table of the same structure with indexes and inset data from non-indexed table part by part in three chunks:

CREATE TABLE `Control_008_Indexed` (
  ControlNumber varchar(20), INDEX(ControlNumber),
  PositionGroupCounter tinyint, INDEX(PositionGroupCounter),
  PositionDataValue varchar(20), INDEX(PositionDataValue)

INSERT INTO `Control_008_Indexed ` SELECT * from `Control_008 ` LIMIT 0,360000000;
INSERT INTO `Control_008_Indexed ` SELECT * from `Control_008 ` LIMIT 360000000,360000000;
INSERT INTO `Control_008_Indexed ` SELECT * from `Control_008 ` LIMIT 720000000,360000000;

I can see that MySQL creates new Control_008_Indexed.MYD (size = 1/3 size of original) file, than starts building MYI file. Then again it adds next records chunk to Control_008_Indexed.MYD and continue to build index.

However when I tried to inset data in two chunks (number of record in each is ~500,000,000 instead of 360,000,000) *.TMD file have been created. I stopped this process because I am sure error will happen.

So, I think this error is happening when MySQL tries to index > 360,000,000 records at once (it is about 7-8GB of data).

[2 Aug 2005 4:01] Serhiy Polyakov

Attachment: BigTable.sql (text/x-sql), 2.30 KiB.

[29 Aug 2005 9:53] Sergei Golubchik
Is there anything in the error log ?
[31 Aug 2005 17:42] Serhiy Polyakov
I know only one error log 
It does not contain any errors (I guess this is not error log I should look for)

I found yet another way around the problem. Split big tables to smaller ones (not more than 7GB each, index them, use MERGE engine to create “union” tables.

[25 Sep 2005 9:03] Valerii Kravchuk

I have several questions to you before even try to repeat the problem you described (looks like it will take many hours of CPU and disk resources):

1. You experienced this problem with table larger that 7Gb, right? So, I assume 7-8 Gb will be enough for the verification purposes.

2. Is it a table size- or number of rows-related problem, from your point of view?

3. How your column data are distributed? Are they unique or how many uniqie values do you really have? I need this information to be able to create a reasonable data for the test.

4. Have you tried to use newer version of MySQL? 5.0.12-beta or 5.0.14-rc? I will test on the latest versions only.

5. Please, send the results of df -k command (I wonder if you have anough space for creating temporary files for such a big indexes creation).

6. Please, send the content of your my.cnf file, just to be sure I'll use the same key server variables values.

7. What are the results of ulimit -a command performed by the user running mysqld?

And yes, <hostname>.log in the data directory is the error log you were asked about. So, do you have anything unusual in this file? You may just send a part of it for the period when you tried to create indexes.
[30 Sep 2005 14:00] Willem van Pelt
On our server (MySQL 5.0.13-rc-Max-log, installed from RPM), we were experiencing the same problem using much smaller datasets.

When adding an index to a table containing 13,592,647 records (table size 480 MB), I got the following error:

#1034 - Number of rows changed from 11842702 to 13592647

Also, when inserting a large number of records into a table that already contained an index, I got this error.

Since reproducing the problem took just 60 seconds, I was able to experiment with it. My observations:
1) After starting the 'add index' command, the disk space on /serverdata (where our mysql data are stored) gradually decreases.
2) Just a couple of seconds before the error occurs, the available space in /tmp decreases heavily.

My guess was that somewhere at the end of the process, the temporary index file is copied from /serverdata to /tmp (this can only be seen using df, since the file does not appear in the directory listing).

After increasing the size of /tmp, I was able to create an index. The new index had a size of 122 MB. When the error occurred, the free space on /tmp was just 100 MB.

As a result of this experiment, I have set the tmpdir in my.cnf to a tmp directory in /serverdata. This also significantly improved the time needed to index the table (from ~60 to ~30 seconds).
[1 Oct 2005 3:50] Serhiy Polyakov

Recently I realized that probably it was the problem with space in tmp directory. I have enough space on mysql data directory but tmp is on another drive and I had only 10GB free there (and I had problems with tables >10GB).

After Willem van Pelt posted the message I am almost sure about that. I have seen that temp file was built in mysql data dir but could not catch that MySQL copies that file to tmp dir at the end.

So, Valeriy, I am not answering all your questions right now. I will try indexing with more space in temp and inform if problem was fixed.
Thanks Valeriy and Willem van Pelt.

[1 Oct 2005 5:41] Valerii Kravchuk
Nice to know that my questions helped you to understand possible reason of your problem. Please, inform me about the results of your experiments using the adequate temporary space.

I'll mark this report as "Need Feedback" while waiting for your results.
[2 Oct 2005 2:11] Serhiy Polyakov

I was able to index table of 13.5GB after I increased space in /tmp (it was ~10GB, now it is ~40G). Previously I had error like:

ERROR 1034 (HY000): Number of rows changed from NNNNNNNN to NNNNNNNN.

So, error was caused by lack of tmp disk space. It has to be larger  than table that needs to be indexed. In my case /tmp is on different drive than mysql data dir. And of course, mysql data dir space has to be large enough to store that temp file at the beginning plus index.

Thanks for your help,
[2 Oct 2005 8:49] Valerii Kravchuk
Great! So, can I mark this report as "Not a bug" now?
[2 Oct 2005 19:34] Serhiy Polyakov
Yes, this can be marked as "no bug".


Of course, it would be good if error message informs about luck of space in tmp instead of what is says. However, I guess this is different story and separate report should be opened for this kind of problem (if any).
[6 Oct 2005 16:53] Sergei Golubchik
I agree, it there's not enough space MySQL should say so
[9 Oct 2005 14:15] Valerii Kravchuk
I was able to repeat the situation described in the original report with a simple scenario.

1. I left small free space in /tmp (see df -k results later):

[openxs@Fedora openxs]$ df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda2              2110968   2001828      1908 100% /
/dev/hda3              2079312   1888860     84828  96% /home
none                    111560         0    111560   0% /dev/shm

Not more than 1908 K in /tmp (and in /). Note, that I have some space (enough) in /home, where my data directory is localted. (If there is lack of space in it too, the other story begins - looks like infinit "Repair with keykache" state for process, but it is a separate issue.)

2. Then I started server and make it use /tmp as temporary directory:

[openxs@Fedora openxs]$ cd dbs/5.0/
[openxs@Fedora 5.0]$ bin/mysqld_safe --tmpdir=/tmp &
[1] 1364
[openxs@Fedora 5.0]$ Starting mysqld daemon with databases from /home/openxs/dbs/5.0/var

[openxs@Fedora 5.0]$ bin/mysql -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 5.0.15-rc

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create table large(c1 int auto_increment primary key, c2 char(100));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into large(c2) values('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbccccccccccccccccccdddddddddddddddddee');
Query OK, 1 row affected (0.00 sec)

mysql> insert into large(c2) select c2 from large;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> insert into large(c2) select c2 from large;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0


mysql> insert into large(c2) select c2 from large;
Query OK, 16384 rows affected (0.78 sec)
Records: 16384  Duplicates: 0  Warnings: 0

mysql> update large set c2=concat(c1,c2);
Query OK, 32768 rows affected (1.24 sec)
Rows matched: 32768  Changed: 32768  Warnings: 0

mysql> insert into large(c2) select c2 from large;
Query OK, 32768 rows affected (1.62 sec)
Records: 32768  Duplicates: 0  Warnings: 0

mysql> insert into large(c2) select c2 from large;
Query OK, 65536 rows affected (4.21 sec)
Records: 65536  Duplicates: 0  Warnings: 0

mysql> alter table large add index (c2);
ERROR 1034 (HY000): Number of rows changed from 74897 to 131072

So, that is it. This error message is a bug by itself. No messages about not enough free space etc. 

I've used the ChangeSet@1.2019.1.1, 2005-10-06 22:09:15+03:00, jani@ua141d10.elisa.omakaista.fi on Fedora Core 1:

[openxs@Fedora 5.0]$ uname -a
Linux Fedora 2.4.22-1.2115.nptl #1 Wed Oct 29 15:42:51 EST 2003 i686 i686 i386 GNU/Linux
[20 Oct 2005 8:19] Ingo Strüwing
Shell script for repeating the error.

Attachment: bug12166-1.sh (application/x-sh, text), 4.11 KiB.

[20 Oct 2005 9:39] Ingo Strüwing
It seems like our mail server has problems currently. So I set the state manually. The outstanding commit mail will probably be:
bk commit - 5.0 tree (ingo:1.2048) BUG#12166, Date: Thu, 20 Oct 2005 10:31:19 +0200
[26 Oct 2005 12:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

[27 Oct 2005 8:03] Ingo Strüwing
The error message in question was wrongly emitted. The add index operation was successful, but the error code resulted in dropping the freshly created table and leaving the old table (which did not have the index) in place. The fix suppresses the error message so that everything is successful now. The only problem in this situation is that adding the index is tried twice. The first try uses a temporary sort file, which hits the disk limit and thus fails. The second try uses a slower but safer method (without a temporary file) and succeeds. The error messages of the first try don't show up as errors any more, but can still be seen with SHOW WARNINGS:

alter table t1 add index (c2)

show warnings

Level   Code    Message
Error   3       Error writing file '/mnt/tmpfs1/STJVGNj7' (Errcode: 28)
Error   1034    28 when fixing table
Error   1034    Number of rows changed from 74897 to 262144

show create table t1

Table   Create Table
t1      CREATE TABLE `t1` (\n  `c1` int(11) NOT NULL auto_increment,\n
`c2` char(100) default NULL,\n  PRIMARY KEY  (`c1`),\n  KEY `c2`
[31 Oct 2005 11:40] Ingo Strüwing
Pushed to 5.0.16.
[31 Oct 2005 19:08] Paul Dubois
Noted in 5.0.16 changelog.