Bug #32349 possible incorrect work with MyISAM indexes in mysqld repair (5.1.21)
Submitted: 13 Nov 2007 23:26 Modified: 4 May 2008 8:34
Reporter: Serge Yakubovich Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S2 (Serious)
Version:5.1.21,5.1.22rc, 5.1.23-rc OS:Linux
Assigned to: CPU Architecture:Any

[13 Nov 2007 23:26] Serge Yakubovich
Description:
Hi,
found some strange situation with indexes - their size and speed

I have a series of tables like this:

mysql> show create table CS_0000;

| CS_0000 | CREATE TABLE `CS_0000` (
  `CHKSUM` binary(16) NOT NULL,
  KEY `KEY_CHKSUM` (`CHKSUM`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

Each have exactly 50,000,000 rows, CHKSUM is binary MD5 sum of some data used to
eliminate duplicates, i.e is unique
First tables data and index files look like:

-rw-rw---- 1 mysql mysql  850000000 Oct 18 21:59 /G1/mysql5121/SEARCH/CS_0000.MYD
-rw-rw---- 1 mysql mysql 1146311680 Oct 18 22:00 /G1/mysql5121/SEARCH/CS_0000.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 18 21:40 /G1/mysql5121/SEARCH/CS_0001.MYD
-rw-rw---- 1 mysql mysql 1146323968 Oct 18 21:43 /G1/mysql5121/SEARCH/CS_0001.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 18 23:42 /G1/mysql5121/SEARCH/CS_0002.MYD
-rw-rw---- 1 mysql mysql 1146326016 Oct 18 23:43 /G1/mysql5121/SEARCH/CS_0002.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 18 23:45 /G1/mysql5121/SEARCH/CS_0003.MYD
-rw-rw---- 1 mysql mysql 1146331136 Oct 18 23:46 /G1/mysql5121/SEARCH/CS_0003.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 18 23:47 /G1/mysql5121/SEARCH/CS_0004.MYD
-rw-rw---- 1 mysql mysql 1146319872 Oct 18 23:47 /G1/mysql5121/SEARCH/CS_0004.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 22 20:18 /G1/mysql5121/SEARCH/CS_0005.MYD
-rw-rw---- 1 mysql mysql 1146319872 Oct 22 20:18 /G1/mysql5121/SEARCH/CS_0005.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 22 20:39 /G1/mysql5121/SEARCH/CS_0006.MYD
-rw-rw---- 1 mysql mysql 1146314752 Oct 22 20:39 /G1/mysql5121/SEARCH/CS_0006.MYI
-rw-rw---- 1 mysql mysql  850000000 Oct 22 20:53 /G1/mysql5121/SEARCH/CS_0007.MYD
-rw-rw---- 1 mysql mysql 1146330112 Oct 22 20:54 /G1/mysql5121/SEARCH/CS_0007.MYI
...
BUT: from some moment of time, last created tables looks different:

-rw-rw---- 1 mysql mysql  850000000 Nov  4 02:06 /G1/mysql5121/SEARCH/CS_0015.MYD
-rw-rw---- 1 mysql mysql 1116192768 Nov  4 02:25 /G1/mysql5121/SEARCH/CS_0015.MYI
-rw-rw---- 1 mysql mysql  850000000 Nov  8 17:59 /G1/mysql5121/SEARCH/CS_0016.MYD
-rw-rw---- 1 mysql mysql 1116175360 Nov  8 18:05 /G1/mysql5121/SEARCH/CS_0016.MYI

Note relatively big difference in index sizes - about 30M

At the same time, there is significant difference in speed of queries using theese indexes,
for example:

mysql> create temporary table C like CS_0000;
Query OK, 0 rows affected (0.00 sec)

mysql> insert C select CHKSUM from CS_0000 order by CHKSUM limit 100000;
Query OK, 100000 rows affected (0.46 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql> delete C from C,CS_0000 where CS_0000.CHKSUM=C.CHKSUM;
Query OK, 100000 rows affected (1.80 sec)

mysql> insert C select CHKSUM from CS_0016 order by CHKSUM limit 100000;
Query OK, 100000 rows affected (0.62 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql> delete C from C,CS_0016 where CS_0016.CHKSUM=C.CHKSUM;
Query OK, 100000 rows affected (4 min 43.01 sec)

Time of delete queries differ in 260 times in this sample (no other load at the moment)
SHOW INDEX show no difference:
mysql> show index from CS_0000;
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| CS_0000 |          1 | KEY_CHKSUM |            1 | CHKSUM      | A         |        NULL |     NULL | NULL   |      | BTREE      | NULL    |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
1 row in set (0.03 sec)

mysql> show index from CS_0016;
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| CS_0016 |          1 | KEY_CHKSUM |            1 | CHKSUM      | A         |        NULL |     NULL | NULL   |      | BTREE      | NULL    |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
1 row in set (0.00 sec)

Any REPAIR / OPTIMIZE / ALTER TABLE recreations of "short" indexes from mysqld gives no effect,
i.e gives the same results (all of them show 'repair whith 1 thread' status in SHOW PROCESSES

But, after myisamchk:
# /usr/local/mysql5121/bin/myisamchk -r -q /G1/mysql5121/SEARCH/CS_0016.MYI
- check record delete-chain
- recovering (with sort) MyISAM-table '/G1/mysql5121/SEARCH/CS_0016.MYI'
Data records: 50000000
- Fixing index 1

query speed improves to normal value, although index size does not change:

mysql> create temporary table C like CS_0000;
Query OK, 0 rows affected (0.00 sec)

mysql> insert C select CHKSUM from CS_0016 order by CHKSUM limit 100000;
Query OK, 100000 rows affected (0.41 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql> delete C from C,CS_0016 where CS_0016.CHKSUM=C.CHKSUM;
Query OK, 100000 rows affected (1.30 sec)

And, with again REPAIR TABLE we return again to slow query

I can't understand what caused such a change in mysqld behavior
Sytem was running yum_updated, i.e there were possible some updates on system components, like libs - but mysqld was bilt statically:

# file /usr/local/mysql5121/bin/mysqld
/usr/local/mysql5121/bin/mysqld: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, statically linked, for GNU/Linux 2.6.9, not stripped

And recompiling of mysqld give no results also
Should try 5.1.22-rc tomorrow

How to repeat:
Here is a litlle PHP script to reproduce a table with similar structure and data,
I get similar results as above with it
-----------------------------------------
#!/usr/bin/php
<?
define('SQLHOST',       'localhost:/var/lib/mysql5121/mysql5121.sock');
define('SQLUSER',       'root');
define('SQLPASS',       '');
define('SQLDB',         'test');

function sql_conn($h=SQLHOST,$u=SQLUSER,$p=SQLPASS,$d=SQLDB) {
        $SQLLINK=mysql_pconnect($h,$u,$p)       or die('Error connecting to MySQL');
        mysql_select_db($d,$SQLLINK)            or die('Error selecting MySQL DB: '.mysql_error());
}
function q($Q) { return mysql_query($Q); }

ini_set('memory_limit', '512M');
set_time_limit(0);
error_reporting(E_ALL);
sql_conn();

$TMPI='CS_TEST';
q("drop table if exists $TMPI");
q("create table $TMPI (CHKSUM binary(16), key KEY_CHKSUM (CHKSUM))");
q("alter table $TMPI disable keys");
for ($n=0; $n<1000; $n++) {
        for ($s='', $i=0; $i<50000; $i++) $s.='(0x'.md5($n*50000+$i).'),';
        q("insert $TMPI values".rtrim($s,','));
}
q("alter table $TMPI enable keys");
?>
----------------------------------------------

some system/install info:

>Release:       mysql-5.1.21-beta-max (Source distribution)

>C compiler:    gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
>C++ compiler:  gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
>Environment:
        <machine, os, target, libraries (multiple lines)>
System: Linux bandura 2.6.18-8.el5 #1 SMP Thu Mar 15 19:46:53 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
Architecture: x86_64

Some paths:  /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.1 20070105 (Red Hat 4.1.1-52)
Compilation info: CC='gcc'  CFLAGS='-O3'  CXX='gcc'  CXXFLAGS='-O3 -felide-constructors -fno-exceptions -fno-rtti'  LDFLAGS=''  ASFLAGS=''
LIBC:
lrwxrwxrwx 1 root root 11 Jul  4 20:31 /lib/libc.so.6 -> libc-2.5.so
-rwxr-xr-x 1 root root 1576952 Mar 14  2007 /lib/libc-2.5.so
Configure command: ./configure '--with-mysqld-ldflags=-all-static' 'prefix=/usr/local/mysql5121' '--localstatedir=/usr/local/mysql5121/data' '--libexecdir=/usr/local/mysql5121/bin' '--with-extra-charsets=complex' '--with-server-suffix=-max' '--enable-thread-safe-client' '--enable-local-infile' '--enable-assembler' '--with-berkeley-db' '--with-archive-storage-engine' '--with-blackhole-storage-engine' '--with-csv-storage-engine' '--with-example-storage-engine' '--with-federated-storage-engine' '--with-innodb' 'CC=gcc' 'CFLAGS=-O3' 'CXXFLAGS=-O3 -felide-constructors -fno-exceptions -fno-rtti' 'CXX=gcc'

Suggested fix:
myisamchk -r ... fixes queries speed
[25 Nov 2007 17:18] Valeriy Kravchuk
Thank you for a problem report. Please, inform about your results with 5.1.22-rc.
[26 Dec 2007 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[6 Feb 2008 0:34] Serge Yakubovich
Hi,

sorry for long dalay, but coming back to the problem.
Have installed (from sources) 5.1.22-rc
Same results.

Below is my test. I create tables M (100000 recs) as a part of large table MESSAGES_0000, and CS (50000000 recs) as all MD5 control sums from the same MESSAGES_0000. I'm testing join delete of CHKSUM duplicates in M against CS. First test - with newly created CS - takes 7 min 12.48 sec
Second - after myisamchk -r CS - takes only 7.72 sec, normal value as for me :)
Note - in both cases I've preloaded both M and CS indexes in memory ( having 32G RAM I'm sure they fit :) - and one CPU was 100% loaded by delete thread both times, i.e no disk IO practically occured

[root@bandura LOAD]# ../DB.sh
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 5.1.22-rc-max-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table M like MESSAGES_0000;
Query OK, 0 rows affected (0.03 sec)

mysql> insert M select * from MESSAGES_0000 limit 100000;
Query OK, 100000 rows affected (4.83 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql> create table CS like CS_0000;
Query OK, 0 rows affected (0.02 sec)

mysql> insert CS select CHKSUM from MESSAGES_0000 order by CHKSUM;
Query OK, 50000000 rows affected (7 min 59.72 sec)
Records: 50000000  Duplicates: 0  Warnings: 0

mysql> load index into cache M,CS;
+-----------+--------------+----------+----------+
| Table     | Op           | Msg_type | Msg_text |
+-----------+--------------+----------+----------+
| SEARCH.M  | preload_keys | status   | OK       |
| SEARCH.CS | preload_keys | status   | OK       |
+-----------+--------------+----------+----------+
2 rows in set (4.74 sec)

mysql> show create table M;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                                            |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| M     | CREATE TABLE `M` (
  `MSGID` bigint(20) NOT NULL AUTO_INCREMENT,
  `CHKSUM` binary(16) NOT NULL,
  `SRCID` binary(16) NOT NULL,
  `INSDATE` int(11) DEFAULT NULL,
  `TYPE` smallint(6) DEFAULT NULL,
  `MSGBODY` mediumtext,
  PRIMARY KEY (`MSGID`),
  UNIQUE KEY `KEY_CHKSUM` (`CHKSUM`),
  KEY `KEY_SRCID` (`SRCID`)
) ENGINE=MyISAM AUTO_INCREMENT=100001 DEFAULT CHARSET=latin1 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table CS;
+-------+--------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                             |
+-------+--------------------------------------------------------------------------------------------------------------------------+
| CS    | CREATE TABLE `CS` (
  `CHKSUM` binary(16) NOT NULL,
  KEY `KEY_CHKSUM` (`CHKSUM`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select count(*) from M;
+----------+
| count(*) |
+----------+
|   100000 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from CS;
+----------+
| count(*) |
+----------+
| 50000000 |
+----------+
1 row in set (0.00 sec)

mysql> delete M from M,CS where CS.CHKSUM=M.CHKSUM;
Query OK, 100000 rows affected (7 min 12.48 sec)

mysql> flush table CS;
Query OK, 0 rows affected (0.72 sec)

mysql> Bye
[root@bandura LOAD]# /usr/local/mysql5122/bin/myisamchk -r /var/lib/mysql5122/SEARCH/CS.MYI
- recovering (with sort) MyISAM-table '/var/lib/mysql5122/SEARCH/CS.MYI'
Data records: 50000000
- Fixing index 1
[root@bandura LOAD]# ../DB.sh
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4 to server version: 5.1.22-rc-max-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> truncate table M;
Query OK, 0 rows affected (0.12 sec)

mysql> insert M select * from MESSAGES_0000 limit 100000;
Query OK, 100000 rows affected (4.56 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql> load index into cache M,CS;
+-----------+--------------+----------+----------+
| Table     | Op           | Msg_type | Msg_text |
+-----------+--------------+----------+----------+
| SEARCH.M  | preload_keys | status   | OK       |
| SEARCH.CS | preload_keys | status   | OK       |
+-----------+--------------+----------+----------+
2 rows in set (3.55 sec)

mysql> delete M from M,CS where CS.CHKSUM=M.CHKSUM;
Query OK, 100000 rows affected (7.72 sec)

And, yet another thing I encountered while testing around.

In mysql 5.1.21:

mysql> create table M ( CHKSUM binary(16), SRCID binary(16), INSDATE int(11), TYPE smallint, MSGBODY mediumtext, unique key KEY_CHKSUM (CHKSUM));
Query OK, 0 rows affected (0.00 sec)

mysql> insert M select CHKSUM,SRCID,INSDATE,TYPE,MSGBODY from MESSAGES_0019 limit 5000000;
Query OK, 5000000 rows affected (3 min 30.37 sec)
Records: 5000000  Duplicates: 0  Warnings: 0

mysql> delete M from M, MESSAGES_0019 where MESSAGES_0019.CHKSUM=M.CHKSUM;
Query OK, 4834957 rows affected (3 min 39.13 sec)

          ^^^^^^^^^^^^!!!!!!!!!

Must be as I suppose - 5000000. Actually, all 5000000 recs was deleted, so just wrong affected rows were reported:

mysql> select count(*) from M;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

But, in 5.1.22-rc this works correct:

mysql> truncate table M;
Query OK, 0 rows affected (7.38 sec)

mysql> insert M select CHKSUM,SRCID,INSDATE,TYPE,MSGBODY from MESSAGES_0019 limit 5000000;
Query OK, 5000000 rows affected (3 min 0.28 sec)
Records: 5000000  Duplicates: 0  Warnings: 0

mysql> delete M from M, MESSAGES_0019 where MESSAGES_0019.CHKSUM=M.CHKSUM;
Query OK, 5000000 rows affected (5 min 43.18 sec)

Can't you pls tell me - was this bug(?) reported and fixed in 5.1.22, so we can relay on affected rows ?
I can't find it in bugs list

WBRG, Serge
[14 Feb 2008 0:56] Serge Yakubovich
Hi again!

Some additional information ...

I've compared index files after SQL level "REPAIR TABLE CS_TEST QUICK"
(CS_TEST.MYI_BAD) and "myisamchk -r -q..." (CS_TEST.MYI)
Quite interesting. Differences are in few bytes for 1.1Gb index files...
Here they are

------------------

[root@bandura LOAD]# cmp -l /F1/mysql5121/test/CS_TEST.MYI CS_TEST.MYI_BAD
        27  50  40
       111 103 165
       112  10  12
       116   4 125
       183 214 210
       184  57 166
       193   0   2
       194   0 372
       195   0 360
       196   0 200
[root@bandura LOAD]#

------------------

For reference, "cmp -l" means "Print the byte number (decimal)
and the differing byte values (octal) for each difference."

It is for server version 5.1.22rc
[14 Feb 2008 1:18] Serge Yakubovich
And, forget to mention:

results of .MYI files comparison are produced after the following php script
( it slightly differ from previous in that I changed "binary(16)" type to "char(16) binary" and added replacement of '00' in md5 sum to '22'
to avid binary zero bytes in field ). Results of tests are the same - after "REPAIR TABLE..." join DELETE goes minutes ( at least >8, I was not waiting more ) while after myisamchk -r ... - several seconds (like 4.5s)

----------------
#!/usr/bin/php
<?
define('SQLHOST',       'localhost:/var/lib/mysql5121/mysql5122.sock');
define('SQLUSER',       'root');
define('SQLPASS',       'RootPasswd');
define('SQLDB',         'test');

function sql_conn($h=SQLHOST,$u=SQLUSER,$p=SQLPASS,$d=SQLDB) {
        $SQLLINK=mysql_pconnect($h,$u,$p)       or die('Error connecting to MySQL');
        mysql_select_db($d,$SQLLINK)            or die('Error selecting MySQL DB:
'.mysql_error());
}
function q($Q) { return mysql_query($Q); }

ini_set('memory_limit', '512M');
set_time_limit(0);
error_reporting(E_ALL);
sql_conn();

$TMPI='CS_TEST';
q("drop table if exists $TMPI");
q("create table $TMPI (CHKSUM char(16) binary, key KEY_CHKSUM (CHKSUM))");
q("alter table $TMPI disable keys");
for ($m=$n=0; $n<500; $n++) {
        for ($s='', $i=0; $i<100000; $i++) $s.='(0x'.md5($m++).'),';
        q("insert $TMPI values".strtr(rtrim($s,','),array('00'=>'22')));
}
q("alter table $TMPI enable keys");
?>
-------------------

Waiting to hear somthing from you :)

It's important for me as I'm developing/supporting some kind o search system with currently 1.1 gigarecords ( >1.6Tb ) in database (1.6Tb)

WBRG, Serge
[14 Feb 2008 2:18] Serge Yakubovich
Sorry, when getting 'cmp -l' results, I did not issue FLUSH TABLE
after REPAIR TABLE... so reslts are somewhat incorrect.
Now correct ones ( after even stopping server to be sure)

--------------------
[root@bandura LOAD]# cmp -l /F1/mysql5121/test/CS_TEST.MYI CS_TEST.MYI_BAD
        27  50  40
       111 106 165
       112 253  12
       116   4 124
       183 242 236
       184 135 353
       193   0   2
       194   0 372
       195   0 360
       196   0 200
[root@bandura LOAD]#
--------------------
[19 Feb 2008 21:32] Serge Yakubovich
Hi,

IS ANYBODY HERE ? ;)

The same situation is with 5.1.23rc...
[3 Mar 2008 13:29] Susanne Ebrecht
How did you upgrade from older versions of MySQL 5.1 to MySQL 5.1.23-rc?
Did you do the steps of:
http://dev.mysql.com/doc/refman/5.1/en/upgrading-from-5-0.html

You have to do this by upgrading from older 5.1 versions too because 5.1 is still not a stable version and lots changed between earlier versions and today version.
[3 Apr 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[4 Apr 2008 8:34] Susanne Ebrecht
Serge,

we still need to know if all work fine after you did the update in the right way.
[4 May 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".