Bug #1253 Mysqld crash/lockup
Submitted: 11 Sep 2003 11:18 Modified: 18 Sep 2003 10:33
Reporter: Dave Dyer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S1 (Critical)
Version:4.0.14 OS:Windows (windows 2000)
Assigned to: CPU Architecture:Any

[11 Sep 2003 11:18] Dave Dyer
Description:
 OS is Win2k, Dual Processor

 Mysqld encounters an assertion failure, entry in the server log is

Assertion failed: pos != (~(my_off_t) 0), file c:\build\build\mysql-4.0.14\mysys\my_seek.c, line 31

 It is likely that multiple processes are simultaneously updating
the same records of this table.  After the failure, the client
processes are hung (presumably waiting for response from mysqld)

 The table apparently being updated has this definition.
CREATE TABLE textentry (
  number char(32) NOT NULL default '',
  pagenum int not null,
  textidx int not null,
  texttype enum('Normal','Portrait') NOT NULL default 'Normal',
  textstring text not null default '',
  textlft int not null,
  texttop int not null,
  textrgt int not null,
  textbot int not null,
  pointsize int not null,
  alignment enum('Left','Center','Right','Justified') NOT NULL default 'Left',
  attributes enum('Regular','Bold','Italic','Bold-Italic') NOT NULL default 'Regular',
  previewed enum('No','Yes') NOT NULL default 'No',
  replaced enum('No','Yes') NOT NULL default 'No',
  textcolor char(6) not null default '000000',		# hex rrggbb	
  font char(50) not null default 'Times',
  timecode tinytext,				# to append to html url's for browser cacheing
  subindex int,						# used by batch flow
  uid int,							# for portraits - click on a caption and get the picture table 
  sync_date datetime default '0000-00-00 00:00:00',
  changed timestamp(14) NOT NULL,
  KEY number (number)
) TYPE=MyISAM;

How to repeat:

No formula, but has happened 3 times that we know of.
[11 Sep 2003 13:35] MySQL Verification Team
Not enough information was provided for us to be able
to handle this bug. Please re-read the instructions at
http://bugs.mysql.com/how-to-report.php

If you can provide more information, feel free to add it
to this bug and change the status back to 'Open'.

Thank you for your interest in MySQL.

Need a test case.
[11 Sep 2003 14:04] Dave Dyer
We're trying to develop more information, but it would help us if
you provided any background that is available about this problem.

Should I deduce from your generalized response that this is not a known 
problem? (ie; the assertion in my_seek has never failed before?)
[11 Sep 2003 15:29] MySQL Verification Team
No I don't remember about similar case reported.
However even the same code is showed as the source of the assertion
failure there are chance for to have another reason for. So for your
specific case we need a test case, that we can reproduce.
[12 Sep 2003 5:25] Sergei Golubchik
do you have RAID feature compiled in ? (show variables like 'have_raid')

did you check your tables for corruption ?

what was the query that triggered the assertion ?
[12 Sep 2003 9:27] Dave Dyer
HAVE_RAID has value NO, but the systems in question do have RAID disks.

There is some evidence of corruption, but I think that is a sympom
rather than a cause.  
mysql> check table textentry;
+--------------------+-------+----------+---------------------------------------------------+
| Table              | Op    | Msg_type | Msg_text                                          |
+--------------------+-------+----------+---------------------------------------------------+
| moon2003.textentry | check | warning  | Size of indexfile is: 53248      Should be: 47104 |
| moon2003.textentry | check | status   | OK                                                |
+--------------------+-------+----------+---------------------------------------------------+
2 rows in set (0.11 sec)

mysql> repair table textentry;
+--------------------+--------+----------+----------+
| Table              | Op     | Msg_type | Msg_text |
+--------------------+--------+----------+----------+
| moon2003.textentry | repair | status   | OK       |
+--------------------+--------+----------+----------+
1 row in set (0.16 sec)

mysql> check table textentry;
+--------------------+-------+----------+---------------------------------------------------+
| Table              | Op    | Msg_type | Msg_text                                          |
+--------------------+-------+----------+---------------------------------------------------+
| moon2003.textentry | check | warning  | Size of indexfile is: 53248      Should be: 35840 |
| moon2003.textentry | check | status   | OK                                                |
+--------------------+-------+----------+---------------------------------------------------+
2 rows in set (0.09 sec)

mysql>

It's hard to tell specifically what triggered the crash (I don't
know where it would be logged) but the evidence is that the final
queries were a DELETE followed by several INSERTs that replaced
the deleted records.
[12 Sep 2003 10:51] Dave Dyer
Just a thought.  The assert that is failing is in my_seek, complaning
that the requested address is the "seek error" value.  There is no 
corresponding assert in my_get to see if that value is ever returned.

Without studying the rest of the code, it seems reasonable to me that
the most likely source for the value supplied to my_seek is my_get,
and if that is the case, perhaps the problem is a bug in the device
driver, or an inappropriate use of my_get
[15 Sep 2003 16:19] Dave Dyer
We've now seen a total of 4 instances of this problem on two different
systems which are essentially identical.   I've set up binary logging
on these systems so next time (ha - it sez here) we can try to reporduce
the problem by replaying the log.

One additional datum related to this problem is that indeces seem to be
corrupted after the crash, and using "repair tables" doesn't fully recover.

+------------------+-------+----------+----------+
| moon2003.picture | check | status   | OK       |
+------------------+-------+----------+----------+
1 row in set (0.11 sec)

mysql> check table placedimage;
+----------------------+-------+----------+-----------------------------------------------------+
| Table                | Op    | Msg_type | Msg_text                                            |
+----------------------+-------+----------+-----------------------------------------------------+
| moon2003.placedimage | check | warning  | Size of indexfile is: 237568      Should be: 236544 |
| moon2003.placedimage | check | status   | OK                                                  |
+----------------------+-------+----------+-----------------------------------------------------+
2 rows in set (0.11 sec)

mysql> repair table placedimage;
+----------------------+--------+----------+----------+
| Table                | Op     | Msg_type | Msg_text |
+----------------------+--------+----------+----------+
| moon2003.placedimage | repair | status   | OK       |
+----------------------+--------+----------+----------+
1 row in set (0.24 sec)

mysql> check table placedimage;
+----------------------+-------+----------+-----------------------------------------------------+
| Table                | Op    | Msg_type | Msg_text                                            |
+----------------------+-------+----------+-----------------------------------------------------+
| moon2003.placedimage | check | warning  | Size of indexfile is: 237568      Should be: 178176 |
| moon2003.placedimage | check | status   | OK                                                  |
+----------------------+-------+----------+-----------------------------------------------------+
2 rows in set (0.11 sec)

mysql> drop index number on placedimage;
Query OK, 13137 rows affected (0.56 sec)
Records: 13137  Duplicates: 0  Warnings: 0

mysql> check table placedimage;
+----------------------+-------+----------+----------+
| Table                | Op    | Msg_type | Msg_text |
+----------------------+-------+----------+----------+
| moon2003.placedimage | check | status   | OK       |
+----------------------+-------+----------+----------+
1 row in set (0.09 sec)

mysql> create index number on placedimage(number);
Query OK, 13137 rows affected (0.64 sec)
Records: 13137  Duplicates: 0  Warnings: 0

mysql> check table placedimage;
+----------------------+-------+----------+----------+
| Table                | Op    | Msg_type | Msg_text |
+----------------------+-------+----------+----------+
| moon2003.placedimage | check | status   | OK       |
+----------------------+-------+----------+----------+
1 row in set (0.13 sec)

mysql>
[16 Sep 2003 14:33] Dave Dyer
OK!! I've got a reproducable test case that starts with a clean
database, and ends with multiple corruption.  The amount of data
involved is only a few megabytes.

Can I send this data somewhere for your attention??

F:\mysql>mysql -u root moon2003
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 4.0.14-max-debug

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> check table textentry;
+--------------------+-------+----------+----------+
| Table              | Op    | Msg_type | Msg_text |
+--------------------+-------+----------+----------+
| moon2003.textentry | check | status   | OK       |
+--------------------+-------+----------+----------+
1 row in set (0.08 sec)

mysql> check table bkgdelem;
+-------------------+-------+----------+----------+
| Table             | Op    | Msg_type | Msg_text |
+-------------------+-------+----------+----------+
| moon2003.bkgdelem | check | status   | OK       |
+-------------------+-------+----------+----------+
1 row in set (0.01 sec)

mysql> check table placedimage;
+----------------------+-------+----------+----------+
| Table                | Op    | Msg_type | Msg_text |
+----------------------+-------+----------+----------+
| moon2003.placedimage | check | status   | OK       |
+----------------------+-------+----------+----------+
1 row in set (0.06 sec)

mysql> \q
Bye

F:\mysql>mysql -u root < moonshot3-update.txt
ERROR 1030 at line 109542: Got error 127 from table handler

F:\mysql>mysql -u root moon2003
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3 to server version: 4.0.14-max-debug

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> check table placedimage;
+----------------------+-------+----------+--------------------------------------------+
| Table                | Op    | Msg_type | Msg_text                                   |
+----------------------+-------+----------+--------------------------------------------+
| moon2003.placedimage | check | warning  | Table is marked as crashed                 |
| moon2003.placedimage | check | error    | Record at pos: 459720 is not remove-marked |
| moon2003.placedimage | check | error    | record delete-link-chain corrupted         |
| moon2003.placedimage | check | error    | Corrupt                                    |
+----------------------+-------+----------+--------------------------------------------+
4 rows in set (0.00 sec)

mysql> check table bkgdelem;
+-------------------+-------+----------+----------+
| Table             | Op    | Msg_type | Msg_text |
+-------------------+-------+----------+----------+
| moon2003.bkgdelem | check | status   | OK       |
+-------------------+-------+----------+----------+
1 row in set (0.03 sec)

mysql> check table textentry;
+--------------------+-------+----------+----------+
| Table              | Op    | Msg_type | Msg_text |
+--------------------+-------+----------+----------+
| moon2003.textentry | check | status   | OK       |
+--------------------+-------+----------+----------+
1 row in set (0.08 sec)

mysql>
[16 Sep 2003 14:51] Lenz Grimmer
Please try to reproduce this with 4.0.15 as well. If you still can reproduce it,
please upload your test case to the site mentioned in 

http://www.mysql.com/doc/en/Bug_reports.html

Also make sure that the file name of the archive refers to this bug number.
Thanks in advance! We appreciate your help in investigating this.
[17 Sep 2003 15:57] Dave Dyer
The test case constructed with 4.0.14 no longer fails with 4.0.15,
so the omens are good.  It's possible that 4.0.15 doesn't eliminate
the problem but only perturbs something enough to change it, so I
will upgrade our servers and continue monitoring them.
[18 Sep 2003 10:33] Sergei Golubchik
Still, "deleted link corrupted" looks exactly like the bug that was fixed in 4.0.15 (there was nothing similar for the very long time).

So, I'll close this bug, but if you'll experience this corruption again - don't hesitate to reopen it.