Bug #26433 Falcon: crash in procedure after error 1025
Submitted: 16 Feb 2007 0:09 Modified: 20 May 2007 5:16
Reporter: Peter Gulutzan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:5.2.4-falcon-alpha-debug OS:Linux (SUSE 10.0 / 64-bit)
Assigned to: Kevin Lewis CPU Architecture:Any

[16 Feb 2007 0:09] Peter Gulutzan
Description:
I create a procedure which creates a table,
inserts 20 rows, then goes into a loop,
updating the rows randomly and altering
character sets. After many iterations,
I see a crash.

The procedure contains a handler ("declare
continue handler for sqlexception ...").
This is necessary to skip over occurrences of
Error 1025 "Error on rename". The occurrences
of Error 1025 are also a bug, but that's not
what I'm reporting here, since I think Error
1025 is already covered by an earlier bug
report, Bug #22155 Double rename failure.

How to repeat:
Create a procedure, and call it, thus:

delimiter //
drop table tq//
drop procedure p2//
create procedure p2 ()
begin
  declare v int default 0;
  declare v10000 int;
  declare v999 int;
  declare v255 int;
  declare v9999 int;
  declare continue handler for sqlexception
  begin
    declare vx int;
    select 'error, probably 1025';
    drop table tq;
    select '1';
    create table tq (
    s0 int, s1 varchar(1000) character set latin1,
    s2 varchar(1000) character set latin2)
    engine=falcon;
    set vx = 0;
    while vx < 20 do
      insert into tq values (vx,null,null);
      set vx = vx + 1;
      end while;
    select '2';
    create index itq1 on tq (s1,s0);
    create index itq2 on tq (s2,s0);
    select '3';
    end;
  select 'a';
  create table tq (s0 int, s1 varchar(1000) character set latin1, s2 varchar(1000) character set latin2) engine=falcon;
  select 'b';
  set v = 0;
  while v < 20 do
    insert into tq values (v,null,null);
    set v = v + 1;
    end while;
  select 'c';
  create index itq1 on tq (s1,s0);
  create index itq2 on tq (s2,s0);
  select 'd';
  set v = 0;
  while v < 100000 do
    set v10000 = rand()*10000;
    set v999 = rand()*999;
    set v255 = rand()*255;
    set v9999 = rand()*9999;
    set @x = concat('update tq set s1 = repeat(0x', hex(v10000),',',v999,'), s0 =', v,', s2 = repeat(0x', hex(v255), ',',v9999,')');
    select v,@x;
    prepare stmt1 from @x;
    execute stmt1;
    if v mod 2 = 0 then
      alter table tq modify column s1 varchar(1000) character set latin2;
      alter table tq modify column s2 varchar(1000) character set latin1;
    else
      alter table tq modify column s1 varchar(1000) character set latin1;
      alter table tq modify column s2 varchar(1000) character set latin2;
      end if;
    set v = v + 1;
    end while;
  end//
call p2()//

Usually on my machine it won't run for more than
a thousand iterations.
[16 Feb 2007 1:54] Hakan Küçükyılmaz
Verified as described. The loop count on my machine could not even get to 1000 before crashing

+------+---------------------------------------------------------------------+
| v    | @x                                                                  |
+------+---------------------------------------------------------------------+
|  969 | update t1 set b = repeat(0x1096,481), a
=969, c = repeat(0x22,2294) |
+------+---------------------------------------------------------------------+
1 row in set (9 min 17.00 sec)

+----------------------+
| error, probably 1025 |
+----------------------+
| error, probably 1025 |
+----------------------+
1 row in set (9 min 17.00 sec)

+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (9 min 17.02 sec)

+---+
| 2 |
+---+
| 2 |
+---+
1 row in set (9 min 17.07 sec)

+---+
| 3 |
+---+
| 3 |
+---+
1 row in set (9 min 17.36 sec)

ERROR 2013 (HY000): Lost connection to MySQL server during query
5.2.4-falcon-alpha-debug

Best regards, Hakan
[16 Feb 2007 2:05] Hakan Küçükyılmaz
Added test case falcon_bug_26433.test and pushed to mysql-5.1-falcon tree.

Backtrace is:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1098242992 (LWP 20927)]
0x08426d60 in Transaction::addRecord (this=0x4050d4e8, record=0x4059adec) at Transaction.cpp:300
300             *recordPtr = record;
(gdb) bt
#0  0x08426d60 in Transaction::addRecord (this=0x4050d4e8, record=0x4059adec)
    at Transaction.cpp:300
#1  0x0842229b in Table::update (this=0x404ffba4, transaction=0x4050d4e8, oldRecord=0x406dc0cc,
    numberFields=1, updateFields=0x4052aaec, values=0x40510794) at Table.cpp:1005
#2  0x0848412d in NUpdate::evalStatement (this=0x4050f324, statement=0x404fc744) at NUpdate.cpp:143
#3  0x084befd2 in Nfs::Statement::start (this=0x404fc744, node=0x4050f324) at Statement.cpp:449
#4  0x08486a95 in PreparedStatement::executeUpdate (this=0x404fc744) at PreparedStatement.cpp:86
#5  0x08465394 in Index::rename (this=0x40529ad0, newName=0x41759c70 "#SQL2-51B9-2$1")
    at Index.cpp:654
#6  0x08412097 in StorageDatabase::renameTable (this=0x404fa5d4, storageConnection=0x404fa514,
    table=0x40524798, tableName=0x41759ed4 "#SQL2-51B9-2", schemaName=0x41759dd4 "TEST")
    at StorageDatabase.cpp:549
#7  0x084172c9 in StorageTableShare::renameTable (this=0x43341a9c, storageConnection=0x404fa514,
    newName=0x4175a44f "./test/#sql2-51b9-2") at StorageTableShare.cpp:170
#8  0x0840b63d in NfsStorageTable::rename_table (this=0x89c8eb8, from=0x4175a64f "./test/t1",
    to=0x4175a44f "./test/#sql2-51b9-2") at ha_falcon.cpp:1087
#9  0x08370f5d in mysql_rename_table (base=0x893ae20, old_db=0x89c1318 "test",
    old_name=0x89c1148 "t1", new_db=0x89c1318 "test", new_name=0x4175b888 "#sql2-51b9-2", flags=2)
    at sql_table.cc:3708
#10 0x08377147 in mysql_alter_table (thd=0x896a470, new_db=0x89c1318 "test",
    new_name=0x89c1148 "t1", lex_create_info=0x89b989c, table_list=0x89c1170, fields=@0x89b97b4,
    keys=@0x89b97a8, order_num=0, order=0x0, ignore=false, alter_info=0x89b9a28, do_send_ok=true)
    at sql_table.cc:6525
#11 0x0827d9c3 in mysql_execute_command (thd=0x896a470) at sql_parse.cc:3213
#12 0x083dae1f in sp_instr_stmt::exec_core (this=0x89c13a8, thd=0x896a470, nextp=0x4175c444)
    at sp_head.cc:2576
#13 0x083dabd9 in sp_lex_keeper::reset_lex_and_exec_core (this=0x89c13d0, thd=0x896a470,
    nextp=0x4175c444, open_tables=false, instr=0x89c13a8) at sp_head.cc:2445
#14 0x083de9bd in sp_instr_stmt::execute (this=0x89c13a8, thd=0x896a470, nextp=0x4175c444)
    at sp_head.cc:2527
#15 0x083d88c9 in sp_head::execute (this=0x89943c0, thd=0x896a470) at sp_head.cc:1078
#16 0x083dd0e9 in sp_head::execute_procedure (this=0x89943c0, thd=0x896a470, args=0x896a9d0)
    at sp_head.cc:1711
#17 0x082822e1 in mysql_execute_command (thd=0x896a470) at sql_parse.cc:4698
#18 0x0828411c in mysql_parse (thd=0x896a470, inBuf=0x8996460 "call p1()", length=9)
    at sql_parse.cc:6162
#19 0x08284b27 in dispatch_command (command=COM_QUERY, thd=0x896a470, packet=0x898e401 "",
    packet_length=10) at sql_parse.cc:1857
#20 0x08285b8e in do_command (thd=0x896a470) at sql_parse.cc:1626
#21 0x0828603f in handle_one_connection (arg=0x896a470) at sql_parse.cc:1232
#22 0x40284297 in start_thread () from /lib/tls/libpthread.so.0
#23 0x4021937e in clone () from /lib/tls/libc.so.6
#24 0x4175dbb0 in ?? ()

(gdb) f 0
#0  0x08426d60 in Transaction::addRecord (this=0x4050d4e8, record=0x4059adec)
    at Transaction.cpp:300
300             *recordPtr = record;
(gdb) p *recordPtr
Cannot access memory at address 0xeeeeeeee
(gdb) p record
$3 = (class RecordVersion *) 0x4059adec
(gdb) 

Regards, Hakan
[19 Feb 2007 15:09] Hakan Küçükyılmaz
Modified test case to use a seed for rand() to get better reproducibility.

DROP TABLE IF EXISTS t1;
DROP PROCEDURE IF EXISTS p1;

# Seed for rand() to get more predictable code path.
SET @a = 707;

DELIMITER //
CREATE PROCEDURE p1 ()
BEGIN
  DECLARE v int default 0;
  DECLARE v1000 int;
  DECLARE v999  int;
  DECLARE v255  int;
  DECLARE v99   int;
  DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
  BEGIN
    DECLARE vx int;
    SELECT 'error, probably 1025';
    DROP TABLE t1;
    SELECT '1';
    CREATE TABLE t1 (
      a int,
      b varchar(500) character set latin1,
      c varchar(500) character set latin2
    );
    SET vx = 0;
    WHILE vx < 20 do
      INSERT INTO t1 VALUES (vx, null, null);
      SET vx = vx + 1;
    END WHILE;
    SELECT '2';
    CREATE INDEX i1 ON t1 (b, a);
    CREATE INDEX i2 ON t1 (c, a);
    SELECT '3';
  END;
  SELECT 'a';
  CREATE TABLE t1 (
    a int,
    b varchar(500) character set latin1,
    c varchar(500) character set latin2
  );
  SELECT 'b';
  SET v = 0;
  WHILE v < 20 do
    INSERT INTO t1 VALUES (v, null, null);
    SET v = v + 1;
  END WHILE;
  SELECT 'c';
  CREATE INDEX i1 ON t1 (b, a);
  CREATE INDEX i2 ON t1 (c, a);
  select 'd';
  SET v = 0;
  while v < 2500 do
    SET v1000 = rand(@a) * 1000;
    SET v999  = rand(@a) * 999;
    SET v255  = rand(@a) * 255;
    SET v99   = rand(@a) * 99;
    SET @x = concat('update t1 set b = repeat(0x', hex(v1000),',',v999,'), a =', v,', c = repeat(0x', hex(v255), ',',v99,')');
    /* SELECT v, @x; */
    PREPARE stmt1 FROM @x;
    EXECUTE stmt1;
    IF v mod 2 = 0 THEN
      ALTER TABLE t1 MODIFY COLUMN b varchar(500) character set latin2;
      ALTER TABLE t1 MODIFY COLUMN c varchar(500) character set latin1;
    ELSE
      ALTER TABLE t1 MODIFY COLUMN b varchar(500) character set latin1;
      ALTER TABLE t1 MODIFY COLUMN c varchar(500) character set latin2;
    END IF;
    SET v = v + 1;
  END WHILE;
END//

CALL p1()//
[14 Apr 2007 1:50] Hakan Küçükyılmaz
Does not crash anymore.

Regards,

Hakan
[15 Apr 2007 20:45] Peter Gulutzan
I tried again today with the latest build
ChangeSet@1.2598, 2007-04-15

It crashed after 1518 iterations.
[15 Apr 2007 21:02] Hakan Küçükyılmaz
Can't reproduce:

Test(s) which will be run though they are marked as disabled:
  falcon_bug_26433     : Bug#26433 2007-02-16 hakank Currently failing

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

falcon_bug_26433               [ pass ]         412027
-------------------------------------------------------
Stopping All Servers
All 1 tests were successful.
The servers where restarted 1 times
Spent 412.027 seconds actually executing testcases

Linux 64-bit, change set 1.2599, 2007-04-15 13:58:53-04:00, jas@fluffy.netfrastructure.com +1 -0
[16 Apr 2007 1:23] Peter Gulutzan
It's still crashing.

Linux 64-bit, change set 1.2599, 2007-04-15 13:58:53-04:00,
jas@fluffy.netfrastructure.com +1 -0

Perhaps someone could try to run the original script for
several hours.
[16 Apr 2007 8:23] Hakan Küçükyılmaz
Crashes now with random seed of 909.

Program received signal SIGSEGV, Segmentation fault.
Index::getDeferredIndex (this=0x2aaaaacfd418, transaction=0x2aaaaacbc4b8)
    at Index.cpp:222
222                             if ((deferredIndex->index == this) && (deferredI
ndex->virtualOffset == 0))
(gdb) bt
#0  Index::getDeferredIndex (this=0x2aaaaacfd418, transaction=0x2aaaaacbc4b8)
    at Index.cpp:222
#1  0x00000000007cba10 in Index::insert (this=0x2aaaaacfd418, key=0x4087d280, 
    recordNumber=18463, transaction=0x2aaaaacbc4b8) at Index.cpp:189
#2  0x00000000007cba7e in Index::update (this=0x2aaaaacfd418, 
    oldRecord=0x2aaaadb74210, record=0x2aaaadb75800, 
    transaction=0x2aaaaacbc4b8) at Index.cpp:499
#3  0x000000000079d46b in Table::update (this=0x2aaaaacfce38, 
    transaction=0x2aaaaacbc4b8, oldRecord=0x2aaaadb74210, numberFields=1, 
    updateFields=0x2aaaaad366b8, values=0x2aaaaacb7888) at Table.cpp:1021
#4  0x00000000007e27bb in NUpdate::evalStatement (this=0x2aaaaad400c8, 
    statement=0x2aaaaad48bc8) at NUpdate.cpp:143
#5  0x00000000008127ac in Nfs::Statement::start (this=0x2aaaaad48bc8, 
    node=0x2aaaaad400c8) at Statement.cpp:451
#6  0x00000000007e469d in PreparedStatement::executeUpdate (
    this=0x2aaaaacfd418) at PreparedStatement.cpp:86
#7  0x00000000007cc15e in Index::rename (this=0x2aaaaad4a678, 
    newName=0x4087dac0 "T1$0") at Index.cpp:700
#8  0x00000000007901f6 in StorageDatabase::renameTable (
    this=<value optimized out>, storageConnection=0x2aaaaacf6c08, 
    table=0x2aaaaad37e00, tableName=0x2aaaaacaf374 "T1", 
    schemaName=0x2aaaaad37bcc "TEST") at StorageDatabase.cpp:595

(gdb) f 0
#0  Index::getDeferredIndex (this=0x2aaaaacfd418, transaction=0x2aaaaacbc4b8)
    at Index.cpp:222
222                             if ((deferredIndex->index == this) && (deferredIndex->virtualOffset == 0))
(gdb) l
217                     {
218                     for (deferredIndex = transaction->deferredIndexes; 
219                              deferredIndex; 
220                              deferredIndex = deferredIndex->nextInTransaction)
221                             {
222                             if ((deferredIndex->index == this) && (deferredIndex->virtualOffset == 0))
223                                     break;
224                             }
225                     }
226
(gdb) p this
$1 = (Index * const) 0x2aaaaacfd418
(gdb) p deferredIndex
$2 = (DeferredIndex *) 0xeeeeeeeeeeeeeeee
(gdb)
[21 Apr 2007 4:08] Kevin Lewis
I finally figured out why the transaction was being removed in the middle of being used.  It was the system transaction and the scavenger thread was calling commitSystemTransaction while another thread was doing a renameTable().  Specifically, Index::rename() was not protected with a shared lock on Database::syncSysConnection.  The other two rename functions in renameTable, Table::rename() and Sequence::rename(), were being protected. Instead of adding that lock on Index::rename(), I put the lock up higher, in StorageDatabase::renameTable().  Then I deleted the lock from Table::rename() and Sequence::rename().  This allows the entire StorageDatabase::renameTable() to be done in a single transaction. 

In addition, I am using incrementing the Transaction::useCount when the transaction is assigned to a Statement object, and decrementing it when that object is destroyed.  Then at the bottom of Transaction::commit, where unused Transactions are destroyed, I don't allow the transaction to be destroyed until the useCount gets back down to 1.  In this way, the Transaction is not destroyed while there are any other references to it, at least from a Statement object.
[30 Apr 2007 10:20] Hakan Küçükyılmaz
Passes now with random seed of 909, too.

Best regards,

Hakan
[30 Apr 2007 10:25] Hakan Küçükyılmaz
Passes now with random seed of 909, too.

Best regards,

Hakan
[30 Apr 2007 14:08] MC Brown
A note has been added to the 6.0.0 changelog.
[2 May 2007 10:05] Hakan Küçükyılmaz
Peter,

I followed your advise and run the your initial test. It does not crash the server anymore but it get's slower and slower:

+-------+-------------------------------------------------------------------------+
| v     | @x                                                                      |
+-------+-------------------------------------------------------------------------+
| 38650 | update tq set s1 = repeat(0x2395,291), s0=38650, s2 = repeat(0xB9,7458) |
+-------+-------------------------------------------------------------------------+
1 row in set (1 day 1 hour 9 min 59.74 sec)

+-------+-------------------------------------------------------------------------+
| v     | @x                                                                      |
+-------+-------------------------------------------------------------------------+
| 38651 | update tq set s1 = repeat(0x15C6,549), s0=38651, s2 = repeat(0x13,7264) |
+-------+-------------------------------------------------------------------------+
1 row in set (1 day 1 hour 10 min 4.20 sec)

+-------+------------------------------------------------------------------------+
| v     | @x                                                                     |
+-------+------------------------------------------------------------------------+
| 38652 | update tq set s1 = repeat(0xFEC,858), s0=38652, s2 = repeat(0x12,7768) |
+-------+------------------------------------------------------------------------+
1 row in set (1 day 1 hour 10 min 8.67 sec)

+-------+------------------------------------------------------------------------+
| v     | @x                                                                     |
+-------+------------------------------------------------------------------------+
| 38653 | update tq set s1 = repeat(0x1A48,33), s0=38653, s2 = repeat(0x25,6356) |
+-------+------------------------------------------------------------------------+
Segmentation fault

After one day mysql client crashed.

If you think the performance or the client crash is a bug, feel free to open a new bug about it.

Best regards,

Hakan
[2 May 2007 18:59] Peter Gulutzan
I now encounter a 'recovery failure' problem using this procedure.

How to repeat:

Create procedure and call it as originally described.
Let it run for at least 5 seconds.

Elsewhere, find out the mysqld process number and kill it, e.g.
"
linux:/home/pgulutzan # ps -A | grep mysqld
12471 pts/1    00:00:02 mysqld
linux:/home/pgulutzan # kill -9 12471
"

Restart mysqld.

Restart mysql client. Say 'use (databasename)' and/or 'select * from tq;'.

I tried that 3 times, and got a crash 3 times.

There are already bugs about recovery failure, e.g. Bug#25017,
but I like this example because I think/hope it will be easy to reproduce.
[9 May 2007 16:44] Kevin Lewis
This bug has been fixed, mutated, and re-opened so many times I don't knwo what the problem is.  There seems to be a complaint about a performance problem and a recovery issue.  Please open individual bugs for each issue.
[20 May 2007 5:17] MC Brown
Already documented in the 6.0.0 changelog.