Bug #54894 Alter online table crashes mysqld process
Submitted: 29 Jun 2010 13:15
Modified: 1 Jul 2010 19:13
Reporter: Dirk Bonenkamp
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.1.3
OS: Linux (Ubuntu 10.04 (lucid))
Assigned to:
CPU Architecture: Any
Tags: alter online table, crash

[29 Jun 2010 13:15] Dirk Bonenkamp
Description:
My cluster:

1 management node
2 data / sql nodes

Desired result:
Alter a table (add a column)

Actual result:
mysqld crashes and the table is not altered

How to repeat:
mysql> show create table t1 \G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `userid` int(11) NOT NULL DEFAULT '0',
  `data` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`userid`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1
/*!50100 PARTITION BY KEY (userid) */
1 row in set (0.00 sec)

mysql> ALTER ONLINE TABLE t1 ADD COLUMN test VARCHAR(10);
ERROR 2013 (HY000): Lost connection to MySQL server during query

I tried different tables, data types, etc., all with the same result. Other people (on the MySQL forum) tried this and could not reproduce the crash. My cluster is a test setup; remote access can be arranged if desired.

Suggested fix:
I have no idea...
[29 Jun 2010 13:16] Dirk Bonenkamp
ndb_error_report output

Attachment: ndb_error_report_20100629151344.tar.bz2 (application/octet-stream, text), 99.57 KiB.

[30 Jun 2010 4:05] Jonas Oreland
Hi,

The error report did not contain much.

Could you:
1) add "ndb-extra-logging=99" to your my.cnf (i.e. for mysqld)
2) rerun the crashing ALTER
3) upload the error log from mysqld
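
A minimal sketch of step 1, assuming the option is placed under the [mysqld] group of my.cnf (the file's location varies by installation, and mysqld is assumed to be restarted afterwards so that the setting is read):

```ini
# my.cnf fragment (sketch); only the option line comes from the request above,
# the [mysqld] group placement is an assumption based on "for mysqld"
[mysqld]
ndb-extra-logging=99
```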

Also, did you compile this yourself, or is it a binary release?

/Jonas
[30 Jun 2010 6:29] Dirk Bonenkamp
Hi, 

I compiled it myself (i686); configure line:

CFLAGS="-O3" CXX=gcc CXXFLAGS="-O3 -felide-constructors -fno-exceptions -fno-rtti" ./configure --prefix=/usr --libexecdir=/usr/sbin --with-plugins=max --enable-assembler --with-mysqld-ldflags=-all-static

I'll upload the log right away.

Regards,

Dirk
[30 Jun 2010 6:31] Dirk Bonenkamp
Log file (only today)

Attachment: sql1.err (application/octet-stream, text), 30.45 KiB.

[30 Jun 2010 6:32] Jonas Oreland
Hi,

Which gcc version is used with "Linux (Ubuntu 10.04 (lucid))"?
We've seen problems with "very new" gcc versions and -O3 before.

Maybe try compiling with -O1 to see if the problem is indeed related to
the compilation itself...

/Jonas
[30 Jun 2010 6:36] Dirk Bonenkamp
Hi Jonas,

gcc-4.4.3, quite new. I'll recompile everything with -O1. Will post the results.

Thanks,

Dirk
[30 Jun 2010 7:50] Dirk Bonenkamp
Recompiled, installed, rebooted everything and tried again. No luck, mysqld still crashes. Log snippet below.

Dirk

100630  9:47:50 [ERROR] NDB: programming error, no lock taken while running query ALTER ONLINE TABLE t1 ADD COLUMN test VARCHAR(10). Message: ha_ndbcluster::alter_table_phase1
100630  9:47:50 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=1
max_threads=151
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 345919 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x9ab30d8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xb42b11dc thread_stack 0x30000
[0x840d974]
[0x80e8bb3]
[0xb7855400]
[0x84778f3]
[0x826209c]
[0x8256805]
[0x81cb72a]
[0x81d50d0]
[0x80fc500]
[0x80fcd66]
[0x80fd941]
[0x80fe7a7]
[0x80f070e]
[0x82748c0]
[0x849489e]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x9ad2cb0 = ALTER ONLINE TABLE t1 ADD COLUMN test VARCHAR(10)
thd->thread_id=4
thd->killed=NOT_KILLED
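
The backtrace above contains only raw addresses. One way to make it readable is to collect those addresses into a file and feed them to MySQL's resolve_stack_dump utility together with a symbol file. The following is a minimal sketch of the extraction step only; the function names and the sample text are hypothetical, and the one-address-per-line output is an assumption about what resolve_stack_dump accepts.

```python
import re

# Matches a backtrace frame line such as "[0x840d974]" in a mysqld error log.
FRAME_RE = re.compile(r"^\[(0x[0-9a-fA-F]+)\]\s*$")


def extract_frames(log_text):
    """Return the hex addresses of all bracketed frame lines, in log order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    return frames


def write_numeric_dump(frames, path):
    """Write one address per line (assumed input shape for resolve_stack_dump)."""
    with open(path, "w") as handle:
        handle.write("\n".join(frames) + "\n")


# Shortened sample taken from the log above; a real run would read sql1.err.
sample = """stack_bottom = 0xb42b11dc thread_stack 0x30000
[0x840d974]
[0x80e8bb3]
[0xb7855400]
Trying to get some variables.
"""

frames = extract_frames(sample)
print("\n".join(frames))
```

With the addresses saved (for example via write_numeric_dump(frames, "mysqld.stack")), a symbol file can be produced with "nm -n /usr/sbin/mysqld > mysqld.sym" and the trace resolved with "resolve_stack_dump -s mysqld.sym -n mysqld.stack". The binary path is assumed from the --libexecdir setting in the configure line, and this only works if the mysqld binary has not been stripped.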
[1 Jul 2010 19:05] Dirk Bonenkamp
Hi,

I've just downloaded, compiled (with the normal -O3) and installed 7.1.4b. 

Problem is gone now!

Regards,

Dirk
[1 Jul 2010 19:13] Dirk Bonenkamp
status -> closed