Bug #20595 MySQL 5.1.11 beta seems to drop connections under high load on Sun Fire
Submitted: 21 Jun 2006 9:00 Modified: 3 Aug 2006 21:35
Reporter: Arjen lastname
Status: Duplicate
Category: MySQL Server: InnoDB storage engine Severity: S2 (Serious)
Version: 5.1.11 beta OS: Solaris (Solaris 10 sparc)
Assigned to: Assigned Account

[21 Jun 2006 9:00] Arjen lastname
Description:
This problem is a bit difficult to explain and reproduce, so if anything is unclear, please ask me about it.

We are benchmarking a Sun Fire T2000 with an 8-core Niagara processor. As a benchmark we run a simplified version of our web application and use Apache + PHP to connect to MySQL. Using ab we then create a specific concurrent workload, e.g. ab -c 100 ... to create a workload with 100 concurrent processes running queries against the MySQL server on that T2000.

As it turns out, 4.1.20, 5.0.18 and 5.1.9 don't scale too well above 4 cores with 4 threads enabled, but they keep working.

5.1.11, however, seems to drop connections under high concurrency.
When scaling up the number of concurrent connections from 10 to 80, all seems well. But when 90 or 100 connections are throwing work at MySQL 5.1.11 for a relatively long time (10 minutes), it seems to drop connections. We couldn't reproduce it when we set it to run for just 90 seconds.

In total, in those 10 minutes, about 2200 connections were opened, of which 1500 were dropped, denied, or whatever exactly happened.

Both 5.1.9 and 5.1.11 used the same database files, configuration, etc.
4.1.20 and 5.0.18 used the same configuration, but other versions of the database files (although with the same content).

I'm not sure which other information would be helpful for you guys to fix this. Nor am I currently in a position to easily try other configurations, like Linux on the same T2000. Unfortunately the machine is in our possession for only a limited time, so we went back to 5.1.9 and didn't do much more investigation.

How to repeat:
As said in the description, I'm not sure how to reproduce it. It may be an issue triggered by our specific workload, OS, hardware, etc.

Suggested fix:
It appears that a change between 5.1.9 and 5.1.11 made things worse; I'd focus on that.
[22 Jun 2006 12:49] Valerii Kravchuk
Thank you for the problem report. Please send the my.cnf content used for testing.
[22 Jun 2006 12:51] Vadim Tkachenko
I assume you are using InnoDB tables?
[22 Jun 2006 13:01] Arjen lastname
For various reasons we use a mix of both MyISAM and InnoDB. Currently the separation is based on whether a table is read-only or read-write/write-only, where of course the latter are InnoDB.

I attached the my.cnf we used for testing.
[29 Jun 2006 12:42] Valerii Kravchuk
Please try setting innodb_thread_concurrency=2 (or even innodb_thread_concurrency=1) in 5.1.11, run your checks again, and let us know about any difference in results. Are connections still dropped in this case?

Alternatively, can you explain exactly what actions your application performs, or provide a test case?
[3 Jul 2006 16:52] Arjen lastname
I managed to forget to have a look at the error log...

Today I did, and saw that a lot of issues were reported. I'm not sure which are due to corrupt files and which are due to MySQL bugs, so I'll leave that up to you guys.

The strangest thing is that 5.1.9-beta also segfaulted a few times if I'm reading the log correctly, but we didn't notice that in our benchmarks.
[23 Jul 2006 10:23] Valerii Kravchuk
I found:

060620 12:52:08InnoDB: Assertion failure in thread 27 in file row0sel.c line 2378

at the very beginning of your error log (and several times later). Because the failed assertion is the same, I'll mark this report as a duplicate of bug #20213, which is already verified.
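
To see at a glance which assertion sites recur in an error log like this one, the failures can be grouped by source file and line. The following is a minimal sketch, not a MySQL tool: the regular expression is an assumption derived only from the log lines quoted in this report, and `count_assertions` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this report (an assumption,
# not an official log format specification), e.g.:
# 060620 12:52:08InnoDB: Assertion failure in thread 27 in file row0sel.c line 2378
ASSERT_RE = re.compile(
    r"InnoDB: Assertion failure in thread (\d+) in file (\S+) line (\d+)"
)


def count_assertions(lines):
    """Count assertion failures per (source file, line number) site."""
    counts = Counter()
    for line in lines:
        match = ASSERT_RE.search(line)
        if match:
            counts[(match.group(2), int(match.group(3)))] += 1
    return counts


# Sample input: the three assertion lines quoted in this bug report.
sample = [
    "060620 12:52:08InnoDB: Assertion failure in thread 27 in file row0sel.c line 2378",
    "060620 13:44:29InnoDB: Assertion failure in thread 9 in file ../include/buf0buf.ic line 624",
    "060620 16:06:52InnoDB: Assertion failure in thread 18 in file row0sel.c line 2378",
]

for (source_file, line_no), hits in count_assertions(sample).most_common():
    print(f"{source_file}:{line_no} seen {hits} time(s)")
```

For a real log, pass an open file object instead of `sample`; the function only iterates over lines.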
[29 Jul 2006 5:57] Heikki Tuuri
I reopened this bug report since there are also other types of assertion failures.

Here InnoDB is trying to store the row id field from a record that looks like a secondary index record:

InnoDB: Error: Row id field is wrong length 4294967295 in index `NdaID` of table `tweakers/benchdb_testcombos`
InnoDB: Field number 18446744073709551615, record:
PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 2; hex 0000; asc   ;; 1: len 2; hex 01c9; asc   ;;

This looks like memory corruption:

060620 13:44:29InnoDB: Assertion failure in thread 9 in file ../include/buf0buf.ic line 624
InnoDB: Failing assertion: block->state == BUF_BLOCK_FILE_PAGE
InnoDB: We intentionally generate a memory trap.

This looks like the other bug report:

InnoDB: Error: Row id field is wrong length 4294967295 in index `PRIMARY` of table `tweakers/va_advertenties`
InnoDB: Field number 18446744073709551615, record:
[deleted]
060620 16:06:52InnoDB: Assertion failure in thread 18 in file row0sel.c line 2378
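
A side note on the numbers in the "Row id field is wrong length" messages: 4294967295 and 18446744073709551615 are the all-ones 32-bit and 64-bit patterns, i.e. -1 when read as signed integers. The sketch below only checks that arithmetic. Reading them as an "undefined" or "not found" marker rather than a real length or field number is an assumption, not something stated in the log, and the helper names are hypothetical.

```python
def all_ones(bits):
    """Largest unsigned value that fits in the given number of bits."""
    return (1 << bits) - 1


def as_signed(value, bits):
    """Reinterpret an unsigned value as a two's-complement signed integer."""
    if value >> (bits - 1):  # top bit set means negative
        return value - (1 << bits)
    return value


# Values copied from the error messages quoted above.
length_reported = 4294967295
field_number_reported = 18446744073709551615

print(length_reported == all_ones(32))        # True
print(field_number_reported == all_ones(64))  # True
print(as_signed(length_reported, 32))         # -1
print(as_signed(field_number_reported, 64))   # -1
```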
[29 Jul 2006 9:59] Heikki Tuuri
Hi!

These stress test bug reports suggest that some serious bug was introduced in 5.1.11:

http://bugs.mysql.com/bug.php?id=20213
http://bugs.mysql.com/bug.php?id=20337
http://bugs.mysql.com/bug.php?id=20595
http://bugs.mysql.com/bug.php?id=21322

Can you test if 5.1.9 exhibits the same crashes?

Regards,

Heikki
[29 Jul 2006 10:21] Arjen lastname
5.1.9 doesn't display any of the crashes we noticed in 5.1.11.

Actually, it ran through our testing/benchmarking just as well as 4.1.20 and 5.0.18/5.0.20a (except for performance differences, of course).

Unfortunately we no longer have the T2000 machine in our possession, so we can't do more investigation.
[30 Jul 2006 16:00] Heikki Tuuri
Ok, thank you. Then it is probable that the large patches between 5.1.9 and 5.1.11 introduced some serious bug.
[3 Aug 2006 21:28] Heikki Tuuri
Probably a duplicate of http://bugs.mysql.com/bug.php?id=20213