MySQL Bugs: #25506: MySQL + LARGE innodb = thrashing HDD

Bug #25506	MySQL + LARGE innodb = thrashing HDD
Submitted:	9 Jan 2007 22:43	Modified:	20 Feb 2007 14:15
Reporter:	Daniel Fiske	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S5 (Performance)
Version:	5.0.27	OS:	Windows (Win XP Prof)
Assigned to:	Heikki Tuuri	CPU Architecture:	Any
Tags:	file i/o, innodb, ntfs

Description:
I have the following scenario.

A mysql database running 3 databases. It is version 5.0.27 on Windows XP Prof.. All innodb databases. The one database is particularly large (7.8GB of data)...pretty much held in 1 table....there are probabably 30 tables in the rest of the databases....combined they probably take up 200MB. The machine is pretty well spec'ed AMD X2 4600+, 2GB RAM, SATA RAID1. Normally the services that use the databases are idle until our clients come online. Then it gets moderately busy. At the start of this a single (possibly 2) query will "hang"....it will take 8-10 minutes to complete....it is always a insert or update. During this time MySQL will write to the error log saying innodb semaphores are timing out (but that makes sense). After this time the query completes and the system runs normally.....running the same queries that it got "stuck" on for a while, but this time taking microseconds to complete.

I've done some profiling using perfmon. During this "hang" the IO writes byte for the mysql process goes from about <1MB per second to > 12MB per second. I then further ran some dianostics to see where it was writing data using filemon...and it seems to be writing a ton to c:\$logfile, which I understand is part of the NTFS transactional system. I thought it might be the swap file, but I disabled this (set it to 0MB) and the problem persists.

I have these servers set as masters for replication, so it is using binary logging.

Can anyone think why this might happen?

How to repeat:
N/A

Suggested fix:
N/A

Daniel,
I can confirm observing the same behaviour (since years) with our (simple) setup:

1 database,
5 InnoDB-tables, one of them consisting basically of blob-data. 
Seperate tablespace-files. 
5-15 simultaneous connections.

As soon as the critical "blob-only"-table touches the *magic* 7-8GB limit, the periodical trashing with heavy i/o-load starts - exactly as described by you.
The server is effectively not responsive during this period, the HD working under full-load and therefor negatively influencing other processes running on the server.

Identical setup, just with the large blob table configured as myisam and it rocks.
Current filesize: 46 GB, no remarkable performance-loss so far, occasional, well-balanced hd-access.

Do think this might be an OS issue? i.e. running on *nix might solve it? I'm got a slightly tweaked config (using some of the more obscure options) which I'm gonna try (I'll let you know if it helps). 

I was considering also implementing some agressive archiving on the replicated database and then deleting from the master to keep INNODB file size down.

D.

Thank you for a problem report. Please, send your my.ini file content, your error log, and the results of SHOW INNODB STATUS next time when you'll see this problem.

Originally the problem appeared on 5.0.26 and then I upgraded to 5.0.27 and it persisted. Here is the INNODB STATUS as printed out into the error log from 5.0.26.

=====================================
070103 19:05:29 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 46 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 631, signal count 630
--Thread 7456 has waited at .\btr\btr0cur.c line 424 for 257.00 seconds the semaphore:
X-lock on RW-latch at 2695B800 created in file .\buf\buf0buf.c line 491
a writer (thread id 7456) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1
Last time read locked in file .\buf\buf0flu.c line 576
Last time write locked in file .\btr\btr0cur.c line 3443
Mutex spin waits 552, rounds 1082, OS waits 10
RW-shared spins 1235, OS waits 615; RW-excl spins 2, OS waits 2
------------
TRANSACTIONS
------------
Trx id counter 0 28075431
Purge done for trx's n:o < 0 28061942 undo n:o < 0 0
History list length 121
Total number of lock structs in row lock hash table 1
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 28075429, not started, OS thread id 6436
MySQL thread id 987, query id 610316 localhost 127.0.0.1 a_server
---TRANSACTION 0 28075428, not started, OS thread id 7816
MySQL thread id 70, query id 610314 localhost 127.0.0.1 a_server
---TRANSACTION 0 28075427, not started, OS thread id 4364
MySQL thread id 3, query id 610309 localhost 127.0.0.1 t_server
---TRANSACTION 0 28075430, not started, OS thread id 3020
MySQL thread id 2, query id 610320 localhost 127.0.0.1 a_server
---TRANSACTION 0 28063191, ACTIVE 257 sec, OS thread id 7456 inserting, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
3 lock struct(s), heap size 320, undo log entries 1
MySQL thread id 1798, query id 560284 localhost 127.0.0.1 acs_server update
insert into XYZ....[Cut for privacy]
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (write thread)
Pending normal aio reads: 0, aio writes: 1,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
1623 OS file reads, 4482 OS file writes, 3716 OS fsyncs
0.70 reads/s, 16384 avg bytes/read, 0.61 writes/s, 0.61 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2,
132 inserts, 52 merged recs, 17 merges
Hash table size 2212699, used cells 21939, node heap has 23 buffer(s)
926.83 hash searches/s, 236.47 non-hash searches/s
---
LOG
---
Log sequence number 1 2924843445
Log flushed up to   1 2924843435
Last checkpoint at  1 2924703107
0 pending log writes, 0 pending chkp writes
2271 log i/o's done, 0.61 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 582279544; in additional pool allocated 998400
Buffer pool size   32768
Free buffers       31178
Database pages     1567
Modified db pages  93
Pending reads 0
Pending writes: LRU 0, flush list 7, single page 0
Pages read 1558, created 9, written 1594
0.67 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
1 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread id 3452, state: flushing buffer pool pages
Number of rows inserted 502, updated 276, deleted 164, read 9845909
3.26 inserts/s, 0.52 updates/s, 2.93 deletes/s, 2962.26 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Warning: a long semaphore wait:
--Thread 7456 has waited at .\btr\btr0cur.c line 424 for 273.00 seconds the semaphore:
X-lock on RW-latch at 2695B800 created in file .\buf\buf0buf.c line 491
a writer (thread id 7456) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1
Last time read locked in file .\buf\buf0flu.c line 576
Last time write locked in file .\btr\btr0cur.c line 3443
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0

In the last 24 hours I have changed my.ini and it appears to have improved. It is still early, but here are the changes I made in the format variable=before/after; NA means was not previously specified (Default applies)

binlog_cache_size=NA/1M
innodb_additional_mem_pool_size=10M/16M
innodb_buffer_pool_size=512M/512M
innodb_file_io_threads=NA(def4)/8
innodb_flush_log_at_trx_commit=1/1
innodb_log_buffer_size=5M/8M
innodb_log_file_size=256M/256M
innodb_thead_concurrency=4/8
join_buffer_size=NA/8M
key_buffer_size=13M/16M
log_long_format=NA/<enabled>
long_query_time=NA/4
max_allowed_packet=NA/2M
max_connections=800/25
max_heap_table_size=NA/64M (uses min(max_heap_table_size,tmp_table_size) anyway)
myisam_max_extra_sort_file_size=100G/removed(deprecated)
myisam_max_sort_file_size=100G/2G
myisam_sort_buffer_size=29M/32M
query_cache_size=81M/64M
read_buffer_size=64K/1M
read_rnd_buffer_size=256K/2M
sort_buffer_size=256K/1M
table_cache=1520/1024
thread_cache_size=38/8
thread_concurrency=NA/8
thread_stack=NA/128K
tmp_table_size=29M/64M
innodb_autoextend_increment=NA(def:8M)/16M

I know many of these setting are MyISAM only....I just wanted to get them more in line anyway.

If these changes have fixed anything I'd hazard a guess that its either 

innodb_thead_concurrency OR
innodb_file_io_threads OR BOTH....

Increasing innodb_file_io_threads seems likely because it gives additional R/W File IO threads to InnoDB.

Thoughts?

Anyway, I'll keep you posted if it fixes it for an extended time.

D.

Daniel,

are there some tuning parameters for NTFS file system in Windows? I think I have seen the same semi-freeze of Windows a few years back.

InnoDB probably cannot do much if it is the logging process of the file system monopolizes the computer for several minutes.

Could this be hardware-related?

Regards,

Heikki

hello heikki,
the disk access to $LogFile occurs with these I/O-attributes:
Non-cached, Paging I/O, Synchronous Paging

So I assume it is caused by heavy access to paging-file.

Hardware related?
Can't imagine. The problem showed up on several systems with different RAM and harddisk configs.

On a test-system here, mysqld-nt-max regularly triggers a GPF after a few minutes of hd-thrashing. (see attached logfiles).

To me, it seems to be more filesize-related. Also OPTIMIZE TABLE delays the occurance of the next "trashing-cycle" a bit.

Err-logs:
http://aegaeon.de/downloads/logs/

Armin,

the crash is expected: InnoDB noticed that it has probably hung and intentionally generates a seg fault.

During the thrashing, does the Task Manager Performance sheet show Commit Charge bigger than the physical memory of the computer? In that case this might indeed be virtual memory paging. Does the Processes sheet show some processes with a big VM Size?

You could make innodb_buffer_pool_size much smaller if you are actually using just 24 MB of it, as shown in Daniel Fiske's printout:

Buffer pool size   32768
Free buffers       31178
Database pages     1567

Regards,

Heikki

I have both Filemon (shows masses of writes to $LogFile) and perfmon (shows VERY high Write IO) Logs (LARGE)....if you are interested. I had switched off my system swap file and problem persisted.

D.

Heikki,

I unfortunately I wasn't tracking Commit Charge, during the perfmon, however I did have taskmanager open and it never ran over 1.2GB/2GB (...this is with swap turned off).

Just to clarify something you said 

"You could make innodb_buffer_pool_size much smaller if you are actually using
just 24 MB of it"

Is this calculated by

(Total memory allocated/Buffer pool size)*(Buffer pool size-Free buffers)

D.

FYI:48 hours with new config and no problems.

Daniel,

Buffer pool size   32768
Free buffers       31178

the above means that only 1590 pages of the buffer pool are used. One page is 16 kB. 16 kB * 1590 = 24.84 MB.

Though if the problem is in the NTFS file system, changing the buffer pool size may have no effect.

####

Daniel I am no expert on Filemon and perfmon. You could Google the web and the newsgroups to find out if this problem is widely known.

I am putting this to the 'Can't repeat' status. Let us keep this "NTFS thrashing" in mind, if someone else bumps into this. I think I have not got other reports of an extreme OS freeze.

Regards,

Heikki

Heikki,
commit charge at trash-time is 286M /1228M
physical mem is 512M

In contrary to daniels setup, my logs show another situation concerning buffer pool size:
Buffer pool size   2048
Free buffers       0
Database pages     2047
Modified db pages  83

I switched now to mysqld-debug and increased buffer-pool to 128M.
Additionally all other mem-consuming processes on the server-host are stopped.

I'll take a look into the debugger, when the incident appears next.
If you're interested, I could also enable tcp/ip-remote-debugging on this dedicated host.

Regards,
Armin.

Armin,

yes, please post here all the relevant information you have. These OS-specific problems often take years to diagnose (and usually resolve before we find the cause). The more information we have the better.

Regards,

Heikki

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

UPDATE:

After my configuration changes I have not had this problem and my Innodb file size is now over 14GB....

I would guess that there may be a locking issue with innodb and windows and that changing 

innodb_thead_concurrency OR
innodb_file_io_threads OR BOTH....

will help on Windows systems.

....for anyone encountering the problem, I suggest looking at the changes I made to my config file and trying to apply to their config.

D.

Daniel,

I wonder how changing the number of I/O threads or thread concurrency can affect "NTFS thrashing". But it is possible: anything can make the OS kernel to behave in a slightly different way.

I am putting this bug report to the state 'Can't repeat' as we wait for more reports of NTFS thrashing...

Regards,

Heikki

Hello all,

tuning thread-concurrency or i/o thread-count had no influence with my setup,
but switching from NTFS to raw disk partitions (non-buffered I/O ) did (trashing disappeared, whereas overall mysql-performance was not very satisfying).

I also saw a positive influence (trashing appeared later at higher file-sizes) when manually increasing the NTFS log-file (tried with double the file-system default).

A few debugging-sessions with mysqld-debug showed no locking or thread-serialization issues during the trashing periods.

Regards,
Armin.

I had a similar problem:
Windows XP + InnoDB --> HDD thrashing.

After some reading and trying I found the reason for my problems:
I had to deactivate the service "C:\WINDOWS\system32\cisvc.exe" (in German "Indexdienst").

I hope this helps someone.

See bug: http://bugs.mysql.com/bug.php?id=48192.