Bug #3139 | Mysql crashes: "windows error 995" after several selects on a large DB | ||
---|---|---|---|
Submitted: | 10 Mar 2004 21:10 | Modified: | 18 Jun 2010 2:01 |
Reporter: | Alexey Skorokhodov | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 4.1.19, 4.1.1 alpha, 5.0.27, 5.0.44, 5.4.4 | OS: | Windows (Windows 2000/XP/2003) |
Assigned to: | Satya B | CPU Architecture: | Any |
[10 Mar 2004 21:10]
Alexey Skorokhodov
[11 Mar 2004 7:56]
Heikki Tuuri
Hi!
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/debug/base/system_error_c...
995 The I/O operation has been aborted because of either a thread exit or an application request. ERROR_OPERATION_ABORTED
I am not sure, but I think someone else has reported the same error. My first guess is a bug in Windows/drivers/hardware.
If you tune:
"innodb_file_io_threads Number of file i/o threads in InnoDB. Normally, this should be 4, but on Windows disk i/o may benefit from a larger number. Numeric my.cnf parameter format."
in my.cnf or my.ini, does it have any effect on the problem?
Regards, Heikki
[15 Mar 2004 22:02]
Alexey Skorokhodov
I have added "innodb_file_io_threads=20" to my "my.cnf" file. Mysql still crashes when executing those queries. :-(
[16 Mar 2004 5:32]
Heikki Tuuri
Hi! Try making the value smaller. But this is probably a bug in the OS/drivers/hardware. Regards, Heikki
[18 Mar 2004 21:29]
Alexey Skorokhodov
I updated the value to 8. The problem still exists. I have checked the hard drive with the Windows HDD checker and found no errors. I have no antivirus scanners or other software that could interrupt MySQL's requests to the HDD. How can I diagnose and solve the problem in this case? This only happens with MySQL, not with any other program.
[18 Mar 2004 21:33]
Alexey Skorokhodov
I'm updating the "Synopsis" field
[25 Mar 2004 17:30]
MySQL Verification Team
I created an InnoDB table and filled it with about 2 million rows, but was unable to reproduce the reported behavior.
[29 Mar 2004 0:47]
Alexey Skorokhodov
I'm unable to reproduce the bug on another computer with the same DB installed. Looks like a bug in my Windows installation, as Heikki said. I don't know what to do with my MySQL installation - I'll try to delete INNODB data files and to recreate them. I hope this will help. Thanks for your support and time!
[29 Mar 2004 6:21]
MySQL Verification Team
Closed according to your last post. Thank you for letting us know.
[17 Sep 2004 12:37]
Heikki Tuuri
Hi! A couple of weeks ago another user reported error number 995 in Windows. Still it looks like an OS bug, or faulty hardware. Regards, Heikki
[3 Jul 2006 16:36]
Ken Hanks
Can someone tell me why this bug #3139 was closed? As of July 2006 I'm still encountering it on a daily basis and I see no resolution here on this website. Is the general consensus that it is an OS bug in Windows? If so, is there a workaround??
[20 Sep 2006 11:52]
Heikki Tuuri
In some cases, the cause of the problem might be a SAN storage system whose command queue is too small for the load.
[12 Oct 2007 15:07]
Aleksey Karyakin
Hi! I've run into the same problem, and there are other reports of the issue around the internet. I've read that the problem is thought to be related to Windows or hardware, but it looks like the cause is inside InnoDB.
Windows error number 995 (ERROR_OPERATION_ABORTED) indicates that an async I/O operation was cancelled. Besides "normal" I/O cancellation (with a CancelIo call), Windows automatically cancels any pending I/O requests when the thread that issued them exits. In my case I run many short queries with mysql.exe against a large database, so clients connect and disconnect frequently. Every connection forces MySQL to allocate a separate thread that lives for as long as the connection is open. With certain timing it is possible for a client to disconnect before all issued write operations have been completed by the Windows I/O subsystem. As it is a timing issue, I'm not sure a reliable test case can be created.
To see whether the problem in fact exists, I added simple tracing to the code that creates and closes threads/connections and to the place where the I/O error is handled, and it actually happens that a connection is closed _before_ I/O completion. If I add an os_aio_wait_until_no_pending_writes() call in the innobase_close_connection() function, the problem disappears.
The simplest fix would be just adding os_aio_wait_until_no_pending_writes() as I did; however, that could cause performance problems in a high-load scenario because _every_ connection would wait for the I/O requests of every other connection, thus causing extra I/O serialization. I suggest tracking the number of outstanding write requests per thread, either using thread-local storage or by providing a thread-related object (e.g. ) to the I/O routines. On connection termination, a worker thread would then wait until all of its I/O requests have been processed.
I'm using version 5.0.45 on Windows XP. Should the bug be reopened?
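The per-thread tracking proposed above can be illustrated with a minimal sketch. This is not InnoDB code: it is a Python simulation, and every name in it (`PendingWrites`, `fake_async_write`, `connection_thread`, `run_demo`) is hypothetical. It only shows the idea of counting the writes a connection thread has issued and draining that counter before the thread is allowed to exit.

```python
import threading
import time


class PendingWrites:
    """Counts writes issued by one connection thread that have not completed yet."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def issued(self):
        with self._cond:
            self._count += 1

    def completed(self):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait_until_idle(self, timeout=None):
        """Block until every issued write has completed; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout)


def fake_async_write(pending, log, delay=0.05):
    """Stand-in for an asynchronous write: it completes on a helper thread after `delay`."""
    pending.issued()

    def finish():
        time.sleep(delay)
        log.append("write completed")
        pending.completed()

    threading.Thread(target=finish).start()


def connection_thread(log):
    pending = PendingWrites()        # one counter per connection thread (assumption)
    for _ in range(3):
        fake_async_write(pending, log)
    pending.wait_until_idle()        # drain this thread's writes before it exits
    log.append("connection closed")


def run_demo():
    log = []
    worker = threading.Thread(target=connection_thread, args=(log,))
    worker.start()
    worker.join()
    return log
```

Because each connection waits only on its own counter, one connection closing does not serialize behind the writes of every other connection, which is the drawback described for the global wait.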
[12 Oct 2007 15:21]
Aleksey Karyakin
Please disregard what I wrote about os_aio_wait_until_no_pending_writes(): it didn't help. The problem actually disappeared only when I added Sleep(1000).
[11 Aug 2008 2:22]
gao zhiyg
I encountered this problem too. The MySQL version is 5.0.45-community. The Windows version is Windows Server 2003 with Service Pack 2.
[24 Jun 2009 15:16]
Vadim TKACHENKO
We observed the same problem with MySQL 5.1.30 on Windows 2003. I do not think it is a hardware issue, as the problem was the same on two different (but identically configured) boxes.
[9 Oct 2009 13:55]
James Day
I've reopened this bug because we now seem to be able to reproduce the problem.
[15 Oct 2009 15:31]
Elena Stepanova
The problem is being observed periodically in our tests.
[2 Nov 2009 14:42]
James Day
Internal testing of a possible workaround fix for this is happening. No ETA at present.
[11 Nov 2009 11:33]
samir pathan
Hi All, I am facing the same ERROR "InnoDB: Operating system error number 995 in a file operation."? Is this issue resolved? I found no solution on the site.
=======
091111 14:46:59 InnoDB: Operating system error number 995 in a file operation.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/operating-system-error-codes.html
InnoDB: File name .\ibdata1
InnoDB: File operation call: 'Windows aio'.
InnoDB: Cannot continue operation.
InnoDB: Log scan progressed past the checkpoint lsn 0 476737329
091111 16:03:47 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 0 476737892
091111 16:03:47 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 757583, file name .\mysql-bin.000018
091111 16:03:48 InnoDB: Started; log sequence number 0 476737892
091111 16:03:48 [Note] Recovering after a crash using mysql-bin
091111 16:03:48 [Note] Starting crash recovery...
091111 16:03:48 [Note] Crash recovery finished.
091111 16:03:48 [Note] Event Scheduler: Loaded 0 events
091111 16:03:48 [Note] wampmysqld: ready for connections.
Version: '5.1.32-community-log' socket: '' port: 3306 MySQL Community Server (GPL)
====
[11 Nov 2009 15:42]
Calvin Sun
Hi Samir, Sorry to hear that you have the same problem. We have been testing a fix, and hope the fix will be included in a near future release. Thanks, Calvin
[12 Nov 2009 6:53]
samir pathan
Hello Calvin, Thank you for your immediate response. I am sorry to say that this is my live server and running live transactions. I request, if you find any solution for this please let me know ASAP. Thanks in advance. Samir.
[30 Nov 2009 8:41]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/92027
3213 Satya B 2009-11-30
Applying InnoDB snapshot 5.1-ss6242, part 2. Fixes BUG#3139
1. BUG#3139 - Mysql crashes: "windows error 995" after several selects on a large DB
Detailed revision comments:
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995' after several selects on a large DB
During stress environment, Windows AIO may fail with error code ERROR_OPERATION_ABORTED. InnoDB does not handle the error, rather crashes. The cause of the error is unknown, but likely due to faulty hardware or driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED, which maps to Windows ERROR_OPERATION_ABORTED (995). When the error is detected during AIO, the InnoDB will issue a synchronous retry (read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
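The behaviour described in the revision comment (map error 995 to a dedicated internal code, then retry the operation synchronously) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual change to os0file.c: only `ERROR_OPERATION_ABORTED` (995) and `OS_FILE_OPERATION_ABORTED` come from the patch text, while `AioError`, `map_os_error`, `write_with_retry`, the catch-all code and the two stand-in write functions are hypothetical.

```python
ERROR_OPERATION_ABORTED = 995                            # Windows error code named in the patch
OS_FILE_OPERATION_ABORTED = "OS_FILE_OPERATION_ABORTED"  # internal code named in the patch
OS_FILE_ERROR_OTHER = "OS_FILE_ERROR_OTHER"              # hypothetical catch-all for this sketch


class AioError(Exception):
    """Hypothetical exception standing in for a failed asynchronous I/O request."""

    def __init__(self, os_errno):
        super().__init__(f"Operating system error number {os_errno} in a file operation.")
        self.os_errno = os_errno


def map_os_error(os_errno):
    """Translate an OS error number into an internal code (only 995 is modelled)."""
    if os_errno == ERROR_OPERATION_ABORTED:
        return OS_FILE_OPERATION_ABORTED
    return OS_FILE_ERROR_OTHER


def write_with_retry(async_write, sync_write, data, messages):
    """Try the asynchronous path first; on an aborted operation, retry synchronously."""
    try:
        return async_write(data)
    except AioError as err:
        if map_os_error(err.os_errno) != OS_FILE_OPERATION_ABORTED:
            raise                      # any other error is still treated as fatal
        messages.append(f"{err} Retrying synchronously.")
        return sync_write(data)


def aborted_async_write(data):
    """Stand-in for an AIO request that the OS aborts with error 995."""
    raise AioError(ERROR_OPERATION_ABORTED)


def plain_sync_write(data):
    """Stand-in for a blocking write; returns the number of bytes 'written'."""
    return len(data)
```

A call such as `write_with_retry(aborted_async_write, plain_sync_write, b"page", [])` takes the synchronous path once and returns normally, while any error other than 995 still propagates to the caller.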
[30 Nov 2009 12:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/92068
3224 Satya B 2009-11-30
Applying InnoDB Plugin 1.0.6 snapshot, part 4. Fixes BUG#3139
applied revisions: r6160
Detailed revision comments:
r6160 | vasil | 2009-11-11 15:33:49 +0200 (Wed, 11 Nov 2009) | 72 lines
branches/zip: Merge r6152:6159 from branches/5.1:
(r6158 was skipped as an equivallent change has already been merged from MySQL)
------------------------------------------------------------------------
r6154 | calvin | 2009-11-11 02:51:17 +0200 (Wed, 11 Nov 2009) | 17 lines
Changed paths:
M /branches/5.1/include/os0file.h
M /branches/5.1/os/os0file.c
branches/5.1: fix bug#3139: Mysql crashes: 'windows error 995' after several selects on a large DB
During stress environment, Windows AIO may fail with error code ERROR_OPERATION_ABORTED. InnoDB does not handle the error, rather crashes. The cause of the error is unknown, but likely due to faulty hardware or driver.
This patch introduces a new error code OS_FILE_OPERATION_ABORTED, which maps to Windows ERROR_OPERATION_ABORTED (995). When the error is detected during AIO, the InnoDB will issue a synchronous retry (read/write).
This patch has been extensively tested by MySQL support.
Approved by: Marko
rb://196
------------------------------------------------------------------------
r6158 | vasil | 2009-11-11 14:52:14 +0200 (Wed, 11 Nov 2009) | 37 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/handler/ha_innodb.h
branches/5.1: Merge a change from MySQL:
(this has been reviewed by Calvin and Marko, and Calvin says Luis has incorporated Marko's suggestions)
------------------------------------------------------------
revno: 3092.5.1
committer: Luis Soares <luis.soares@sun.com>
branch nick: mysql-5.1-bugteam
timestamp: Thu 2009-09-24 15:52:52 +0100
message:
BUG#42829: binlogging enabled for all schemas regardless of binlog-db-db / binlog-ignore-db
InnoDB will return an error if statement based replication is used along with transaction isolation level READ-COMMITTED (or weaker), even if the statement in question is filtered out according to the binlog-do-db rules set. In this case, an error should not be printed.
This patch addresses this issue by extending the existing check in external_lock to take into account the filter rules before deciding to print an error. Furthermore, it also changes decide_logging_format to take into consideration whether the statement is filtered out from binlog before decision is made.
added:
mysql-test/suite/binlog/r/binlog_stm_do_db.result
mysql-test/suite/binlog/t/binlog_stm_do_db-master.opt
mysql-test/suite/binlog/t/binlog_stm_do_db.test
modified:
sql/sql_base.cc
sql/sql_class.cc
storage/innobase/handler/ha_innodb.cc
storage/innobase/handler/ha_innodb.h
storage/innodb_plugin/handler/ha_innodb.cc
storage/innodb_plugin/handler/ha_innodb.h
------------------------------------------------------------------------
[30 Nov 2009 18:45]
sadasasd sdasd
Not quite sure if it's same issue but we are having a very similar error and I was wondering if this patch fixes this problem as well:
InnoDB: Operating system error number 1784 in a file operation.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.0/en/operating-system-error-codes.html
InnoDB: File name D:\MySQLlog\ib_logfile0
InnoDB: File operation call: 'Windows aio'.
InnoDB: Cannot continue operation.
InnoDB: Log scan progressed past the checkpoint lsn 35 1443981126
[30 Nov 2009 18:59]
MySQL Verification Team
error 1784 means ERROR_INVALID_USER_BUFFER (The supplied user buffer is not valid for the requested operation.)
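Since two different error numbers now appear in this report, a small helper for telling them apart in an error log may be useful. The following is a minimal Python sketch, not an official tool: the function name `find_os_errors` and the lookup table are assumptions, and the table holds only the two codes named in this thread.

```python
import re

# Only the two Windows error codes discussed in this report; anything else is "unknown".
KNOWN_ERRORS = {
    995: "ERROR_OPERATION_ABORTED",
    1784: "ERROR_INVALID_USER_BUFFER",
}

# Matches the InnoDB message format quoted in the log excerpts of this report.
PATTERN = re.compile(r"InnoDB: Operating system error number (\d+) in a file operation")


def find_os_errors(log_text):
    """Return (number, symbolic name) for each OS error line found in the log text."""
    results = []
    for match in PATTERN.finditer(log_text):
        number = int(match.group(1))
        results.append((number, KNOWN_ERRORS.get(number, "unknown")))
    return results
```

Feeding it the contents of the server's error log gives a quick list of which file-operation errors occurred, in the order they were logged.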
[1 Dec 2009 9:11]
Satya B
Patch queued to 5.1-bugteam storage/innobase and for the plugin storage/innodb_plugin. NULL merged to 6.0 and will be merged to 5.5.*
[2 Dec 2009 8:06]
Bugs System
Pushed into 5.1.42 (revid:joro@sun.com-20091202080033-mndu4sxwx19lz2zs) (version source revid:satya.bn@sun.com-20091130120409-pe1abptka1mlq9qy) (merge vers: 5.1.42) (pib:13)
[2 Dec 2009 22:58]
James Day
Paul, you might consider this for the changelog text: On some Windows systems InnoDB could report "Operating system error number 995 in a file operation" due to transient driver or hardware problems. InnoDB now retries the operation and adds "Retry attempt is made" to the error message when it does so. For others, we've seen that these sometimes show up when there's an increase in hard drive S.M.A.R.T. counters, implying a hard drive root cause. But this is too much detail for this changelog entry.
[4 Dec 2009 1:42]
Paul DuBois
Noted in 5.1.42 changelog. Setting report to NDI pending push to 5.6.x.
[16 Dec 2009 8:36]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091216083311-xorsasf5kopjxshf) (version source revid:alik@sun.com-20091214191830-wznm8245ku8xo702) (merge vers: 6.0.14-alpha) (pib:14)
[16 Dec 2009 8:43]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091216082430-s0gtzibcgkv4pqul) (version source revid:satya.bn@sun.com-20091202140050-nh3ebk6s3bziv8cb) (merge vers: 5.5.0-beta) (pib:14)
[16 Dec 2009 8:49]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091216083231-rp8ecpnvkkbhtb27) (version source revid:alik@sun.com-20091212203859-fx4rx5uab47wwuzd) (merge vers: 5.6.0-beta) (pib:14)
[16 Dec 2009 15:27]
Paul DuBois
Noted in 5.5.1, 6.0.14 changelogs.
[16 Dec 2009 21:14]
James Day
sadasasd sdasd, I've opened bug #49748 for that error number 1784 message. A possible workaround is a larger innodb_log_file_size and/or innodb_buffer_pool_size. Please carry out any further discussion over there and let us know whether the workaround is effective for you.
[12 Mar 2010 14:08]
Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:24]
Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:38]
Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[5 May 2010 15:06]
Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:07]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:35]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:03]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:41]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[15 Jun 2010 8:10]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:25]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 12:10]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:57]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:38]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[2 Dec 2010 5:20]
MySQL Verification Team
Is it possible that the scenario mentioned in "[12 Oct 2007 17:07] Aleksey Karyakin" could happen, where a user thread initiates an I/O and then quits? Or is that only ever done by background file I/O threads?
[7 Dec 2010 14:59]
MySQL Verification Team
AFAIK, a fix was never done for 5.0 nor pushed into it ....
[7 Dec 2010 17:05]
MySQL Verification Team
I also think that instead of just repeating the command, we could investigate why the error occurs in the first place. There are many possible causes, but I don't think that concurrency is the problem here. Instead, I think it is possible that the buffer for the read/write is no longer available, or that there is a bug in the sync part of the low-level code. Many other options could be investigated, but to discover the cause, lots of debug/trace info would have to be coded in and extracted when an error like this occurs.