| Bug #47055 | unconditional exit(1) on ERROR_WORKING_SET_QUOTA 1453 (0x5AD) for InnoDB backend | ||
|---|---|---|---|
| Submitted: | 2 Sep 10:11 | Modified: | 12 Nov 5:00 |
| Reporter: | George Danchev | ||
| Status: | Closed | ||
| Category: | Server: InnoDB | Severity: | S2 (Serious) |
| Version: | 5.0.51b | OS: | Microsoft Windows |
| Assigned to: | Satya B | Target Version: | |
| Triage: | Triaged: D2 (Serious) | ||
[2 Sep 10:11]
George Danchev
[3 Sep 18:06]
Sveta Smirnova
Thank you for the report.

> 1) Is that return code of ERROR_WORKING_SET_QUOTA 1453 (0x5AD) bogus as returned by the operating system, since we do not even use quotas?

We cannot know this, because it could be caused by a problem in your environment. Please check the OS log files. Also, version 5.0.51 is old and many bugs have been fixed since. It makes sense to upgrade to the current version, 5.0.85.

> 2) Why the rest of possible return values of GetLastError() are not checked as well, but an unconditional exit(1) is preferred?

As I understand it, it is not possible to check every single OS error, because there will always be unexpected ones. It makes sense to do so only for particular errors that can be fixed while the server is running.
[4 Sep 10:13]
George Danchev
> Thank you for the report.
>
> > 1) Is that return code of ERROR_WORKING_SET_QUOTA 1453 (0x5AD) bogus as returned by the operating system, since we do not even use quotas?
>
> We can't know this, because this can be some problem in your environment. Please check OS log files. Also version 5.0.51 is old and many bugs were fixed since.

As I already wrote, we have hit this problem on several of our servers. There are more reports like ours, though not in your bug tracking system:

http://forums.mysql.com/read.php?132,237235,237235
http://forums.theplanet.com/index.php?showtopic=85485

> It makes sense to upgrade to current version 5.0.85.

The relevant code in os/os0file.c is the same. It is the same even in 6.0.

> > 2) Why the rest of possible return values of GetLastError() are not checked as well, but an unconditional exit(1) is preferred?
>
> As I understand this is not possible to check every single OS error, because always would be unexpected things. There is a sense to do it only for particular errors which can be fixed while server is running.

How is it possible for MSSQL to deal with that, but not MySQL? Please take a deeper look at the example I provided in my previous message.
[4 Sep 10:59]
Sveta Smirnova
Thank you for the feedback.

> As I already wrote, we faced that problem on several of our servers.

MySQL just uses the OS error code. It is the OS that prevents the I/O operation. This is why I asked you to check the OS error log.

> The relevant code found in os/os0file.c is the same. It is the same even in 6.0.

If this is a MySQL bug, the real problem is not in the error handling but in a wrong operation before the error occurs. It is not clear from the description whether this is the case, but it can occasionally be solved.

> How it is possible for MSSQL to deal with that, but MySQL can not? Please, have a deeper look at the example I provided in my previous message.

I can verify this report as a feature request: "Please handle ERROR_WORKING_SET_QUOTA error". Let us know if this is your only concern regarding this situation.
[4 Sep 11:56]
George Danchev
> MySQL just uses OS error code. This is OS what prevent io operation. This is why I ask to check OS error log.

Okay, that makes sense. We will wait for the problem to occur again and then grab the Event Viewer | System and Application logs.

> > The relevant code found in os/os0file.c is the same. It is the same even in 6.0.
>
> If this is MySQL bug real problem is not in error handling, but wrong operation before error occurs. It is not clear from description if this is the case, but can be occasionally solved.

It occurs during the os_file_write() operation. If the write operation fails, why not just sleep() for a while and try again?

> > How it is possible for MSSQL to deal with that, but MySQL can not? Please, have a deeper look at the example I provided in my previous message.
>
> I can verify this report as feature request "Please handle ERROR_WORKING_SET_QUOTA error".
> Let us know if this is your only concern regarding to this situation.

It makes little difference how you classify the bug report when your SQL daemon performs exit(1) and leaves the clients speechless. Yes, it is a request to handle ERROR_WORKING_SET_QUOTA and all possible return codes of GetLastError(), since they are already a stable set of codes available at:

http://msdn.microsoft.com/en-us/library/ms681381(VS.85).aspx

I know that it is not always possible to recover, but the set of codes MySQL currently handles is far from a sensible set of possible failures. Otherwise, we will have to change either the daemon or the OS. Sad, but true.
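The request above, to map more GetLastError() codes to a deliberate action instead of exiting on anything unrecognized, could be sketched roughly as follows. This is an illustration only, not the code in os/os0file.c: the `io_action_t` enum, the `classify_io_error()` function, and the policy chosen for each code are hypothetical. The numeric values are the documented Windows system error codes, redefined under `SKETCH_` names so the fragment compiles on any platform.

```c
/* Documented Windows system error codes, redefined locally so this
   sketch does not depend on <windows.h>. */
#define SKETCH_ERROR_FILE_NOT_FOUND     2UL
#define SKETCH_ERROR_ACCESS_DENIED      5UL
#define SKETCH_ERROR_DISK_FULL          112UL
#define SKETCH_ERROR_WORKING_SET_QUOTA  1453UL

/* Hypothetical set of reactions to a failed I/O call. */
typedef enum {
    IO_RETRY_LATER,       /* transient: wait and repeat the operation */
    IO_REPORT_TO_CALLER,  /* not transient, but the server can keep running */
    IO_FATAL              /* unknown code: give up */
} io_action_t;

/* Map an OS error code to an action. The policy here is an assumption
   made for illustration, not the policy InnoDB implements. */
io_action_t classify_io_error(unsigned long os_error)
{
    switch (os_error) {
    case SKETCH_ERROR_WORKING_SET_QUOTA:
        return IO_RETRY_LATER;
    case SKETCH_ERROR_FILE_NOT_FOUND:
    case SKETCH_ERROR_ACCESS_DENIED:
    case SKETCH_ERROR_DISK_FULL:
        return IO_REPORT_TO_CALLER;
    default:
        return IO_FATAL;
    }
}
```

On Windows, the argument would be the value returned by GetLastError() immediately after the failed call. Only codes with no known handling fall through to the fatal branch.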
[4 Sep 20:34]
Sveta Smirnova
Thank you for the feedback. Verified as feature request "Please handle ERROR_WORKING_SET_QUOTA more smartly, for example, as the reporter suggested in the initial description".
[14 Oct 16:39]
Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091014143611-cphb0enjlx6lpat1) (version source revid:satya.bn@sun.com-20091009140218-24h3v55dgsxgs609) (merge vers: 5.1.40) (pib:13)
[22 Oct 8:36]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022063126-l0qzirh9xyhp0bpc) (version source revid:alik@sun.com-20091019135554-s1pvptt6i750lfhv) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 9:09]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091019131022-2o2ymjfjjoraq833) (merge vers: 5.5.0-beta) (pib:13)
[3 Nov 0:11]
Calvin Sun
Before the fix, when an I/O operation failed with a return code of ERROR_WORKING_SET_QUOTA from the Windows operating system, InnoDB intentionally crashed the server. Now InnoDB sleeps for 100ms and retries the failed operation.
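The sleep-and-retry behavior described here can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the actual patch: `io_with_retry()`, `io_op_fn`, `sleep_fn`, `flaky_write()`, and `record_sleep()` are hypothetical names, and the `max_retries` bound is an assumption, since the thread does not say whether the real retry is bounded. The sleep function is passed in so the sketch stays portable. On Windows it would be a thin wrapper around `Sleep()`.

```c
#include <stddef.h>

#define SKETCH_ERROR_WORKING_SET_QUOTA 1453UL  /* documented Windows code */
#define SKETCH_RETRY_SLEEP_MS          100L    /* delay named in the fix */

/* Hypothetical I/O callback: returns 0 on success, else an OS error code. */
typedef unsigned long (*io_op_fn)(void *ctx);
/* Hypothetical sleep callback taking a delay in milliseconds. */
typedef void (*sleep_fn)(long ms);

/* Run op; while it fails with ERROR_WORKING_SET_QUOTA, sleep 100 ms and
   try again, at most max_retries times (the bound is an assumption).
   Returns the last error code and stores the retry count if requested. */
unsigned long io_with_retry(io_op_fn op, void *ctx, sleep_fn do_sleep,
                            int max_retries, int *retries_done)
{
    int retries = 0;
    unsigned long err;

    for (;;) {
        err = op(ctx);
        if (err != SKETCH_ERROR_WORKING_SET_QUOTA || retries >= max_retries) {
            break;
        }
        do_sleep(SKETCH_RETRY_SLEEP_MS);
        retries++;
    }
    if (retries_done != NULL) {
        *retries_done = retries;
    }
    return err;
}

/* Demo operation: fails with the quota error until the counter in ctx
   reaches zero, then succeeds. */
unsigned long flaky_write(void *ctx)
{
    int *failures_left = (int *)ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return SKETCH_ERROR_WORKING_SET_QUOTA;
    }
    return 0UL;
}

/* Demo sleeper: records the requested delay instead of blocking. */
long slept_ms_total = 0;
void record_sleep(long ms)
{
    slept_ms_total += ms;
}
```

Calling `io_with_retry(flaky_write, &failures, record_sleep, 10, &retries)` with `failures` set to 3 succeeds after three retries. Any other error code is returned to the caller on the first attempt, so only the quota condition is treated as transient.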
[12 Nov 5:00]
Paul DuBois
Noted in 5.1.41, 5.5.0, 6.0.14 changelogs. On Windows, when a failed I/O operation occurred with return code of ERROR_WORKING_SET_QUOTA, InnoDB intentionally crashed the server. Now InnoDB sleeps for 100ms and retries the failed operation.
