Bug #117693 | 'aio write' returned OS error 0. / Assertion failure: os0file.cc:6809:slot->is_reserved thread | ||
---|---|---|---|
Submitted: | 12 Mar 23:59 | Modified: | 21 Mar 23:20 |
Reporter: | joseph nhcs | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S1 (Critical) |
Version: | 8.0.41 | OS: | Windows (Server 2019 HyperV) |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[12 Mar 23:59]
joseph nhcs
[13 Mar 17:40]
MySQL Verification Team
Hi,

File .\edb\members_copy.ibd: 'aio write' returned OS error 0. Cannot continue operation

This is a system-level problem. MySQL got an error from Windows saying that it can't deal with that file. There is nothing MySQL can do about it except report the error. This can be a number of things:
- antivirus
- some new windows protection something something in new release
- virus/malware
- error on the hdd
- error on the hdd controller (cache memory issue)
- error on the motherboard (memory issue)
- full disk
...

Anyhow, there is nothing we can do about it: we tried to access a file, Windows said "no can do - error 0", and we passed that error back to you.

You can talk with our MySQL Support team to help you find out exactly what the problem is and how to fix it, but it is not a MySQL bug, so there is nothing we can help you with through this system.

Good luck and thanks for using MySQL Server
[14 Mar 2:47]
joseph nhcs
I understand, and I appreciate your kind and clear reply. As unlikely as this may be, I wonder whether MySQL could be misunderstanding something or sending something malformed, since nothing else on the server is broken and everything has been the same for years.

For what it's worth, and I understand that the list is just an illustration:

- antivirus
We use Defender, as always, and it isn't showing any relevant activity.
- some new windows protection something something in new release
We have disabled updates and haven't changed our version of Server 2019 in a year or more.
- virus/malware
Possibly, but there are no other signs of it, and we run a lot of stuff which is all doing fine.
- error on the hdd
Changed to a different HDD as I mentioned, and still had the problem.
- error on the hdd controller (cache memory issue)
- error on the motherboard (memory issue)
Of course these are hard to diagnose, but nothing else is wrong, no other errors have occurred, multiple workstations on the same virtual host are all fine, and the hardware is only a few years old and well maintained, so I doubt it.
- full disk
Disk isn't nearly full, fwiw.

I do get that the list isn't comprehensive, but I wanted to show that we have done due diligence. I believe it is likely something like this:

1. A bug mishandling healthy but busy virtual HD interaction due to lots of ODBC_32 activity, with MySQL locking its own files longer than it should and then crashing as if it were the OS's fault that the files are locked
2. Something causing the resources allotted to MySQL to momentarily run out
3. Some unusual configuration in our virtual server that causes file handling to not work as MySQL/InnoDB expects it to, but what would that be? I have no idea.

Looking through our error log, I see "Not a valid bookmark" sometimes reported by ODBC when doing an insert, on the same day that a crash happens. Not sure if that would be related. Could it be that when an ODBC insert fails, something doesn't unlock the file correctly, and the next time MySQL tries to use that file, it gets this error and crashes? Would there be an updated driver, or something like that, to look into?

Is the lack of a stack trace an indication that something is or isn't MySQL's fault? Something went terribly wrong, for sure. :)

Thanks!
[14 Mar 15:03]
MySQL Verification Team
Hi,

It is possible that, in a virtualized environment, the storage subsystem returns an "error" if there are no more IOPS available for the VM. I'm not an expert in how different VM controllers operate, but it is a possibility. In that situation, Windows would have to deal with it by delaying the response instead of returning an error.

When the system returns an error, we have to assume there is a real error. We have to take disk errors seriously, as it is 100000x safer to crash if that happens than to allow data corruption. In any case, MySQL does not communicate directly with the hardware. It talks to the OS, and if the OS says there is an error accessing a device (storage in this case), there is nothing we should do differently. If this is the case, then it is a Windows bug.

There's nothing we can do about it, and no, it is not possible that we are sending something "wrong" to Windows. That would be detected immediately by thousands of users if it were ever released to the public, though it would hopefully be caught first by our QA team.

Thanks
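The fail-fast policy described above can be sketched in a few lines. This is only an illustration in Python, not MySQL's actual I/O code; the names `FatalIOError` and `write_fail_fast` are hypothetical. The idea is that any error the OS reports on a write, or a short write, is treated as fatal rather than retried.

```python
import os


class FatalIOError(Exception):
    """Hypothetical error type: storage can no longer be trusted."""


def write_fail_fast(fd: int, data: bytes) -> None:
    """Write all of `data` to `fd`; treat any OS error or short write as fatal.

    Illustrative only: no retry is attempted, because a retry could
    store data on a device that is already failing.
    """
    try:
        written = os.write(fd, data)
        # Ask the OS to flush its buffers so a late error surfaces here.
        os.fsync(fd)
    except OSError as exc:
        raise FatalIOError(
            f"write failed: errno={exc.errno} ({exc.strerror})"
        ) from exc
    if written != len(data):
        raise FatalIOError(f"short write: {written} of {len(data)} bytes")
```

A caller would let `FatalIOError` propagate and stop the process, rather than loop and retry against storage that may be failing.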
[21 Mar 23:20]
joseph nhcs
Thanks for your continued assistance. I moved the data folder to a network share on another machine, and the errors (so far) seem to be gone. Again, there are no other disk or similar errors on that VM, or any of the other VMs on that host. The Windows logs show nothing.

An anonymous error 0 doesn't give me a lot to go on. Is there a way to get any more information from MySQL about what it is seeing and what exactly is failing, so we can try to troubleshoot further? Since Windows shows nothing amiss, I think there may be no other way to figure this out. I still lean toward the theory that MySQL could be fixed to work better on vhdx, or at least, report more details to try to help us out.

As I understand it, this page is not visible because the team has not accepted that it is a MySQL bug. As a result, anyone else like me experiencing trouble can't chime in or benefit from this conversation, and remains invisible unless they submit a bug too. Is that right? Is there another way to put a post somewhere that, if anyone else has any input on it, they could share?
[22 Mar 2:21]
MySQL Verification Team
> As I understand it, this page is not visible because the team has not accepted that it is a MySQL bug.

No, quite the opposite.

> Is there a way to get any more information from MySQL about what it is seeing and what exactly is failing

No, we pass through the error we get from the OS. There probably is a way to force Windows to record this kind of error in some log, but that is out of the scope of this bugs system and is a Windows thing, not a MySQL thing.

> I still lean toward the theory that MySQL could be fixed to work better on vhdx,

Everything can always be made better. The thing is that with disk-I/O-bound systems like databases, a decision has to be made about what is more important, because added verbosity increases lag. Also, do you want to retry I/O operations if the failure might mean that the storage is dying and your data is unsafe? Our decision is that if I/O returns an error, it is safer to not store that data and fail that transaction than to retry and save it to unsafe storage. In theory this could be configurable; in practice it would make the system less stable and durable.
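For anyone who wants to gather more detail outside of MySQL, one option is a small standalone probe that writes to the same volume and records whatever error the OS returns. The following is a rough sketch in Python under several assumptions: the function name `probe_writes` and its parameters are made up for illustration, and it uses ordinary synchronous writes rather than the asynchronous ('aio') writes named in the error message, so a clean run does not prove the storage is healthy.

```python
import os
import time


def probe_writes(path, rounds=100, block_size=16384):
    """Repeatedly write and flush one block to `path`.

    Returns a list of dicts describing every OS error seen.
    Hypothetical helper for troubleshooting; not part of MySQL.
    """
    failures = []
    block = b"\0" * block_size  # 16 KiB, InnoDB's default page size
    for i in range(rounds):
        started = time.time()
        try:
            with open(path, "wb") as f:
                f.write(block)
                f.flush()
                os.fsync(f.fileno())  # force the write down to the device
        except OSError as exc:
            failures.append({
                "round": i,
                "time": started,
                "errno": exc.errno,
                # `winerror` only exists on Windows; None elsewhere.
                "winerror": getattr(exc, "winerror", None),
                "message": exc.strerror,
            })
    return failures
```

Pointing `path` at a scratch file on the same virtual disk as the data directory (never at a live `.ibd` file) and running it while the server is busy might show whether the OS reports write failures to other processes as well.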