Bug #117693 'aio write' returned OS error 0. / Assertion failure: os0file.cc:6809:slot->is_reserved thread
Submitted: 12 Mar 23:59 Modified: 21 Mar 23:20
Reporter: joseph nhcs
Status: Not a Bug
Category:MySQL Server Severity:S1 (Critical)
Version:8.0.41 OS:Windows (Server 2019 HyperV)
Assigned to: MySQL Verification Team CPU Architecture:Any

[12 Mar 23:59] joseph nhcs
Description:
mysqld disappears several times a day during use hours, which are still not all that busy.  Doesn't seem to happen at night when use is slim.

The error is always the same, though the table file that is mentioned varies, so it doesn't seem to be related to which table is being accessed at the time.

2025-03-12T11:04:52.840143Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2025-03-12T11:04:56.251475Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2025-03-12T11:05:01.149422Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2025-03-12T11:05:01.149994Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2025-03-12T11:05:01.255758Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060
2025-03-12T11:05:01.256011Z 0 [System] [MY-010931] [Server] E:\MySQL\MySQL Server 8.0\bin\mysqld: ready for connections. Version: '8.0.41'  socket: ''  port: 3306  MySQL Community Server - GPL.
2025-03-12T11:06:55.857427Z 8 [Warning] [MY-013360] [Server] Plugin mysql_native_password reported: ''mysql_native_password' is deprecated and will be removed in a future release. Please use caching_sha2_password instead'
2025-03-12T15:54:55.571218Z 0 [ERROR] [MY-012646] [InnoDB] File .\edb\members_copy.ibd: 'aio write' returned OS error 0. Cannot continue operation
2025-03-12T15:54:55.571218Z 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: os0file.cc:6809:slot->is_reserved thread 6608
2025-03-12T15:54:55.572182Z 0 [ERROR] [MY-012981] [InnoDB] Cannot continue operation.
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
"2025-03-12T15:54:55Z UTC - mysqld got exception 0x16 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
2025-03-12T16:00:14.903300Z 0 [Warning] [MY-010915] [Server] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' sql modes should be used with strict mode. They will be merged with strict mode in a future release."
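
To check the two observations above (the file named in the error varies, and the crashes cluster in busy hours), the MY-012646 lines in the error log can be tallied per file and per hour. This is a minimal sketch, assuming the log lines have exactly the format pasted in this report; `summarize_aio_errors` is a hypothetical helper written for illustration, not a MySQL tool.

```python
import re
from collections import Counter

# Pattern for the InnoDB MY-012646 line as it appears in the log above.
# Assumption: other installations may format this line differently.
AIO_ERROR = re.compile(
    r"^\d{4}-\d{2}-\d{2}T(?P<hour>\d{2}):\d{2}:\d{2}\.\d+Z\s+\d+\s+"
    r"\[ERROR\] \[MY-012646\] \[InnoDB\] File (?P<file>.+?): "
    r"'aio (?P<op>\w+)' returned OS error (?P<code>\d+)"
)


def summarize_aio_errors(lines):
    """Count MY-012646 errors per tablespace file and per hour of day.

    The hour is taken from the log timestamp as written; the trailing
    'Z' in the log above indicates UTC, not local time.
    """
    by_file = Counter()
    by_hour = Counter()
    for line in lines:
        match = AIO_ERROR.match(line)
        if match:
            by_file[match.group("file")] += 1
            by_hour[match.group("hour")] += 1
    return by_file, by_hour
```

It can be fed an open error log file (the path depends on the installation) and the two counters printed with `most_common()`.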

How to repeat:
I have no idea what is causing it so I can't repeat it at will.  We have a utility that is copying thousands of records via ODBC periodically all day.  We've had the same basic setup for ~20 years.  This just started happening out of the blue on 3/6.

We were on 8.0.40.  We had been on that version for months without this issue.

Tried upgrading to .41; didn't help.

Tried a complete dump, chkdsk /r, initialize, restore, repair, optimize for the first time in years.  No improvement, though the data folder went down from ~10GB to ~4GB and our websites seem a little faster.

Chkdsk did find a few issues on E: and repaired them, but it didn't seem to have any effect on the crashing problem.

Tried moving data folder from E: to C:, though both are vhdxs, in case the E drive had issues.  It didn't help.

Tried some random tweaks to my.ini we found online for possibly similar seeming issues.  Didn't fix it, though maybe that's part of why the sites seem a little faster.

No other apps or services on our systems have shown any unusual behavior and there are no new hardware or system errors in Windows logs as far as we have found.
[13 Mar 17:40] MySQL Verification Team
Hi,

File .\edb\members_copy.ibd: 'aio write' returned OS error 0. Cannot continue operation

This is an operating system issue. MySQL got an error from Windows saying that it can't deal with that file. There is nothing MySQL can do about it except report the error.

This can be any number of things:
 - antivirus
 - some new windows protection something something in new release
 - virus/malware
 - error on the hdd
 - error on the hdd controller (cache memory issue)
 - error on the motherboard (memory issue)
 - full disk
...

Anyhow, there is nothing we can do about it: we tried to access a file, Windows said "no can do - error 0", and we passed that error back to you.

You can talk with our MySQL Support team to help you find out exactly what the problem is and how to fix it, but it is not a MySQL bug, so there is nothing we can help you with through this bug system.

Good luck and thanks for using MySQL Server.
[14 Mar 2:47] joseph nhcs
I understand, and appreciate your kind and clear reply.  As unlikely as this may be, I wonder if it couldn't be MySQL misunderstanding or sending something malformed, since nothing else on the server is broken, and everything has been the same for years.

For what it's worth, and I understand that the list is just an illustration:
 - antivirus
We use Defender, as always, and it isn't showing any relevant activity.

 - some new windows protection something something in new release
We have disabled updates and haven't changed our version of Server 2019 in a year or more.

 - virus/malware
Possibly, but no other signs of it, and we run a lot of stuff which is all doing fine.

 - error on the hdd
Changed to a different HDD as I mentioned, and still had the problem.

 - error on the hdd controller (cache memory issue)
 - error on the motherboard (memory issue)
Of course these are hard to diagnose/find, but since nothing else is wrong and no other errors have occurred, and multiple workstations are on the same virtual host, and they are all fine, and the hardware is only a few years old and well maintained, it makes me doubt it.

 - full disk
Disk isn't nearly full fwiw.

I do get that the list isn't comprehensive but I wanted to show that we have done due diligence and I believe it is likely something like this:

1. Bug mishandling healthy busy virtual HD interaction due to lots of ODBC_32 activity, locking its own files on itself longer than it should, and crashing as if it were the OS's fault the files are locked
2. Something causing resources allotted to MySQL to momentarily run out
3. Some unusual configuration in our virtual server that causes file handling to not work as MySQL/innodb expects it to, but what would that be?  I have no idea.

Looking through our error log, I see "Not a valid bookmark" reported by the ODBC on doing an insert sometimes, on the same day that a crash happens.  Not sure if that would be related.  Could it be that when an ODBC insert fails, something doesn't unlock the file correctly, and the next time MySQL tries to use that file, it gets this error and crashes?

Would there be an updated driver, or something like that, to look into?

Is the lack of stack trace an indication that something is or isn't MySQL's fault?  

Something went terribly wrong, for sure. :)

Thanks!
[14 Mar 15:03] MySQL Verification Team
Hi,

It is possible that, in a virtualized environment, the storage subsystem returns an "error" if there are no more IOPS available for the VM. I'm not an expert in how different VM controllers operate, but it is a possibility. That would be something Windows would have to deal with by delaying the response instead of returning an error.
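
One way to test this idea outside MySQL is a small probe that issues many synchronous writes to a file on the same virtual disk and records any error the OS reports. This is only a sketch under stated assumptions: `probe_sync_writes` is a hypothetical helper, the 16 KiB block size is chosen to match InnoDB's default page size, and the probe uses plain synchronous writes, not the asynchronous I/O path that produced the 'aio write' error, so a clean run does not rule the storage out.

```python
import os
import time


def probe_sync_writes(path, count=1000, block_size=16384):
    """Append `count` blocks to `path`, fsync each one, and record failures.

    Returns a dict with the number of attempted writes, a list of failures
    (OS error details or short writes), and the slowest single write in
    seconds.
    """
    block = b"\0" * block_size
    failures = []
    slowest = 0.0
    # buffering=0 gives a raw file object, so each write goes straight to the OS.
    with open(path, "wb", buffering=0) as f:
        for index in range(count):
            start = time.perf_counter()
            try:
                written = f.write(block)
                os.fsync(f.fileno())
                if written != block_size:
                    failures.append({"index": index, "short_write": written})
            except OSError as exc:
                failures.append({
                    "index": index,
                    "errno": exc.errno,
                    # winerror is only set on Windows; None elsewhere.
                    "winerror": getattr(exc, "winerror", None),
                    "message": str(exc),
                })
            slowest = max(slowest, time.perf_counter() - start)
    return {"writes": count, "failures": failures, "slowest_seconds": slowest}
```

Running it during busy hours against a scratch file on the affected volume, and again on a volume that does not show the problem, would at least show whether errors or long stalls appear without MySQL involved.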

When the system returns an error, we have to assume there is a real error. We have to treat disk errors seriously, as it is 100000x safer to crash when that happens than to allow data corruption. In any case, MySQL does not communicate directly with the hardware; it talks to the OS, and if the OS says there is an error accessing a device (storage in this case), there is nothing we should do differently.

If this is the case, then it is a Windows bug. There's nothing we can do about it, and no, it is not possible that we are sending something "wrong" to Windows: even if such a bug had been pushed to the public, it would be detected immediately by thousands of users, though hopefully it would first be caught by our QA team.

Thanks
[21 Mar 23:20] joseph nhcs
Thanks for your continued assistance.  I moved the data folder to a network share on another machine, and the errors (so far) seem to be gone.  Again, there are no other disk or similar errors on that VM, or any of the other VMs on that host.  The Windows logs show nothing.  

An anonymous error 0 doesn't give me a lot to go on.  Is there a way to get any more information from MySQL about what it is seeing and what exactly is failing, so we can try to troubleshoot further?  Since Windows shows nothing amiss, I think there may be no other way to figure this out.

I still lean toward the theory that MySQL could be fixed to work better on vhdx, or at least, report more details to try to help us out.  

As I understand it, this page is not visible because the team has not accepted that it is a MySQL bug.  As a result, anyone else like me experiencing trouble can't chime in or benefit from this conversation, and remains invisible unless they submit a bug too.  Is that right?  Is there another way to put a post somewhere that, if anyone else has any input on it, they could share?
[22 Mar 2:21] MySQL Verification Team
> As I understand it, this page is not visible because the team has not accepted that it is a MySQL bug.

No, quite the opposite.

>  Is there a way to get any more information from MySQL about what it is seeing and what exactly is failing

No, we pass through the error we get from the OS.

There probably is a way to force Windows to log this kind of error somewhere, but that is outside the scope of this bug system and is a Windows matter, not a MySQL one.

> I still lean toward the theory that MySQL could be fixed to work better on vhdx, 

Everything can always be made better. The thing is that with disk-I/O-bound systems like databases, a decision has to be made about what is more important: extra verbosity increases lag. Also, do you want to retry I/O operations if that might mean the storage is dying and your data is unsafe? Our decision is that if I/O returns an error, it is safer to not store that data and fail that transaction than to retry and save it to unsafe storage. In theory this could be configurable; in practice it would make the system less stable and less durable.
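
The policy described above can be illustrated with a short sketch. This is not InnoDB's code: `StorageFailure`, `commit_fail_fast`, and the `write_page` callback are hypothetical names, and the example only shows the shape of the decision, which is to stop at the first I/O error and never retry.

```python
class StorageFailure(RuntimeError):
    """Raised when the storage layer reports any error; it is never retried."""


def commit_fail_fast(write_page, pages):
    """Write each (page_no, data) pair exactly once.

    `write_page` is a caller-supplied function that performs the actual I/O
    and raises OSError on failure. On the first error the whole batch is
    aborted, on the assumption that storage which has just failed cannot be
    trusted with a second attempt.
    """
    written = 0
    for page_no, data in pages:
        try:
            write_page(page_no, data)
        except OSError as exc:
            raise StorageFailure(
                f"page {page_no}: OS error {exc.errno}; "
                f"aborting after {written} page(s)"
            ) from exc
        written += 1
    return written
```

A retrying variant would wrap `write_page` in a loop, which is exactly the behavior the reply argues against: a transient throttle and a dying disk look the same from this level, and only the fail-fast version is safe in both cases.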