MySQL Bugs: #107403: Improve error reporting of failures in open(2) to include errno + description

Bug #107403	Improve error reporting of failures in open(2) to include errno + description
Submitted:	26 May 2022 10:29	Modified:	26 May 2022 10:53
Reporter:	Simon Mudd (OCA)
Status:	Verified
Category:	MySQL Server: Logging	Severity:	S4 (Feature request)
Version:	8.0.28	OS:	CentOS (CentOS 8 Stream)
Assigned to:		CPU Architecture:	Any
Tags:	disk full, encryption error, errno, read-only

Description:
I got this stack trace from mysqld:

2022-05-25T16:24:32.425017Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 5456ms to flush 1525 and evict 0 pages
2022-05-25T16:24:37.571797Z 16842803 [ERROR] [MY-010833] [Server] MYSQL_BIN_LOG::open_crash_safe_index_file failed to open temporary index file.
2022-05-25T16:24:37.571858Z 16842803 [ERROR] [MY-010835] [Server] MYSQL_BIN_LOG::add_log_to_index failed to open the crash safe index file.
2022-05-25T16:24:37.572036Z 16842803 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: Either disk is full, file system is read only or there was an encryption error while opening the binlog. Aborting the server..
16:24:37 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7f5d58061550
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f715044d698 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x205aad1]
/usr/sbin/mysqld(print_fatal_signal(int)+0x2f3) [0xf07653]
/usr/sbin/mysqld(my_server_abort()+0x7e) [0xf0779e]
/usr/sbin/mysqld(my_abort()+0xe) [0x205472e]
/usr/sbin/mysqld() [0x1c60a39]
/usr/sbin/mysqld(MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int)+0xd89) [0x1c712e9]
/usr/sbin/mysqld(MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*)+0x837) [0x1c72647]
/usr/sbin/mysqld(MYSQL_BIN_LOG::after_write_to_relay_log(Master_info*)+0x1e0) [0x1c72ca0]
/usr/sbin/mysqld(queue_event(Master_info*, char const*, unsigned long, bool)+0x629) [0x1d54019]
/usr/sbin/mysqld(handle_slave_io+0xda5) [0x1d55f05]
/usr/sbin/mysqld() [0x25b60a4]
/lib64/libpthread.so.0(+0x81cf) [0x7f71904c21cf]
/lib64/libc.so.6(clone+0x43) [0x7f718e867d83]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 16842803
Status: NOT_KILLED

This error message is unhelpful: it lists three possible causes without saying which one occurred, and the fix for each is likely to be different.

How to repeat:
Fill up your disk and see if you can trigger the error report.
I *think* a full disk was the cause, but I am not absolutely sure, which is why I am filing this FR. It would be helpful if the error report were more detailed.

Suggested fix:
According to https://man7.org/linux/man-pages/man2/open.2.html, open() returns -1 and sets errno when the call fails. It would be worthwhile to retrieve that value and include it in the error message: it gives more detail about the cause of the problem and would help an administrator resolve the crash more quickly.

So please consider retrieving errno, and ideally the text description the OS provides for it, and including both in the error message returned to the user.
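
To illustrate the kind of message being requested, here is a minimal sketch in Python, whose os.open() is a thin wrapper around open(2). It is not MySQL server code: the function name describe_open_failure, the default flags and mode, and the message format are all assumptions made up for this example.

```python
import errno
import os


def describe_open_failure(path, flags=os.O_WRONLY | os.O_CREAT, mode=0o640):
    """Try to open `path`; return None on success, else a diagnostic string.

    Hypothetical helper for illustration only, not part of MySQL.
    """
    try:
        fd = os.open(path, flags, mode)
    except OSError as exc:
        # exc.errno is the errno value set by the failed open(2) call.
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        text = os.strerror(exc.errno)  # OS-provided description of errno
        return f"failed to open '{path}': errno={exc.errno} ({name}): {text}"
    os.close(fd)
    return None


if __name__ == "__main__":
    # A path inside a directory that is assumed not to exist.
    print(describe_open_failure("/no-such-directory/binlog.index_crash_safe"))
```

With a message of this shape, ENOSPC would point to a full disk and EROFS to a read-only file system, so the administrator would not have to guess between them.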

adjust tags.

Hello Simon,

Thank you for the feature request!

regards,
Umesh