Bug #52768 Slave_SQL_Errno is sometimes set to my_errno
Submitted: 12 Apr 2010 16:13 Modified: 3 Jul 2012 9:04
Reporter: Sven Sandberg
Status: Closed
Category: MySQL Server: Replication Severity: S2 (Serious)
Version: 5.1+, 5.6.99 OS: Any
Assigned to: Assigned Account CPU Architecture: Any
Tags: Slave_SQL_Errno
Triage: Triaged: D4 (Minor)

[12 Apr 2010 16:13] Sven Sandberg
Description:
When an error occurs in the slave's SQL thread, the code reports it by calling rli->report(). This causes the Slave_SQL_Error and Slave_SQL_Errno columns of SHOW SLAVE STATUS to display the reason for the error. Normally, the error number is one of the usual ER_* constants defined in sql/share/errmsg.txt, and the error message is the corresponding string. This is all good.

However, in some cases we set Slave_SQL_Errno to something other than an ER_* number, and sometimes we set Slave_SQL_Error to a hard-coded error message instead of a translatable string from sql/share/errmsg.txt. Specifically, we use my_errno instead of ER_* in several places, such as this call in Append_block_log_event::do_apply_event():

      rli->report(ERROR_LEVEL, my_errno,
                  "Error in %s event: could not create file '%s'",
                  get_type_str(), fname);

It is inconsistent to mix error codes from different domains (my_errno vs ER_*). It is also undocumented: at http://dev.mysql.com/doc/refman/5.1/en/error-messages-server.html there is no hint that any other error codes may be used. The my_errno numbers are platform-dependent. The hard-coded strings don't get translated.
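
To illustrate why mixing the two domains is a problem for anyone consuming SHOW SLAVE STATUS, here is a minimal sketch of the guesswork a monitoring tool would have to do. It assumes that server error numbers (ER_*) begin at 1000 and that OS errno values are small positive integers. The names kFirstServerError, ErrorDomain and classify_slave_errno are hypothetical and do not exist in the MySQL source.

```cpp
#include <string>

// Assumption: ER_* server error numbers start at 1000, while my_errno
// carries OS errno values, which are small and platform-dependent.
constexpr int kFirstServerError = 1000;  // hypothetical constant

enum class ErrorDomain { kOsErrno, kServerError };

// Heuristic only: guess which domain a reported error number came from.
// The need for such a guess is exactly the inconsistency described above.
ErrorDomain classify_slave_errno(int code) {
  return code >= kFirstServerError ? ErrorDomain::kServerError
                                   : ErrorDomain::kOsErrno;
}

// Human-readable label for the guessed domain.
std::string domain_name(ErrorDomain domain) {
  return domain == ErrorDomain::kServerError ? "server error (ER_*)"
                                             : "OS errno (my_errno)";
}
```

With this heuristic, the value 9 from the test result quoted under "How to repeat" would be classified as an OS errno, whose meaning depends on the platform the slave runs on.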

How to repeat:
Read code, or read rpl_slave_load_remove_tmpfile.result:

[...]
Last_SQL_Errno	9
Last_SQL_Error	Error in Begin_load_query event: write to '../../tmp/SQL_LOAD.data' failed
[...]

Suggested fix:
1. Refactor Slave_reporting_capabilities::report so that it doesn't allow custom strings. Instead, it should take an ER_* code as argument and get the translated string from sql/share/errmsg.txt.

2. For each call to Slave_reporting_capabilities::report, check if an existing ER_* code can be used. If not, add a new error code.
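
The two steps above could look roughly like the following standalone sketch. It is not the actual Slave_reporting_capabilities interface: the type ErrorReport, the function lookup_format, the constants ER_SKETCH_CREATE_FILE and ER_SKETCH_WRITE_FILE, and their numeric values are all hypothetical, and a switch statement stands in for the lookup in sql/share/errmsg.txt. The point is only that the caller passes an error code plus arguments, never a format string.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical error codes standing in for new ER_* entries.
enum : int {
  ER_SKETCH_CREATE_FILE = 90001,
  ER_SKETCH_WRITE_FILE = 90002
};

// What SHOW SLAVE STATUS would end up displaying.
struct ErrorReport {
  int code;
  std::string message;
};

// Stand-in for the errmsg.txt lookup: maps a code to its format string.
const char *lookup_format(int code) {
  switch (code) {
    case ER_SKETCH_CREATE_FILE:
      return "Error in %s event: could not create file '%s'";
    case ER_SKETCH_WRITE_FILE:
      return "Error in %s event: write to '%s' failed";
    default:
      return nullptr;
  }
}

// The caller supplies only a code and its arguments; the message text
// always comes from the table, so it stays consistent and translatable.
ErrorReport report(int code, ...) {
  const char *fmt = lookup_format(code);
  if (fmt == nullptr) return {code, "Unknown error"};

  char buf[512];
  va_list args;
  va_start(args, code);
  std::vsnprintf(buf, sizeof buf, fmt, args);
  va_end(args);
  return {code, buf};
}
```

Under these assumptions, the call quoted in the description would become something like report(ER_SKETCH_CREATE_FILE, get_type_str(), fname), with no my_errno value and no inline string.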

[13 Apr 2010 5:08] Sveta Smirnova
Thank you for the report.

Verified as described.

[3 Jul 2012 9:03] Jon Stephens
Fixed in trunk, tagged 5.7.0. Documented in the 5.7.0 changelog as follows:

        When an error occurs in the slave SQL thread, this causes the
        Slave_SQL_Error and Slave_SQL_Errno columns of SHOW SLAVE STATUS
        to display the reason for the error. The error number should be
        one of the usual constants ER_* defined in sql/share/errmsg.txt,
        and the error message should be the corresponding string.
        However, in some cases, Slave_SQL_Errno was set to something
        else than an ER_* number, and Slave_SQL_Error to a hard-coded
        error message rather than a translatable string from
        sql/share/errmsg.txt. Now all errors shown by SHOW SLAVE STATUS
        originate from sql/share/errmsg.txt, as expected.

Closed.