Bug #43535 last_io_error in show slave status is confusing
Submitted: 10 Mar 2009 14:01 Modified: 19 Nov 2010 16:13
Reporter: Matthew Lord Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version:5.1 OS:Any
Assigned to: Luis Soares CPU Architecture:Any
Tags: last_io_error, show slave status

[10 Mar 2009 14:01] Matthew Lord
Description:
Currently the last_io_error column in SHOW SLAVE STATUS causes quite a 
bit of confusion.  Firstly the error contains no DATETIME so you
have no idea when the problem occurred:
Last_IO_Error: error connecting to master 'foo@boo' - retry-time: 60 retries: 86400

Lastly there is no way to reset this error so it just stays around 
"forever" or possibly gets reset when another IO error occurs.  You 
cannot tell if the error is new or not because there is no DATETIME
anywhere.

How to repeat:
n/a

Suggested fix:
1) We should include a DATETIME value somewhere in the error text
2) We should provide a way to clear any previous error information 
   from the SHOW SLAVE STATUS output.
[16 Mar 2009 11:05] Vygintas Krasauskas
I second it!

This morning I spent a half of hour searching for a reason for the "Relay log write failure" error message and wondering the replication is running. The I've found that the message has been issued several days ago and is not tolical anymore.

Re "2)": "Beginning with MySQL 5.1.32, issuing RESET MASTER or RESET SLAVE resets the values shown in these columns."
[5 Apr 2010 17:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105032

3183 Luis Soares	2010-04-05
      BUG#43535: last_io_error in show slave status is confusing
      
      Although IO and SQL thread errors numbers and messages are
      reported in SHOW SLAVE STATUS output, the correspondent timestamp
      at which they have occurred is not. As such the user may be left
      wondering whenever the error happened (NOTE: actually, in the
      error log, a timestamp should be printed along with the error
      message, however in SHOW SLAVE STATUS it is not currently).
      
      Furthermore, at the time this bug was reported, there was no way
      for the user to clean the error number and message reported.
      
      This patch adds two new columns to the output of SHOW SLAVE
      STATUS:
      
        - Last_IO_Error_Timestamp - shows the timestamp for the last
          error IO error
      
        - Last_SQL_Error_Timestamp - the same as
          Last_IO_Error_Timestamp, but for the SQL thread errors.
      
      The timestamp is formatted as : "YYYY-MM-DD HH:MM:SS".  
      
      The computation of the timestamp is added as a new method of the
      Error class (update_timestamp), which in turn is an inner class
      of the Slave_reporting_capability. This new method is called
      everytime Slave_reporting_capability::report is invoked with log
      level set to ERROR_LEVEL, thence capturing the timestamp of the
      error being reported.
      
      Starting from 5.1.37 (ie, when the fix for BUG 44270 was
      released), the fields are cleared on RESET SLAVE. Furthermore,
      the test case added in this patch checks that: (1) the error
      message(s) is not reset on explicit stop slave after an error
      occurred; (2) the error message(s) is reset when slave recovers
      from the error (ie, when the DBA fixes the problem and resumes
      replication by issuing START SLAVE). Moreover, clearing the new
      Timestamp fields is done in the Error class clear() method, 
      therefore when Last_*_Errno and Last_*_Error are reset,
      Last_*_Error_Timestamp is gets reset also.
      
      Finally, this patch accommodates the changes needed for existing
      MTR server tests:
      
        - result files are updated due to the new _Timestamp columns
      
        - show_slave_status.inc include files are augmented with
          replace_regex to mask out the timestamp from result files
      
        - some tests are also changed to mask out the output for the
          new _Timestamp columns
     @ sql/rpl_reporting.cc
        Deployed call to update_timestamp() method whenever an error
        is reported.
     @ sql/rpl_reporting.h
        Adds timestamp field to the Error class.
        Adds cleaning the timestamp field to the clear() method.
        Adds update_timestamp method (calculates a timestamp and 
        sets it to the newly added timestamp field).
     @ sql/slave.cc
        Adds last_error().timestamp to the output of SHOW SLAVE STATUS.
[5 Apr 2010 17:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105033

3183 Luis Soares	2010-04-05
      BUG#43535: last_io_error in show slave status is confusing
      
      Although IO and SQL thread errors numbers and messages are
      reported in SHOW SLAVE STATUS output, the correspondent timestamp
      at which they have occurred is not. As such the user may be left
      wondering whenever the error happened (NOTE: actually, in the
      error log, a timestamp should be printed along with the error
      message, however in SHOW SLAVE STATUS it is not currently).
      
      Furthermore, at the time this bug was reported, there was no way
      for the user to clean the error number and message reported.
      
      This patch adds two new columns to the output of SHOW SLAVE
      STATUS:
      
        - Last_IO_Error_Timestamp - shows the timestamp for the last
          error IO error
      
        - Last_SQL_Error_Timestamp - the same as
          Last_IO_Error_Timestamp, but for the SQL thread errors.
      
      The timestamp is formatted as : "YYYY-MM-DD HH:MM:SS".  
      
      The computation of the timestamp is added as a new method of the
      Error class (update_timestamp), which in turn is an inner class
      of the Slave_reporting_capability. This new method is called
      everytime Slave_reporting_capability::report is invoked with log
      level set to ERROR_LEVEL, thence capturing the timestamp of the
      error being reported.
      
      Starting from 5.1.37 (ie, when the fix for BUG 44270 was
      released), the fields are cleared on RESET SLAVE. Furthermore,
      the test case added in this patch checks that: (1) the error
      message(s) is not reset on explicit stop slave after an error
      occurred; (2) the error message(s) is reset when slave recovers
      from the error (ie, when the DBA fixes the problem and resumes
      replication by issuing START SLAVE). Moreover, clearing the new
      Timestamp fields is done in the Error class clear() method, 
      therefore when Last_*_Errno and Last_*_Error are reset,
      Last_*_Error_Timestamp is gets reset also.
      
      Finally, this patch accommodates the changes needed for existing
      MTR server tests:
      
        - result files are updated due to the new _Timestamp columns
      
        - show_slave_status.inc include files are augmented with
          replace_regex to mask out the timestamp from result files
      
        - some tests are also changed to mask out the output for the
          new _Timestamp columns
     @ sql/rpl_reporting.cc
        Deployed call to update_timestamp() method whenever an error
        is reported.
     @ sql/rpl_reporting.h
        Adds timestamp field to the Error class.
        Adds cleaning the timestamp field to the clear() method.
        Adds update_timestamp method (calculates a timestamp and 
        sets it to the newly added timestamp field).
     @ sql/slave.cc
        Adds last_error().timestamp to the output of SHOW SLAVE STATUS.
[16 Apr 2010 14:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105892

3183 Luis Soares	2010-04-16
      BUG#43535: last_io_error in show slave status is confusing
      
      Although IO and SQL thread errors numbers and messages are
      reported in SHOW SLAVE STATUS output, the correspondent timestamp
      at which they have occurred is not. As such the user may be left
      wondering whenever the error happened (NOTE: actually, in the
      error log, a timestamp should be printed along with the error
      message, however in SHOW SLAVE STATUS it is not currently).
      
      Furthermore, at the time this bug was reported, there was no way
      for the user to clean the error number and message reported.
      
      This patch adds two new columns to the output of SHOW SLAVE
      STATUS:
      
        - Last_IO_Error_Timestamp - shows the timestamp for the last
          error IO error
      
        - Last_SQL_Error_Timestamp - the same as
          Last_IO_Error_Timestamp, but for the SQL thread errors.
      
      The timestamp is formatted as : "YYYY-MM-DD HH:MM:SS".  
      
      The computation of the timestamp is added as a new method of the
      Error class (update_timestamp), which in turn is an inner class
      of the Slave_reporting_capability. This new method is called
      everytime Slave_reporting_capability::report is invoked with log
      level set to ERROR_LEVEL, thence capturing the timestamp of the
      error being reported.
      
      Starting from 5.1.37 (ie, when the fix for BUG 44270 was
      released), the fields are cleared on RESET SLAVE. Furthermore,
      the test case added in this patch checks that: (1) the error
      message(s) is not reset on explicit stop slave after an error
      occurred; (2) the error message(s) is reset when slave recovers
      from the error (ie, when the DBA fixes the problem and resumes
      replication by issuing START SLAVE). Moreover, clearing the new
      Timestamp fields is done in the Error class clear() method, 
      therefore when Last_*_Errno and Last_*_Error are reset,
      Last_*_Error_Timestamp is gets reset also.
      
      Finally, this patch accommodates the changes needed for existing
      MTR server tests:
      
        - result files are updated due to the new _Timestamp columns
      
        - show_slave_status.inc include files are augmented with
          replace_regex to mask out the timestamp from result files
      
        - some tests are also changed to mask out the output for the
          new _Timestamp columns
     @ mysql-test/suite/rpl/t/rpl_show_errors.test
        Test case added for the new field in SHOW SLAVE STATUS.
     @ sql/rpl_reporting.cc
        Deployed call to update_timestamp() method whenever an error
        is reported.
     @ sql/rpl_reporting.h
        Adds timestamp field to the Error class.
        Adds cleaning the timestamp field to the clear() method.
        Adds update_timestamp method (calculates a timestamp and 
        sets it to the newly added timestamp field).
     @ sql/slave.cc
        Adds last_error().timestamp to the output of SHOW SLAVE STATUS.
[21 Apr 2010 17:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106282

3183 Luis Soares	2010-04-21
      BUG#43535: last_io_error in show slave status is confusing
            
      Although IO and SQL thread errors numbers and messages are reported in
      SHOW SLAVE STATUS output, the correspondent timestamp at which they
      have occurred is not. As such the user may be left wondering whenever
      the error happened (NOTE: actually, in the error log, a timestamp
      should be printed along with the error message, however in SHOW SLAVE
      STATUS it is not currently).
            
      Furthermore, at the time this bug was reported, there was no way for
      the user to clean the error number and message reported.
            
      This patch adds a timestamp, similar to the one found in the error
      log, to the Last_*_Error columns. It appends it at the beginning of
      the error message, as in (square brackes not included):
      
       "Last_IO_Error    YYMMDD hh:mm:ss [PLAIN OLD ERROR MESSAGE ...]"
           
      The computation of the timestamp is added as a new method of the Error
      class (update_timestamp), which in turn is an inner class of the
      Slave_reporting_capability. This new method is called everytime
      Slave_reporting_capability::report is invoked with log level set to
      ERROR_LEVEL, thence capturing the timestamp of the error being
      reported.
            
      Starting from 5.1.37 (ie, when the fix for BUG 44270 was released),
      the fields are cleared on RESET SLAVE. Furthermore, the test case
      added in this patch checks that: (1) the error message(s) is not reset
      on explicit stop slave after an error occurred; (2) the error
      message(s) is reset when slave recovers from the error (ie, when the
      DBA fixes the problem and resumes replication by issuing START
      SLAVE). Moreover, clearing the new Timestamp fields is done in the
      Error class clear() method, therefore when Last_*_Errno and
      Last_*_Error are reset, Last_*_Error_Timestamp is gets reset also.
            
      Finally, this patch accommodates the changes needed for existing MTR
      server tests:
            
       - result files are updated due to the new _Timestamp columns
            
       - show_slave_status.inc include files are augmented with
         replace_regex to mask out the timestamp from result files
            
       - some tests are also changed to mask out the output for the
         new _Timestamp columns
     @ mysql-test/suite/rpl/t/rpl_show_errors.test
        Test case added for checking the timestamp on error messages from 
        SHOW SLAVE STATUS command.
     @ sql/rpl_reporting.cc
        Deployed call to update_timestamp() method whenever an error
        is reported.
     @ sql/rpl_reporting.h
        Adds timestamp field to the Error class.
        Adds cleaning the timestamp field to the clear() method.
        Adds update_timestamp method (calculates a timestamp and 
        sets it to the newly added timestamp field).
     @ sql/slave.cc
        Appends last_error().timestamp to the beginning of the output for 
        Last_IO/SQL_Error in SHOW SLAVE STATUS.
[6 May 2010 16:29] Luis Soares
Queued in mysql-5.1-rep+3:
http://pb2.norway.sun.com/web.py?template=push_details&push=1227938
[30 Jul 2010 15:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/114771

3012 Luis Soares	2010-07-30
      BUG#43535: last_io_error in show slave status is confusing
      
      Post-push fix.
      
      Small improvement on the test case to make it more robust,
      ie, tolerate races between session thread and IO/SQL threads
      while cleaning up the error message after a slave restart.
[16 Aug 2010 6:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:26] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[18 Nov 2010 16:34] Jon Stephens
Documented bugfix in the 5.6.1 changelog as follows:

        Two new columns have been added to the output of SHOW SLAVE
        STATUS for reporting when the most recent I/O and SQL thread
        errors occurred. The Last_IO_Error_Timestamp column shows the
        timestamp for the most recent I/O error, and
        Last_SQL_Error_Timestamp shows the timestamp for the most recent
        SQL thread error. Both of these columns use DATETIME values to
        represent the timestamps. For more information, see "SHOW SLAVE 
        STATUS Syntax".

Also updated the indicated section of the 5.6 Manual.

Closed.
[18 Nov 2010 19:22] Jon Stephens
Ran into issues while importing unrelated docs into repo, causing these changes to get wiped out... Set bug report back to Documenting until I'm able to re-commit.
[19 Nov 2010 16:13] Jon Stephens
Changelog entry previously copied here was not completely accurate (no new columns have been added, just the timestamps). Documented in the 5.6.1 changelog as follows:

        Timestamps have been added to the output of SHOW SLAVE
        STATUS for reporting when the most recent I/O and SQL thread
        errors occurred. The Last_IO_Error column is now prefixed with 
        the timestamp for the most recent I/O error, and
        Last_SQL_Error shows the timestamp for the most recent
        SQL thread error. The timestamp values use the
        format YYMMDD HH:MM:SS in both of these columns. For more 
        information, see "Show SLAVE STATUS Syntax". 

Also updated the indicated section of the 5.6 Manual.

Closed.
[10 Mar 2011 16:47] Jon Stephens
See BUG#58584.