| Bug #60173 | Partially written insert statement in binlog -- no errors reported | ||
|---|---|---|---|
| Submitted: | 18 Feb 2011 20:54 | Modified: | 30 May 2013 19:33 | 
| Reporter: | Mark Callaghan | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S2 (Serious) | 
| Version: | 5.1.47, 5.1.64, 5.5.27 | OS: | Any | 
| Assigned to: | CPU Architecture: | Any | |
| Tags: | corruption, Partial, replication | ||
   [18 Feb 2011 20:58]
   Mark Callaghan        
  I still don't know whether the transaction cache was used, but if it was the binlog_cache_size is 262144 and the corrupt event is ~4 MB so I assume the cache spilled to disk.
   [18 Feb 2011 21:14]
   Mark Callaghan        
  From the corrupt event, the size from start to end_log_pos is 4042650 # at 849548868 #110217 2:58:24 server id 10211982 end_log_pos 853591518 # Position Timestamp Type Master ID Size Master Pos Flags # 32a31644 50 ff 5c 4d 02 8e d2 9b 00 9a af 3d 00 de c5 e0 32 10 00
   [18 Feb 2011 21:48]
   Mark Callaghan        
  My other theory is that the host ran out out of memory, malloc failed and mysqld ignored/missed the failure. I had one obscure bug in replication internals from that a few years back. But I have no evidence this host was near OOM.
   [25 Feb 2011 22:38]
   Sveta Smirnova        
  Thank you for the report. Could you please send us binary log with corrupted event?
   [26 Mar 2011 0:00]
   Bugs System        
  No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
   [11 Jul 2012 17:59]
   Mark Callaghan        
  I cannot send you the binlog with the corruption. But you can have an engineer read the code to confirm that errors from my_b_fill are ignored, even in 5.1.63 and 5.6 m8
   [13 Jul 2012 19:38]
   Sveta Smirnova        
  Thank you for the feedback. Verified by examining source code. I could not repeat corruption of binary log.
   [13 Jul 2012 19:39]
   Sveta Smirnova        
  In trunk my_b_fill does not exists.
   [30 May 2013 19:33]
   Daniel So        
  Route to DOCUMENTING as per status on Oracle Bug system.
   [30 May 2013 19:33]
   Daniel So        
  Added changelog entry in reference manuals of versions 5.1.70,5.5.32, 5.6.12, 5.7.2: 'The binary log contents got corrupted sometimes, because the "MYSQL_BIN_LOG::write_cache" function always thought it had reached the "end-of-cache" when the function "my_b_fill()" reported a "0," while that could also mean an error had occurred. This fix makes sure that whenever my_b_fill() returns a "0," an error check is performed on "info->error."'

Description: I have a binlog with: BEGIN, INSERT, COMMIT (XID). The insert statement was partially written. No errors were reported and the XID event was written so I don't think mysqld noticed an error. I don't know yet whether this was an auto-commit transaction. I started to read the code and noticed that in MYSQL_BIN_LOG::write_cache code assumes that 0 returned from my_b_fill always means end-of-cache: } while ((length= my_b_fill(cache))); DBUG_ASSERT(carry == 0); return 0; // All OK } But that is incorrect: /* Fill buffer of the cache. NOTES This assumes that you have already used all characters in the CACHE, independent of the read_pos value! RETURN 0 On error or EOF (info->error = -1 on error) # Number of characters */ size_t my_b_fill(IO_CACHE *info) And other callers of my_b_fill don't check for error: my_b_copy_to_file, maybe my_b_gets, How to repeat: read the code Suggested fix: check for errors