Bug #41898 Totalbytes in backup history log has same size for compressed/uncompressed data
Submitted: 6 Jan 2009 17:18
Modified: 27 Oct 2009 23:26
Reporter: Hema Sridharan
Status: Closed
Category: MySQL Server: Backup
Severity: S3 (Non-critical)
Version: 6.0
OS: Linux
Assigned to: Chuck Bell
CPU Architecture: Any

[6 Jan 2009 17:18] Hema Sridharan
Description:
Create database db1
Create tables and load with data.
Execute backup operation with and without compression.
Check total_bytes in the mysql.backup_history table for the compressed and the uncompressed backup. total_bytes shows the same size for both.

Create database db1;
create table db1.t1(id int);
insert into db1.t1 values(10),(20),(30),(40),(50);
insert into db1.t1 select * from db1.t1;
insert into db1.t1 select * from db1.t1;
insert into db1.t1 select * from db1.t1;
backup database db1 to 'db1.bak';
backup database db1 to 'db1.bak.gz' with compression;
Verify total bytes from mysql.backup_history

How to repeat:
mysql> create database db1;
Query OK, 1 row affected (0.00 sec)

mysql> create table db1.t1(id int);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into db1.t1 values(10),(20),(30),(40),(50);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into db1.t1 select * from db1.t1;
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into db1.t1 select * from db1.t1;
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> insert into db1.t1 select * from db1.t1;
Query OK, 20 rows affected (0.00 sec)
Records: 20  Duplicates: 0  Warnings: 0

mysql> backup database db1 to 'db1.bak';
+-----------+
| backup_id |
+-----------+
| 270       |
+-----------+
1 row in set (0.18 sec)

mysql> backup database db1 to 'db1.bak.gz' with compression;
+-----------+
| backup_id |
+-----------+
| 271       |
+-----------+
1 row in set (0.05 sec)

mysql> select total_bytes, backup_file, command from mysql.backup_history\G
*************************** 1. row ***************************
total_bytes: 1306
backup_file: db1.bak
    command: backup database db1 to 'db1.bak'
*************************** 2. row ***************************
total_bytes: 1306
backup_file: db1.bak.gz
    command: backup database db1 to 'db1.bak.gz' with compression
2 rows in set (0.00 sec)

The output above shows that total_bytes (1306) is the same for the compressed and the uncompressed backup.

Suggested fix:
The total_bytes value in the log should be the total number of bytes written to disk.
[6 Jan 2009 17:42] MySQL Verification Team
Thank you for the bug report. Verified as described.
[3 Jul 2009 12:54] Jørgen Løland
The total_bytes column does not contain the total number of bytes written to disk, but rather the number of bytes read from the backup drivers:

1615 2009-07-03 14:41 db1.bak
 376 2009-07-03 14:41 db1.bak.gz

Neither of these is equal to "total_bytes: 1306" as shown above.

What should total_bytes really print?
 a) The number of bytes sent from the backup kernel to the bstream_library?
 b) The number of bytes sent from the bstream_library to the underlying I/O?
 c) The number of bytes actually written by the underlying I/O?
 d) Something else?

The backup code only has complete control of the information in a) and b). So what makes sense if the underlying I/O does the compression itself?
[3 Jul 2009 13:02] Jørgen Løland
Another alternative: 
 e) The total number of bytes received from the backup drivers
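To make the difference between these options concrete, here is a minimal Python sketch of a toy pipeline that counts bytes at three of the points listed above. It is not MySQL code: the function name, the header bytes, and the use of gzip as the compressing I/O layer are assumptions made only for illustration.

```python
import gzip


def count_at_each_layer(driver_chunks, stream_header=b"HYPOTHETICAL-STREAM-HEADER"):
    """Return byte counts measured at three points of a toy backup pipeline."""
    # e) bytes received from the backup drivers (table data only)
    from_drivers = sum(len(chunk) for chunk in driver_chunks)

    # b) bytes handed to the underlying I/O: driver data plus stream framing
    stream_bytes = stream_header + b"".join(driver_chunks)
    to_io = len(stream_bytes)

    # c) bytes that end up on disk when the I/O layer compresses
    on_disk = len(gzip.compress(stream_bytes))

    return {"from_drivers": from_drivers, "to_io": to_io, "on_disk": on_disk}


# Eight chunks of repetitive data, similar in spirit to the rows in db1.t1.
chunks = [b"\x0a\x00\x00\x00" * 250 for _ in range(8)]
print(count_at_each_layer(chunks))
```

With repetitive data the three numbers differ, which is why the choice of measuring point has to be made explicitly.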
[7 Jul 2009 19:49] Chuck Bell
Meeting Notes 7 July 2009

Attendees
---------
Ingo, Hema, Rafal, Chuck

Discussion
----------
What value should 'total_bytes' have when printed?

Decision
--------
The group was in favor (3:1) of adopting option c) as listed in the bug report with the following clarification:

 c) The number of bytes actually written by the underlying I/O?
    Which means the total_bytes would be equal to the file size of the
    backup image.

The group believes option c) includes bytes written to disk (in the case of writing to a file) taking into account compression and encryption (when added). Additionally, sending data to XBSA and writing to pipes were discussed but it is not clear if these can be measured in the same way. Thus, XBSA and writing to pipes may need to be investigated when the solution is implemented. Lastly, the exact limitations will be determined by the developer and based on the accepted implementation.
[12 Aug 2009 15:14] Chuck Bell
Chuck steals another bug...Lars made me a list. :)
[13 Aug 2009 0:33] Chuck Bell
The solution shall accurately record the total_bytes in the backup_history log as the file size on backup. If compression is used, the total_bytes shall be the same as the file size.

During restore, the total_bytes shall be the total number of bytes read from the file after decompression if compression is used. Thus, total_bytes reported for a restore shall be the uncompressed file size.
[13 Aug 2009 6:20] Jørgen Løland
This means that 

BACKUP DATABASE x to '1.bak' WITH COMPRESSION;

and

RESTORE FROM '1.bak' OVERWRITE;

will produce different total_bytes values. Is this intentional?
[13 Aug 2009 14:40] Chuck Bell
A bug in the zstream library is the reason the read of the uncompressed bytes is incorrect while the inflated bytes read is correct.

I can code around this if it makes sense, but I'd rather have the bug fixed first. I will decide as per the team decision on how to handle this.
[13 Aug 2009 14:46] Chuck Bell
Ah...spoke too quickly. I see now how to get around the problem. 

The column 'total_bytes' will now reflect the actual bytes read from the file, not the uncompressed bytes during the restore of a compressed backup image.
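A rough sketch of this restore-side behavior, written in Python rather than the server's C++: a small wrapper counts the raw bytes pulled from the image while gzip inflates them. The class and function names are hypothetical and gzip only stands in for the zstream layer.

```python
import gzip
import io


class CountingReader:
    """Wrap a binary file object and count the raw bytes read from it."""

    def __init__(self, raw):
        self.raw = raw
        self.bytes_read = 0

    def read(self, size=-1):
        data = self.raw.read(size)
        self.bytes_read += len(data)
        return data


def restore_counts(image_bytes):
    """Return (raw bytes read from the image, bytes after decompression)."""
    reader = CountingReader(io.BytesIO(image_bytes))
    with gzip.GzipFile(fileobj=reader, mode="rb") as gz:
        restored = gz.read()
    return reader.bytes_read, len(restored)


payload = b"\x0a\x00\x00\x00" * 1000
image = gzip.compress(payload)
print(restore_counts(image), len(image))
```

The first number is the one that matches the image file size, so backup and restore of the same compressed image would report the same value under this definition.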
[13 Aug 2009 14:54] Chuck Bell
The decision recorded in this bug report also satisfies BUG#39780. I have marked BUG#39780 as a duplicate of BUG#41898.
[14 Aug 2009 0:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80802

2860 Chuck Bell	2009-08-13
      BUG#41898 : Totalbytes in backup history log has same size for compressed/uncompressed data
      
      This bug and BUG#39780, which is a duplicate, report the problem that the
      'total_bytes' column of the backup_history log reports an incorrect and
      unrealistic value.
      
      This patch corrects the problem by making the 'total_bytes' column store 
      the number of bytes read or written. This means this column is now the
      same as the backup image file size.
     @ mysql-test/suite/backup/include/check_filesize.inc
        New include for comparing total_bytes with file size.
     @ mysql-test/suite/backup/r/backup_compression.result
        New result file with comment removed and correct result displayed.
     @ mysql-test/suite/backup/r/backup_intr_errors.result
        Corrected result file -- new location of close method in kernel.cc
        caused a different order for the error messages.
     @ mysql-test/suite/backup/r/backup_log_filesize.result
        New result file.
     @ mysql-test/suite/backup/r/backup_logs.result
        Corrected result file now that the bug is fixed.
     @ mysql-test/suite/backup/t/backup_compression.test
        Removed comment about bug report.
     @ mysql-test/suite/backup/t/backup_log_filesize.test
        New test to demonstrate the total_bytes column of the backup_history
        log corresponds to the backup image file size.
     @ sql/backup/data_backup.cc
        Count bytes written for each pass through the method.
     @ sql/backup/image_info.h
        Changed to larger size to avoid buffer overflow for large files.
     @ sql/backup/kernel.cc
        Moved report_stats_post to close method because the stream close
        method writes bytes for backup and these must be counted for an
        accurate byte count.
     @ sql/backup/logger.cc
        Changed implementation to use a ulonglong instead of Image_info because
        that object is not available when this method needs to be called. This
        method must be called after the stream close and before the method that
        reports completed operation. See kernel.cc for these modifications.
     @ sql/backup/logger.h
        Changed reporter for stats post to accept the bytes as ulonglong because
        the info class is not available where this method must now be called.
     @ sql/backup/stream.cc
        Added code to total bytes read and written.
     @ sql/backup/stream.h
        Changed size of bytes read/written to avoid buffer overflow.
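The kernel.cc note above says the statistics must be reported after the stream is closed, because closing still writes bytes. The same effect can be seen with Python's gzip module, used here only as an analogy: the CountingWriter class and the function name are hypothetical and nothing below is taken from the patch.

```python
import gzip
import io


class CountingWriter:
    """Wrap a binary file object and count the bytes written to it."""

    def __init__(self, raw):
        self.raw = raw
        self.bytes_written = 0

    def write(self, data):
        self.raw.write(data)
        self.bytes_written += len(data)
        return len(data)

    def flush(self):
        self.raw.flush()


def counts_before_and_after_close(payload):
    """Return the byte count before close, after close, and the final size."""
    sink = CountingWriter(io.BytesIO())
    gz = gzip.GzipFile(fileobj=sink, mode="wb")
    gz.write(payload)
    before_close = sink.bytes_written  # compressor may still hold data
    gz.close()                         # flushes remaining data and the trailer
    after_close = sink.bytes_written
    return before_close, after_close, len(sink.raw.getvalue())


print(counts_before_and_after_close(b"\x0a\x00\x00\x00" * 1000))
```

Only the count taken after close equals the size of the finished image, which is the reason for reporting at that point.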
[14 Aug 2009 19:22] Chuck Bell
Correction: BUG#37980 is a duplicate.
[20 Aug 2009 0:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81119

2860 Chuck Bell	2009-08-19
      BUG#41898 : Totalbytes in backup history log has same size for compressed/uncompressed data
      
      This bug and BUG#39780, which is a duplicate, report the problem that the
      'total_bytes' column of the backup_history log reports an incorrect and
      unrealistic value.
      
      This patch corrects the problem by making the 'total_bytes' column store 
      the number of bytes read or written. This means this column is now the
      same as the backup image file size.
     @ mysql-test/suite/backup/include/check_filesize.inc
        New include for comparing total_bytes with file size.
     @ mysql-test/suite/backup/r/backup_compression.result
        New result file with comment removed and correct result displayed.
     @ mysql-test/suite/backup/r/backup_intr_errors.result
        Corrected result file -- new location of close method in kernel.cc
        caused a different order for the error messages.
     @ mysql-test/suite/backup/r/backup_log_filesize.result
        New result file.
     @ mysql-test/suite/backup/r/backup_logs.result
        Corrected result file now that the bug is fixed.
     @ mysql-test/suite/backup/t/backup_compression.test
        Removed comment about bug report.
     @ mysql-test/suite/backup/t/backup_log_filesize.test
        New test to demonstrate the total_bytes column of the backup_history
        log corresponds to the backup image file size.
     @ sql/backup/kernel.cc
        Moved report_stats_post to close method because the stream close
        method writes bytes for backup and these must be counted for an
        accurate byte count.
     @ sql/backup/logger.h
        Added method to abstract writing of backup history.
     @ sql/backup/stream.cc
        Added code to total bytes read and written.
[20 Aug 2009 0:40] Chuck Bell
CLARIFICATION
-------------
The column total_bytes shall present the exact number of bytes written to or read from a file. These shall be actual bytes. For example, if a compressed file is written then total_bytes shall be the actual, compressed file size.
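As an illustration of this rule, the sketch below writes a toy image with and without compression and checks that the recorded count equals the file size on disk. It is a Python stand-in, not the check_filesize.inc test include; write_image and check_filesize are hypothetical helpers, and the file names are borrowed from the report only for readability.

```python
import gzip
import os
import tempfile


def write_image(path, payload, compress):
    """Write a toy backup image and return the bytes actually written to the file."""
    with open(path, "wb") as raw:
        if compress:
            # GzipFile does not close a file object it was handed, so raw stays open.
            with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
                gz.write(payload)
        else:
            raw.write(payload)
        return raw.tell()  # position after all writes, including the gzip trailer


def check_filesize(total_bytes, path):
    """True when the recorded total_bytes equals the size of the file on disk."""
    return total_bytes == os.path.getsize(path)


payload = b"\x0a\x00\x00\x00" * 1000
with tempfile.TemporaryDirectory() as tmp:
    for name, compress in (("db1.bak", False), ("db1.bak.gz", True)):
        path = os.path.join(tmp, name)
        total_bytes = write_image(path, payload, compress)
        print(name, total_bytes, check_filesize(total_bytes, path))
```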
[20 Aug 2009 13:14] Chuck Bell
Redoing patch under protest.
[20 Aug 2009 14:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81175

2862 Chuck Bell	2009-08-20
      BUG#41898 : Totalbytes in backup history log has same size for compressed/uncompressed data
      
      This bug and BUG#39780, which is a duplicate, report the problem that the
      'total_bytes' column of the backup_history log reports an incorrect and
      unrealistic value.
      
      This patch corrects the problem by making the 'total_bytes' column store 
      the number of bytes read or written. This means this column is now the
      same as the backup image file size.
     @ mysql-test/suite/backup/include/check_filesize.inc
        New include for comparing total_bytes with file size.
     @ mysql-test/suite/backup/r/backup_compression.result
        New result file with comment removed and correct result displayed.
     @ mysql-test/suite/backup/r/backup_intr_errors.result
        Corrected result file -- new location of close method in kernel.cc
        caused a different order for the error messages.
     @ mysql-test/suite/backup/r/backup_log_filesize.result
        New result file.
     @ mysql-test/suite/backup/r/backup_logs.result
        Corrected result file now that the bug is fixed.
     @ mysql-test/suite/backup/t/backup_compression.test
        Removed comment about bug report.
     @ mysql-test/suite/backup/t/backup_log_filesize.test
        New test to demonstrate the total_bytes column of the backup_history
        log corresponds to the backup image file size.
     @ sql/backup/image_info.h
        Changed data type to allow for large file sizes.
     @ sql/backup/kernel.cc
        Moved report_stats_post to close method because the stream close
        method writes bytes for backup and these must be counted for an
        accurate byte count.
     @ sql/backup/logger.h
        Added method to abstract writing of backup history.
     @ sql/backup/stream.cc
        Added code to total bytes read and written.
     @ sql/backup/stream.h
        Changed data type to allow for large file sizes.
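The data type changes in image_info.h and stream.h matter once an image grows past what a narrow counter can hold. The patch notes do not state the old and new widths (only logger.cc mentions ulonglong), so the 32-bit and 64-bit widths in this Python sketch are assumptions chosen to show the wraparound.

```python
def add_bytes(counter, n, bits):
    """Add n to an unsigned counter that is `bits` wide, wrapping on overflow."""
    return (counter + n) & ((1 << bits) - 1)


def count_image(chunk_size, num_chunks, bits):
    """Accumulate the size of an image written in equal chunks."""
    total = 0
    for _ in range(num_chunks):
        total = add_bytes(total, chunk_size, bits)
    return total


MIB = 1024 * 1024
CHUNKS_IN_5_GIB = 5 * 1024  # a 5 GiB image written in 1 MiB chunks

narrow = count_image(MIB, CHUNKS_IN_5_GIB, bits=32)  # wraps past 4 GiB
wide = count_image(MIB, CHUNKS_IN_5_GIB, bits=64)    # holds the full size
print(narrow, wide)
```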
[20 Aug 2009 15:10] Rafal Somla
Approved pending fix of a comment.
[21 Aug 2009 7:41] Jørgen Løland
Patch approved pending minor changes as described in email.
[21 Aug 2009 15:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81327

2862 Chuck Bell	2009-08-21
      BUG#41898 : Totalbytes in backup history log has same size for compressed/uncompressed data
      
      This bug and BUG#39780, which is a duplicate, report the problem that the
      'total_bytes' column of the backup_history log reports an incorrect and
      unrealistic value.
      
      This patch corrects the problem by making the 'total_bytes' column store 
      the number of bytes read or written. This means this column is now the
      same as the backup image file size.
     @ mysql-test/suite/backup/include/check_filesize.inc
        New include for comparing total_bytes with file size.
     @ mysql-test/suite/backup/r/backup_compression.result
        New result file with comment removed and correct result displayed.
     @ mysql-test/suite/backup/r/backup_intr_errors.result
        Corrected result file -- new location of close method in kernel.cc
        caused a different order for the error messages.
     @ mysql-test/suite/backup/r/backup_log_filesize.result
        New result file.
     @ mysql-test/suite/backup/r/backup_logs.result
        Corrected result file now that the bug is fixed.
     @ mysql-test/suite/backup/t/backup_compression.test
        Removed comment about bug report.
     @ mysql-test/suite/backup/t/backup_log_filesize.test
        New test to demonstrate the total_bytes column of the backup_history
        log corresponds to the backup image file size.
     @ mysql-test/suite/backup_engines/include/backup_restore_interrupt.inc
        Changed masking of error messages to mask out 'Error on delete' comment
        in the log.
     @ sql/backup/image_info.h
        Changed data type to allow for large file sizes.
     @ sql/backup/kernel.cc
        Moved report_stats_post to close method because the stream close
        method writes bytes for backup and these must be counted for an
        accurate byte count.
     @ sql/backup/logger.h
        Added method to abstract writing of backup history.
     @ sql/backup/stream.cc
        Added code to total bytes read and written.
     @ sql/backup/stream.h
        Changed data type to allow for large file sizes.
[21 Aug 2009 15:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81328

2863 Chuck Bell	2009-08-21 [merge]
      Local merge before push of BUG#41898
[25 Oct 2009 13:38] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091025133616-ca4inerav4vpdnaz) (version source revid:ingo.struewing@sun.com-20090908195642-dtq0vxjcjk6e11w4) (merge vers: 5.4.4-alpha) (pib:13)
[27 Oct 2009 23:26] Paul DuBois
Clarified the total_bytes meaning in the MySQL Backup documentation.