Bug #37980 Incorrect backup file size in online backup table.
Submitted: 9 Jul 2008 3:52 Modified: 13 Aug 2009 14:53
Reporter: Hema Sridharan Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Backup Severity:S3 (Non-critical)
Version:mysql-6.0-backup OS:Linux
Assigned to: Rafal Somla CPU Architecture:Any

[9 Jul 2008 3:52] Hema Sridharan
Description:
1) Create database and tables.
2) Perform backup and check the size of backup file.
This size varies from the size that is shown in online backup table.
 

How to repeat:
 The following table shows the actual backup file size against the one that is reported in the online backup_table. I have performed some iterations with different backup sizes with compression as well as without compression.
 
    Before compression  After Compression       Online backup_table
 
1)     846MB                 54MB                730478924bytes ~ 696MB
2)     158MB                 10MB                136023748bytes ~ 130MB 
3)     1.1GB                 69MB                932102360bytes ~ 888MB  

The online backup table shows the same size for both compressed data and  uncompressed data.   

mysql>select * from mysql.online_backup\G
*************************** 3. row ***************************
          backup_id: 3
         process_id: 0
         binlog_pos: 0
        binlog_file: NULL
       backup_state: complete
          operation: backup
          error_num: 0
        num_objects: 1
        total_bytes: 730478924
validity_point_time: 2008-07-08 15:21:13
         start_time: 2008-07-08 15:21:12
          stop_time: 2008-07-08 15:24:49
host_or_server_name: localhost
           username: root
        backup_file: /export/home/tmp/backup_dmp/dbb1.bak.gz
       user_comment:
            command: backup database dbb to '/export/home/tmp/backup_dmp/dbb1.bak.gz' 

with compression
            engines: Snapshot
*************************** 4. row ***************************
          backup_id: 4
         process_id: 0
         binlog_pos: 0
        binlog_file: NULL
       backup_state: complete
          operation: backup
          error_num: 0
        num_objects: 1
        total_bytes: 730478924
validity_point_time: 2008-07-08 17:27:49
         start_time: 2008-07-08 17:27:49
          stop_time: 2008-07-08 17:31:05
host_or_server_name: localhost
           username: root
        backup_file: /export/home/tmp/backup_dmp/dbb1_wocompr
       user_comment:
            command: backup database dbb to '/export/home/tmp/backup_dmp/dbb1_wocompr'
            engines: Snapshot
[11 Jul 2008 13:32] Susanne Ebrecht
Many thanks for reporting a bug but this is not a bug:

create database bug37980 character set utf8;
use bug37980;
create table t(id serial, t text character set utf8);
delimiter #
create procedure fill_t() 
begin 
declare i integer default 0; 
while i<100 do 
insert into t(t)
select(replace(make_set(1024*1024*1024*4*rand(),'4','b','c','a','s','f','g','h','i','j','p
','l','1','6','o','k','q','r','e','t','u','v','8','y','z','0','m','2','3','d','5','n','7',
'x','9','.',',','>','<'),',','')) as word;
set i = i + 1; 
end while; 
end#
delimiter ;
call fill_t;

backup database bug37980 to 't1';

backup database bug37980 to 't2' with compression compression_algorithm=gzip;

$ file t1
t1: data

$ file t2
t2: gzip compressed data, from Unix

$ ls -lh
-rw-rw---- 1 myuser mygroup 8,2K 2008-07-11 15:20 t1
-rw-rw---- 1 myuser mygroup 2,8K 2008-07-11 15:21 t2

as you can see the file is compressed.

total_bytes does not mean the size of the file it means how many bytes will be backuped.

Means: your database has x bytes and these x bytes will be backuped, doesn't matter how huge the file size is.

Look here:

bug37980$ ls -lh
-rw-rw---- 1 myuser mygroup 3328 2008-07-11 15:20 t.MYD
-rw-rw---- 1 myuser mygroup 4096 2008-07-11 15:20 t.MYI

The size of .MYI + .MYD = 7426

mysql> select * from mysql.online_backup\G
...
total_bytes: 7426
...

As you can see, it is the size of the database not of the backup files. The backup files have less bytes.
[15 Jul 2008 2:29] Hema Sridharan
The above explanation can hold true only with myisam storage engine (they store data in .MYI and .MYD format). But I can also take backup from Innodb, Falcon and other storage engines. 

Innodb and other engines manages table contents differently than the myisam engine (no MYI/MYD here). Myisam storage engine should not become a benchmark for all other storage engines and eventually there shouldn't be any discrepancy between the backed up file vs online backup file size.  Please see the example below,

CREATE DATABASE y;
USE y;
CREATE TABLE t1(id int)ENGINE=INNODB;
INSERT INTO t1 VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(0);
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;

INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;

BACKUP DATABASE y to 'y.bak';
BACKUP DATABASE y to 'y.bak.gz' WITH COMPRESSION;

-rw-rw----  1 ndbdev ndbdev  16K Jul 15 03:13 y.bak
-rw-rw----  1 ndbdev ndbdev  331 Jul 15 03:13 y.bak.gz

(From data directory)
y> ls -lrt
total 16
-rw-rw----  1 ndbdev ndbdev 8556 Jul 15 03:12 t1.frm
-rw-rw----  1 ndbdev ndbdev   65 Jul 15 03:12 db.opt

from online backup table;
***************************5. row ***************************
        total_bytes: 7680
        ......
        backup_file: /data1/backup_dmp/y.bak
       user_comment:
            command: backup database y to '/data1/backup_dmp/y.bak'
            engines: Snapshot
*************************** 6. row ***************************
          
        total_bytes: 7680
        ......
        backup_file: /data1/backup_dmp/y.bak.gz
       user_comment:
            command: backup database y to '/data1/backup_dmp/y.bak.gz' with compression
            engines: Snapshot

From the above 
backup file size: 16K  (uncompressed) 
backup file size: 331bytes (compressed)
from online backup table: 7680 bytes ~ 7.5K (for both compressed and uncompressed)
[15 Jul 2008 14:48] Hema Sridharan
I think that total_bytes (in backup table)should show the size of the resulting backup file but it is not showing it. Example are mentioned above.
[15 Jul 2008 15:18] Rafal Somla
I'm not sure if it is realistic to expect that total_bytes reports the size of the resulting backup file. For one thing, there are scenarios where backup is done to e.g., a named pipe and then we have no file whose size could be masured. Also, a file is an operating system concept and I don't think a database server should try to access or refer to it directly.

My suggestion would be that total_bytes shows the number of bytes processed by a BACKUP/RESTORE command, which can be different from the size of the resulting backup file.

I think we need more discussion about that and make a clear decision about what should be reported in total_bytes. Only then we can fully establish whether we have a bug or not and fix it if needed. Hence, setting the status to "To be fixed later".
[26 Aug 2008 15:12] Lars Thalmann
I think there are three alternatives for the meaning of the size:

1. Processed data inside the backup kernel
2. Backup Image file size *before* compression/encryption
3. Backup Image file size *after* compression/encryption

I think 3 is most useful and user friendly.
[13 Aug 2009 14:53] Chuck Bell
The original intent of this bug is a duplicate of a later bug BUG#41898. By solving BUG#41898, the solution supercedes the problems and theories/conclusions stated here.