Bug #37980 | Incorrect backup file size in online backup table. | ||
---|---|---|---|
Submitted: | 9 Jul 2008 3:52 | Modified: | 13 Aug 2009 14:53 |
Reporter: | Hema Sridharan | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Backup | Severity: | S3 (Non-critical) |
Version: | mysql-6.0-backup | OS: | Linux |
Assigned to: | Rafal Somla | CPU Architecture: | Any |
[9 Jul 2008 3:52]
Hema Sridharan
[11 Jul 2008 13:32]
Susanne Ebrecht
Many thanks for reporting a bug but this is not a bug: create database bug37980 character set utf8; use bug37980; create table t(id serial, t text character set utf8); delimiter # create procedure fill_t() begin declare i integer default 0; while i<100 do insert into t(t) select(replace(make_set(1024*1024*1024*4*rand(),'4','b','c','a','s','f','g','h','i','j','p ','l','1','6','o','k','q','r','e','t','u','v','8','y','z','0','m','2','3','d','5','n','7', 'x','9','.',',','>','<'),',','')) as word; set i = i + 1; end while; end# delimiter ; call fill_t; backup database bug37980 to 't1'; backup database bug37980 to 't2' with compression compression_algorithm=gzip; $ file t1 t1: data $ file t2 t2: gzip compressed data, from Unix $ ls -lh -rw-rw---- 1 myuser mygroup 8,2K 2008-07-11 15:20 t1 -rw-rw---- 1 myuser mygroup 2,8K 2008-07-11 15:21 t2 as you can see the file is compressed. total_bytes does not mean the size of the file it means how many bytes will be backuped. Means: your database has x bytes and these x bytes will be backuped, doesn't matter how huge the file size is. Look here: bug37980$ ls -lh -rw-rw---- 1 myuser mygroup 3328 2008-07-11 15:20 t.MYD -rw-rw---- 1 myuser mygroup 4096 2008-07-11 15:20 t.MYI The size of .MYI + .MYD = 7426 mysql> select * from mysql.online_backup\G ... total_bytes: 7426 ... As you can see, it is the size of the database not of the backup files. The backup files have less bytes.
[15 Jul 2008 2:29]
Hema Sridharan
The above explanation can hold true only with myisam storage engine (they store data in .MYI and .MYD format). But I can also take backup from Innodb, Falcon and other storage engines. Innodb and other engines manages table contents differently than the myisam engine (no MYI/MYD here). Myisam storage engine should not become a benchmark for all other storage engines and eventually there shouldn't be any discrepancy between the backed up file vs online backup file size. Please see the example below, CREATE DATABASE y; USE y; CREATE TABLE t1(id int)ENGINE=INNODB; INSERT INTO t1 VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(0); INSERT INTO t1 SELECT * FROM t1; INSERT INTO t1 SELECT * FROM t1; INSERT INTO t1 SELECT * FROM t1; INSERT INTO t1 SELECT * FROM t1; INSERT INTO t1 SELECT * FROM t1; INSERT INTO t1 SELECT * FROM t1; INSERT INTO t1 SELECT * FROM t1; BACKUP DATABASE y to 'y.bak'; BACKUP DATABASE y to 'y.bak.gz' WITH COMPRESSION; -rw-rw---- 1 ndbdev ndbdev 16K Jul 15 03:13 y.bak -rw-rw---- 1 ndbdev ndbdev 331 Jul 15 03:13 y.bak.gz (From data directory) y> ls -lrt total 16 -rw-rw---- 1 ndbdev ndbdev 8556 Jul 15 03:12 t1.frm -rw-rw---- 1 ndbdev ndbdev 65 Jul 15 03:12 db.opt from online backup table; ***************************5. row *************************** total_bytes: 7680 ...... backup_file: /data1/backup_dmp/y.bak user_comment: command: backup database y to '/data1/backup_dmp/y.bak' engines: Snapshot *************************** 6. row *************************** total_bytes: 7680 ...... backup_file: /data1/backup_dmp/y.bak.gz user_comment: command: backup database y to '/data1/backup_dmp/y.bak.gz' with compression engines: Snapshot From the above backup file size: 16K (uncompressed) backup file size: 331bytes (compressed) from online backup table: 7680 bytes ~ 7.5K (for both compressed and uncompressed)
[15 Jul 2008 14:48]
Hema Sridharan
I think that total_bytes (in backup table)should show the size of the resulting backup file but it is not showing it. Example are mentioned above.
[15 Jul 2008 15:18]
Rafal Somla
I'm not sure if it is realistic to expect that total_bytes reports the size of the resulting backup file. For one thing, there are scenarios where backup is done to e.g., a named pipe and then we have no file whose size could be masured. Also, a file is an operating system concept and I don't think a database server should try to access or refer to it directly. My suggestion would be that total_bytes shows the number of bytes processed by a BACKUP/RESTORE command, which can be different from the size of the resulting backup file. I think we need more discussion about that and make a clear decision about what should be reported in total_bytes. Only then we can fully establish whether we have a bug or not and fix it if needed. Hence, setting the status to "To be fixed later".
[26 Aug 2008 15:12]
Lars Thalmann
I think there are three alternatives for the meaning of the size: 1. Processed data inside the backup kernel 2. Backup Image file size *before* compression/encryption 3. Backup Image file size *after* compression/encryption I think 3 is most useful and user friendly.