Bug #79529 The page compression did not changed size of resulting .idb files
Submitted: 4 Dec 2015 17:17 Modified: 25 Nov 2016 18:27
Reporter: Marcin Danilewicz Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.7.9, 5.7.10 OS:Fedora (FC16_64, Kernel 3.1.0-7.fc)
Assigned to: CPU Architecture:Any

[4 Dec 2015 17:17] Marcin Danilewicz
Description:
The page compression (Transparent Page Compression) did not changed size of resulting database files, using MySQL 5.7.9 (GA) build from sources using "bundled" zlib. Database files were using ext4 Linux partition. File sizes and allocation sizes were tested using: 
- information_schema details,
- with "ls" and "du" command-line tools

This is just short example, but showing real results:
# ls -hls ../mysql/mysql/data/test/
633M -rw-r----- 1 mysql mysql 632M … uncompressed_data.ibd
633M -rw-r----- 1 mysql mysql 632M …  compressed_data.ibd <-- compression level set to 9 and using zlib

To be precise, not more than 4% difference sometimes was observed.
 
Page compression is perfectly capable of doing that, because:
- profiling showed CPU high usage caused by compression,
- amount of data sent to storage drives is significantly lower for compressed tables,
- binary compressed data are actually to be found inside .ibd files.

Note: However LZ4 compression was not been tested at all, but by looking at the MySQL code I don't expect any difference.

How to repeat:
1. download http://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-5.7.9.tar.gz
2. Build and install MySQL using generic instructions.
4. Create data table(s) with page compression enabled.

Following were used to create tables:
CREATE TABLE `compressed_data` (
  `cat_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cat_title` varbinary(255) NOT NULL DEFAULT '',
  `cat_pages` int(11) NOT NULL DEFAULT '0',
  `cat_subcats` int(11) NOT NULL DEFAULT '0',
  `cat_files` int(11) NOT NULL DEFAULT '0',
  `cat_comment` char(120) not null default '',
  PRIMARY KEY (`cat_id`),
  UNIQUE KEY `cat_title` (`cat_title`),
  KEY `cat_pages` (`cat_pages`)
) ENGINE=InnoDB COMPRESSION='zlib'; 

CREATE TABLE `uncompressed_data` (
  `cat_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cat_title` varbinary(255) NOT NULL DEFAULT '',
  `cat_pages` int(11) NOT NULL DEFAULT '0',
  `cat_subcats` int(11) NOT NULL DEFAULT '0',
  `cat_files` int(11) NOT NULL DEFAULT '0',
  `cat_comment` char(120) not null default '',
  PRIMARY KEY (`cat_id`),
  UNIQUE KEY `cat_title` (`cat_title`),
  KEY `cat_pages` (`cat_pages`)
) ENGINE=InnoDB; 

Then filled enough with data as here:
INSERT INTO <destination_table_name> SELECT * FROM <reference_source_table>;

5. No difference between space assigned and space occupied (difference less than 4% sometimes was observed). 

Suggested fix:
I've added more debug code and looking at the debug data, profiling and disk IO, it turned out that disabling hole punch were the root cause of all that. Following code in os0file.cc is a bit odd and purpose is a bit unknown to me. That is why I've finally removed it completely for testing.

		if (compressed_len >= old_compressed_len) {

			ut_ad(old_compressed_len <= UNIV_PAGE_SIZE);

			type.clear_punch_hole();
		} 

Results? Far much better and to me issue is fixed, data has not been corrupted .. but obviously I have not tested everything.

# ls -hls ../mysql/mysql/data/test/
633M -rw-r----- 1 mysql mysql 632M … uncompressed_data.ibd
222M -rw-r----- 1 mysql mysql 632M …  compressed_data.ibd

Applying prove of concept fix:
1. download http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.7.9.tar.gz
2. Remove code part as above from os0file.cc file.
3. Build and install MySQL using generic instructions.
4. Create data table(s) also with page compression.
5. This time difference between space assigned and space occupied (sparse file) should show.
[4 Dec 2015 18:40] MySQL Verification Team
When you create TABLE `compressed_data` is there a warning?. If yes please printe the output of SHOW WARNINGS?. Thanks.
[7 Dec 2015 9:49] Marcin Danilewicz
Hi Miguel,

no warning at all, with such query or MySQL error log.

/Marcin
[7 Dec 2015 23:21] MySQL Verification Team
Please check for duplicate: https://bugs.mysql.com/bug.php?id=77974.
[8 Dec 2015 15:42] Marcin Danilewicz
I disagree. 

MySQL log shows in my case :

"PUNCH HOLE support available" 

In addition server starts properly and initializes sparse files and punch holes with flags properly. Something else goes wrong later, I suppose. 

Here are my comments to https://bugs.mysql.com/bug.php?id=77974

> From innodb.cmake  file ,  HAVE_FALLOC_PUNCH_HOLE_AND_KEEP_SIZE is not set if the flag of FALLOC_FL_PUNCH_HOLE doesn't exists. 

Obviously in my case was set, otherwise it would not print "PUNCH HOLE support available". Include file code falloc.h has not much changed changed over time and you cannot tell the difference between these files from kernel 2.6.32 and 3.1.0-7. For 77974 case something else has to be wrong not falloc.h. I both cases you can find FALLOC_FL_PUNCH_HOLE and FALLOC_FL_KEEP_SIZE definitions there. 

>I am not sure if this is a common case, because i am not familiar with Linux kernel. So i filed this bug... :(
>kernel version = 2.6.32-358 el6.x86_64, 

My understanding of punch hole using sparse files is that it became available with kernel above 3.0, for ext4 partitions. This is my case, unknown file-system and older kernel version is not. By this issue 77974 doesn't seem to be a common case.

.. ah and I was using SSD drive if that makes another difference. 

Cheers,

/Marcin
[4 Feb 2016 12:47] MySQL Verification Team
Thank you for the feedback.
Observed this with 5.7.10 on OL7.

Thanks,
Umesh
[4 Feb 2016 12:47] MySQL Verification Team
// 5.7.10

bin/mysql_install_db --insecure --basedir=/export/umesh/server/source/bugs/79529/mysql-5.7.10 --datadir=/export/umesh/server/source/bugs/79529/mysql-5.7.10/80263 -v
bin/mysqld --basedir=/export/umesh/server/source/bugs/79529/mysql-5.7.10 --datadir=/export/umesh/server/source/bugs/79529/mysql-5.7.10/80263 --core-file --socket=/tmp/mysql_ushastry.sock  --port=15000 --log-error=/export/umesh/server/source/bugs/79529/mysql-5.7.10/80263/log.err 2>&1 &

[umshastr@hod03]/export/umesh/server/source/bugs/79529/mysql-5.7.10: cat /export/umesh/server/source/bugs/79529/mysql-5.7.10/80263/log.err|grep PUNCH
2016-02-04T12:33:07.543877Z 0 [Note] InnoDB: PUNCH HOLE support available

create database if not exists test;
use test;
drop table if exists t1;
CREATE TABLE `t1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name1` char(50),
  `name2` char(50),
  `name3` char(50),
  `name4` char(50),
  PRIMARY KEY (`id`)
) ENGINE=innodb;

set @id:=0;

 insert into `t1` values 
  (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
,  (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
, (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
, (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000));

 insert into `t1`(`id`,`name1`,`name2`,`name3`,`name4`)
   select  @id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000)  from
    `t1` k1, `t1` k2, `t1` k3, `t1` k4,`t1` k5,`t1` k6, `t1` k7, `t1` k8, `t1` k9,
`t1` k0,`t1` ka, `t1` kb, `t1` kc, `t1` kd limit  5427738;  

create database if not exists test;
use test;
drop table if exists t2;
CREATE TABLE `t2` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name1` char(50),
  `name2` char(50),
  `name3` char(50),
  `name4` char(50),
  PRIMARY KEY (`id`)
) ENGINE=innodb COMPRESSION='zlib';

set @id:=0;

 insert into `t2` values 
  (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
,  (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
, (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
, (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000));

 insert into `t2`(`id`,`name1`,`name2`,`name3`,`name4`)
   select  @id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000)  from
    `t2` k1, `t2` k2, `t2` k3, `t2` k4,`t2` k5,`t2` k6, `t2` k7, `t2` k8, `t2` k9,
`t2` k0,`t2` ka, `t2` kb, `t2` kc, `t2` kd limit  5427738;  

##

mysql> drop table if exists t1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE `t1` (
    ->   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    ->   `name1` char(50),
    ->   `name2` char(50),
    ->   `name3` char(50),
    ->   `name4` char(50),
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=innodb;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> set @id:=0;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>  insert into `t1` values
    ->   (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
    -> ,  (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
    -> , (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
    -> , (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000));

Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql>
mysql>  insert into `t1`(`id`,`name1`,`name2`,`name3`,`name4`)
    ->    select  @id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000)  from
    ->     `t1` k1, `t1` k2, `t1` k3, `t1` k4,`t1` k5,`t1` k6, `t1` k7, `t1` k8, `t1` k9,
    -> `t1` k0,`t1` ka, `t1` kb, `t1` kc, `t1` kd limit  5427738;
Query OK, 5427738 rows affected (1 min 43.01 sec)
Records: 5427738  Duplicates: 0  Warnings: 0

mysql> optimize table t1;
+---------+----------+----------+-------------------------------------------------------------------+
| Table   | Op       | Msg_type | Msg_text                                                          |
+---------+----------+----------+-------------------------------------------------------------------+
| test.t1 | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.t1 | optimize | status   | OK                                                                |
+---------+----------+----------+-------------------------------------------------------------------+
2 rows in set (9.89 sec)

[umshastr@hod03]/export/umesh/server/source/bugs/79529/mysql-5.7.10: ls -lh 80263/test/t1.ibd
-rw-r----- 1 umshastr common 1.5G Feb  4 13:37 80263/test/t1.ibd

--

mysql> drop table if exists t2;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE `t2` (
    ->   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    ->   `name1` char(50),
    ->   `name2` char(50),
    ->   `name3` char(50),
    ->   `name4` char(50),
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=innodb COMPRESSION='zlib';
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> set @id:=0;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>  insert into `t2` values
    ->   (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
    -> ,  (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
    -> , (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000))
    -> , (@id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000));
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql>
mysql>  insert into `t2`(`id`,`name1`,`name2`,`name3`,`name4`)
    ->    select  @id:=@id+1,md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000),md5(rand()*1000000)  from
    ->     `t2` k1, `t2` k2, `t2` k3, `t2` k4,`t2` k5,`t2` k6, `t2` k7, `t2` k8, `t2` k9,
    -> `t2` k0,`t2` ka, `t2` kb, `t2` kc, `t2` kd limit  5427738;
Query OK, 5427738 rows affected (1 min 49.59 sec)
Records: 5427738  Duplicates: 0  Warnings: 0

mysql> optimize table t2;
+---------+----------+----------+-------------------------------------------------------------------+
| Table   | Op       | Msg_type | Msg_text                                                          |
+---------+----------+----------+-------------------------------------------------------------------+
| test.t2 | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.t2 | optimize | status   | OK                                                                |
+---------+----------+----------+-------------------------------------------------------------------+
2 rows in set (51.24 sec)

[umshastr@hod03]/export/umesh/server/source/bugs/79529/mysql-5.7.10: ls -lh 80263/test/t2.ibd
-rw-r----- 1 umshastr common 1.5G Feb  4 13:43 80263/test/t2.ibd

--

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM  INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/t1'\G
*************************** 1. row ***************************
         SPACE: 24
          NAME: test/t1
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 1568669696
ALLOCATED_SIZE: 1568673792
1 row in set (0.00 sec)

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM  INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/t2'\G
*************************** 1. row ***************************
         SPACE: 26
          NAME: test/t2
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 1568669696
ALLOCATED_SIZE: 993042432
1 row in set (0.00 sec)
[27 Sep 2016 4:47] MySQL Verification Team
--

##
mysql>
mysql>  insert into `compressed_data` values
    ->   (@id:=@id+1,REPEAT('A',255),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,REPEAT('A',255),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,REPEAT('A',255),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,REPEAT('A',255),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120));
ERROR 1062 (23000): Duplicate entry 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' for key 'cat_title'
mysql> set @id:=0;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>  insert into `compressed_data` values
    ->   (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120));
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql>
mysql>  insert into `compressed_data`(`id`,`name1`,`name2`,`name3`,`name4`)
    ->    select  @id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120)  from
    ->     `compressed_data` k1, `compressed_data` k2, `compressed_data` k3, `compressed_data` k4,`compressed_data` k5,`compressed_data` k6, `compressed_data` k7, `compressed_data` k8, `compressed_data` k9,
    -> `compressed_data` k0,`compressed_data` ka, `compressed_data` kb, `compressed_data` kc, `compressed_data` kd limit  5427738;
ERROR 1136 (21S01): Column count doesn't match value count at row 1
mysql>
mysql>  insert into `compressed_data`
    ->    select  @id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120)  from
    ->     `compressed_data` k1, `compressed_data` k2, `compressed_data` k3, `compressed_data` k4,`compressed_data` k5,`compressed_data` k6, `compressed_data` k7, `compressed_data` k8, `compressed_data` k9,
    -> `compressed_data` k0,`compressed_data` ka, `compressed_data` kb, `compressed_data` kc, `compressed_data` kd limit  5427738;
Query OK, 5427738 rows affected (36 min 23.87 sec)
Records: 5427738  Duplicates: 0  Warnings: 0

mysql> optimize table compressed_data;
+----------------------+----------+----------+-------------------------------------------------------------------+
| Table                | Op       | Msg_type | Msg_text                                                          |
+----------------------+----------+----------+-------------------------------------------------------------------+
| test.compressed_data | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.compressed_data | optimize | status   | OK                                                                |
+----------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (12 min 47.97 sec)

###
mysql>
mysql> set @id:=0;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>  insert into `uncompressed_data` values
    ->   (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120))
    -> , (@id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120));
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql>  insert into `uncompressed_data`
    ->    select  @id:=@id+1,CONCAT(REPEAT('A',240),@id:=@id+1),@id:=@id+1,@id:=@id+1,@id:=@id+1,REPEAT('a',120)  from
    ->     `uncompressed_data` k1, `uncompressed_data` k2, `uncompressed_data` k3, `uncompressed_data` k4,`uncompressed_data` k5,`uncompressed_data` k6, `uncompressed_data` k7, `uncompressed_data` k8, `uncompressed_data` k9,
    -> `uncompressed_data` k0,`uncompressed_data` ka, `uncompressed_data` kb, `uncompressed_data` kc, `uncompressed_data` kd limit  5427738;

Query OK, 5427738 rows affected (34 min 22.49 sec)
Records: 5427738  Duplicates: 0  Warnings: 0

mysql>
mysql> optimize table `uncompressed_data`;
+------------------------+----------+----------+-------------------------------------------------------------------+
| Table                  | Op       | Msg_type | Msg_text                                                          |
+------------------------+----------+----------+-------------------------------------------------------------------+
| test.uncompressed_data | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.uncompressed_data | optimize | status   | OK                                                                |
+------------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (9 min 43.47 sec)

mysql>

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM  INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/compressed_data'\G
*************************** 1. row ***************************
         SPACE: 26
          NAME: test/compressed_data
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 4563402752
ALLOCATED_SIZE: 2029674496
1 row in set (0.00 sec)

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM  INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/uncompressed_data'\G
*************************** 1. row ***************************
         SPACE: 28
          NAME: test/uncompressed_data
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 4563402752
ALLOCATED_SIZE: 4563406848
1 row in set (0.00 sec)

##

Every 2.0s: ls -lh 83141/test/                                                                                                                                            Mon Sep 26 16:12:40 2016

total 6.2G
-rw-r----- 1 umshastr common 8.6K Sep 26 15:14 compressed_data.frm
-rw-r----- 1 umshastr common 4.3G Sep 26 15:26 compressed_data.ibd
-rw-r----- 1 umshastr common   65 Sep 26 14:26 db.opt
-rw-r----- 1 umshastr common 8.6K Sep 26 16:02 uncompressed_data.frm
-rw-r----- 1 umshastr common 4.3G Sep 26 16:12 uncompressed_data.ibd
[27 Sep 2016 5:18] Aakanksha Verma
Can reporter please share the results of information_schema allocated size
statistics with and without compression?because when i tried to run the same
test scenario that Umesh has given in the report, the compression based on
the allocated size statistics for a compressed table using zlib algorithm is
around 48.9%.
The above result I got on was ext4 FS on ubuntu mysql-5.7.9 (Having punch
hole support)

aakanksha@aakanksha:~/work/bug_24412088/mysql/install$ ls -lh ./data/test/
total 2.3G
-rw-r----- 1 aakanksha aakanksha   65 Sep 16 10:10 db.opt
-rw-r----- 1 aakanksha aakanksha 8.5K Sep 16 10:36 t1.frm
-rw-r----- 1 aakanksha aakanksha 1.5G Sep 16 10:40 t1.ibd
-rw-r----- 1 aakanksha aakanksha 8.5K Sep 16 10:55 t2.frm
-rw-r----- 1 aakanksha aakanksha 1.5G Sep 16 11:00 t2.ibd

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM
INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/t1'\G
*************************** 1. row ***************************
         SPACE: 24
          NAME: test/t1
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 1568669696
ALLOCATED_SIZE: 1568673792
1 row in set (0.01 sec)

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM
INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/t2'\G
*************************** 1. row ***************************
         SPACE: 26
          NAME: test/t2
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 1568669696
ALLOCATED_SIZE: 895565824
1 row in set (0.00 sec)

%compression: 49%

So when Reporter says even after applying compression algorithm the
compression is less than 4%, Can the results be shared and the exact steps to
reproduce it.

Thanks,
Aakanksha
[27 Sep 2016 5:18] Aakanksha Verma
Can reporter please share the results of information_schema allocated size
statistics with and without compression?because when i tried to run the same
test scenario that Umesh has given in the report, the compression based on
the allocated size statistics for a compressed table using zlib algorithm is
around 48.9%.
The above result I got on was ext4 FS on ubuntu mysql-5.7.9 (Having punch
hole support)

aakanksha@aakanksha:~/work/bug_24412088/mysql/install$ ls -lh ./data/test/
total 2.3G
-rw-r----- 1 aakanksha aakanksha   65 Sep 16 10:10 db.opt
-rw-r----- 1 aakanksha aakanksha 8.5K Sep 16 10:36 t1.frm
-rw-r----- 1 aakanksha aakanksha 1.5G Sep 16 10:40 t1.ibd
-rw-r----- 1 aakanksha aakanksha 8.5K Sep 16 10:55 t2.frm
-rw-r----- 1 aakanksha aakanksha 1.5G Sep 16 11:00 t2.ibd

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM
INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/t1'\G
*************************** 1. row ***************************
         SPACE: 24
          NAME: test/t1
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 1568669696
ALLOCATED_SIZE: 1568673792
1 row in set (0.01 sec)

mysql> SELECT SPACE, NAME, FS_BLOCK_SIZE, FILE_SIZE, ALLOCATED_SIZE FROM
INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE NAME='test/t2'\G
*************************** 1. row ***************************
         SPACE: 26
          NAME: test/t2
 FS_BLOCK_SIZE: 4096
     FILE_SIZE: 1568669696
ALLOCATED_SIZE: 895565824
1 row in set (0.00 sec)

%compression: 49%

So when Reporter says even after applying compression algorithm the
compression is less than 4%, Can the results be shared and the exact steps to
reproduce it.

Thanks,
Aakanksha
[30 Sep 2016 5:27] Aakanksha Verma
Documentation Notes
===================

The way we were verifying the file size was incorrect (ls -lh ) since when we punch a hole in a file which is called a sparse file .(Assuming OS has punch hole enabled) , the actual size would not be printed using command ls -lh file.
Thats why we couldn't witness the difference in file size.

So for such cases
 https://en.wikipedia.org/wiki/Sparse_file#Sparse_files_in_Unix
1) The -s option of the ls command shows the occupied space in blocks.

 ls -ls sparse-file

2) Alternatively, the du command prints the occupied space, while ls prints the apparent size. In some non-standard versions of du, the option --block-size=1 prints the occupied space in bytes instead of blocks, so that it can be compared to the ls output:

 du --block-size=1 sparse-file
 ls -l sparse-file

Using the above command i verified the same for tests done above:
mysql> show create table t1;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                   |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name1` char(50) DEFAULT NULL,
  `name2` char(50) DEFAULT NULL,
  `name3` char(50) DEFAULT NULL,
  `name4` char(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5427743 DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

aakanksha@aakanksha:~/work/bug_24412088/mysql/install$ du --block-size=1 ./data/test/t1.ibd 
1375735808	./data/test/t1.ibd

mysql> show create table t2;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                      |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t2    | CREATE TABLE `t2` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name1` char(50) DEFAULT NULL,
  `name2` char(50) DEFAULT NULL,
  `name3` char(50) DEFAULT NULL,
  `name4` char(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5427747 DEFAULT CHARSET=latin1 COMPRESSION='zlib' |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

aakanksha@aakanksha:~/work/bug_24412088/mysql/install$ du --block-size=1 ./data/test/t2.ibd 
702087168	./data/test/t2.ibd

Now we can clearly observe a compression for the ibd files to be around 48.9%

Side Note :- -h to du would give output in M/G etc

Would request the documentation team to kindly update the same for IO page compression documentation and also the reporter to kindly verify it for his own test case too.

Thanks
Aakanksha
[30 Sep 2016 5:28] Aakanksha Verma
Posted by developer:
 
Documentation Notes
===================

The way we were verifying the file size was incorrect (ls -lh ) since when we punch a hole in a file which is called a sparse file .(Assuming OS has punch hole enabled) , the actual size would not be printed using command ls -lh file.
Thats why we couldn't witness the difference in file size.

So for such cases
 https://en.wikipedia.org/wiki/Sparse_file#Sparse_files_in_Unix
1) The -s option of the ls command shows the occupied space in blocks.

 ls -ls sparse-file

2) Alternatively, the du command prints the occupied space, while ls prints the apparent size. In some non-standard versions of du, the option --block-size=1 prints the occupied space in bytes instead of blocks, so that it can be compared to the ls output:

 du --block-size=1 sparse-file
 ls -l sparse-file

Using the above command i verified the same for tests done above:
mysql> show create table t1;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                   |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name1` char(50) DEFAULT NULL,
  `name2` char(50) DEFAULT NULL,
  `name3` char(50) DEFAULT NULL,
  `name4` char(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5427743 DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

aakanksha@aakanksha:~/work/bug_24412088/mysql/install$ du --block-size=1 ./data/test/t1.ibd 
1375735808	./data/test/t1.ibd

mysql> show create table t2;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                      |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t2    | CREATE TABLE `t2` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name1` char(50) DEFAULT NULL,
  `name2` char(50) DEFAULT NULL,
  `name3` char(50) DEFAULT NULL,
  `name4` char(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5427747 DEFAULT CHARSET=latin1 COMPRESSION='zlib' |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

aakanksha@aakanksha:~/work/bug_24412088/mysql/install$ du --block-size=1 ./data/test/t2.ibd 
702087168	./data/test/t2.ibd

Now we can clearly observe a compression for the ibd files to be around 48.9%

Side Note :- -h to du would give output in M/G etc

Would request the documentation team to kindly update the same for IO page compression documentation and also the reporter to kindly verify it for his own test case too.

Thanks
Aakanksha
[28 Oct 2016 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[25 Nov 2016 18:27] Daniel Price
Posted by developer:
 
The following ifnormation has been added to the InnoDB Page Compression documentation:

https://dev.mysql.com/doc/refman/5.7/en/innodb-page-compression.html

"On Unix-like systems, ls -l tablespace_name.ibd shows the apparent file
size (equivalent to FILE_SIZE) in bytes. To view the actual amount of
space allocated on disk (equivalent to ALLOCATED_SIZE), use du
--block-size=1 tablespace_name.ibd. The --block-size=1 option prints the
allocated space in bytes instead of blocks, so that it can be compared to
ls -l output." 

The change should appear online soon.

Thank you for the bug report.