Bug #48396 Compressed Binlog
Submitted: 29 Oct 2009 0:27
Reporter: Mikiya Okuno
Status: Verified
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.1
OS: Any
Assigned to: Luis Soares
CPU Architecture: Any

[29 Oct 2009 0:27] Mikiya Okuno
Description:
Many people use binlogs for backups as well as for replication. For that purpose, automatic binlog compression would be very handy, since it would reduce the disk space consumed.

How to repeat:
n/a

Suggested fix:
- Compress a binlog when it is rotated (server restart, FLUSH LOGS, when it exceeds max_binlog_size, etc.).
- An old binlog is safe to compress unless it is being read by other threads; in that case, block those threads with a mutex.
- Allow the binlog subsystem to manage .gz files by keeping filenames with the .gz suffix in the binlog index file.
- Use zlib to read compressed binlog files.
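
The suggested steps can be sketched outside the server as follows. This is only an illustration, not server code: the function names `compress_rotated_binlog` and `open_binlog` are hypothetical, Python's gzip module (built on zlib) stands in for direct zlib calls, and the mutex that would block concurrent readers is left out.

```python
import gzip
import os
import shutil


def compress_rotated_binlog(index_path, binlog_name):
    """Gzip one rotated binlog and record the .gz name in the index file.

    Hypothetical helper: assumes no other thread is reading the binlog.
    """
    directory = os.path.dirname(os.path.abspath(index_path))
    src = os.path.join(directory, binlog_name)
    dst = src + ".gz"

    # Stream through gzip so a large binlog is never loaded into memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)

    # Rewrite the index so the matching entry carries the .gz suffix.
    with open(index_path) as f:
        entries = [line.rstrip("\n") for line in f]
    entries = [
        e + ".gz" if os.path.basename(e) == binlog_name else e
        for e in entries
    ]
    tmp = index_path + ".tmp"
    with open(tmp, "w") as f:
        f.write("\n".join(entries) + "\n")
    os.replace(tmp, index_path)  # swap the new index into place
    os.remove(src)
    return dst


def open_binlog(path):
    """Open a binlog for reading, handling the .gz suffix transparently."""
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")
```

A reader that goes through `open_binlog` does not need to know whether a given file listed in the index has been compressed yet.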
[29 Oct 2009 8:54] Peter Laursen
Would you use .gz/zlib on Windows too? 

I am not sure about platforms other than Windows and Linux, but I do not think .gz is the obvious compression format on all platforms.
[29 Oct 2009 17:28] Luis Soares
Seems to be a duplicate of BUG#46435.
[4 Aug 2015 7:59] zhou choury
I'm a DBA at Tencent Inc. We came up with the idea of compressing
the binlog as it is generated, and we have already implemented it.

The solution is as follows:
We added new event types for the compressed versions of existing events:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_ROWS_COMPRESSED_EVENT.
These events inherit from their uncompressed counterparts. One constructor and the write function
of each are overridden to handle decompression and compression; everything else is identical.

On the slave, the IO thread decompresses these events and converts them to their uncompressed form
when it receives them from the master, so the SQL and worker threads can stay unchanged.
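
A minimal sketch of that conversion step, under stated assumptions: events are modeled as a (type, payload) pair rather than real binlog event structures, zlib is assumed as the codec, and `io_thread_convert` and the `PLAIN_TYPE` mapping are hypothetical names, not taken from the patch.

```python
import zlib

# Hypothetical mapping from each compressed event type to its plain form.
PLAIN_TYPE = {
    "QUERY_COMPRESSED_EVENT": "QUERY_EVENT",
    "WRITE_ROWS_COMPRESSED_EVENT": "WRITE_ROWS_EVENT",
    "UPDATE_ROWS_COMPRESSED_EVENT": "UPDATE_ROWS_EVENT",
    "DELETE_ROWS_COMPRESSED_EVENT": "DELETE_ROWS_EVENT",
}


def io_thread_convert(event_type, payload):
    """Return the event in uncompressed form, as the IO thread would relay it."""
    plain = PLAIN_TYPE.get(event_type)
    if plain is None:
        # Not a compressed event: pass it through untouched.
        return event_type, payload
    # Compressed event: restore the body and relabel with the plain type.
    return plain, zlib.decompress(payload)
```

Because the conversion happens before the event reaches the relay log, everything downstream only ever sees the plain event types.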

We also added two options for this feature: "log_bin_compress" and "log_bin_compress_min_len". The
former switches binlog compression on or off, and the latter is the minimum length of an
SQL statement (in statement mode) or a record (in row mode) that will be compressed. The logic can be
summarized by the following pseudocode:

	if binlog_format == statement {
		if log_bin_compress == true and query_len >= log_bin_compress_min_len
			create a Query_compressed_log_event;
		else
			create a Query_log_event;
	}
	if binlog_format == row {
		if log_bin_compress == true and record_len >= log_bin_compress_min_len
			create a Write_rows_compressed_log_event (when INSERT);
		else
			create a Write_rows_log_event (when INSERT);
	}
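
For readers who want something runnable, the statement-mode branch above can be sketched like this. It is an illustration only: `make_query_event` is a hypothetical name, events are modeled as a (type, body) pair, and zlib is assumed as the codec.

```python
import zlib


def make_query_event(query, log_bin_compress, log_bin_compress_min_len):
    """Pick the event type for a statement, mirroring the pseudocode above."""
    if log_bin_compress and len(query) >= log_bin_compress_min_len:
        # Long enough to be worth compressing.
        return "QUERY_COMPRESSED_EVENT", zlib.compress(query)
    # Compression disabled or statement too short: log it unchanged.
    return "QUERY_EVENT", query
```

The minimum-length threshold exists because very short inputs gain little from compression and can even grow once the compression header is added.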

The complete change for MySQL 5.6.25 can be found at:
https://github.com/choury/mysql-server/commit/b0337044942a92dc6f5e4059032a532a60d04862
and two minor fixes:
https://github.com/choury/mysql-server/commit/c791c62aaf042a47274e42847c6a95d4a9723640
https://github.com/choury/mysql-server/commit/40436ac0b639cb5528b547b6cc702e4ad32fb337

We have tested it on some of our games for months, and the result is clear: binlog volume
is reduced by 42% to 70%. We would be very glad if you could accept our patch.
[13 Jan 2016 22:50] Daniel Black
Nice start, zhou. I made a couple of comments on your GitHub commits. A patch against 5.7 will be easier to merge.

How much does this offer over binlog_row_image={minimal|blob}?

Does it work when combined with binlog_row_image=minimal/blob?