Bug #26643 | Falcon: no blob space recovery | ||
---|---|---|---|
Submitted: | 26 Feb 2007 18:59 | Modified: | 23 Oct 2007 20:24 |
Reporter: | Peter Gulutzan | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S5 (Performance) |
Version: | 5.2.4-falcon-alpha-debug | OS: | Linux (SUSE 10.0 / 64-bit) |
Assigned to: | Ann Harrison | CPU Architecture: | Any |
[26 Feb 2007 18:59]
Peter Gulutzan
[26 Feb 2007 23:55]
MySQL Verification Team
Thank you for the bug report. Verified as described.
[7 Sep 2007 18:18]
Jim Starkey
Sample doesn't compile: mysql> delimiter // mysql> create procedure p () -> begin -> declare v int default 0; -> while v < 10000 do -> if v mod 1000 = 0 then select v; end if; -> update t set s1 = repeat('a',rand()*1000); -> set v = v + 1; -> end while; -> end// ERROR 1307 (HY000): Failed to CREATE PROCEDURE p mysql>
[7 Sep 2007 18:32]
Peter Gulutzan
I've seen "ERROR 1307 (HY000): Failed to CREATE" before. It's a bug (there's nothing wrong with the procedure), and I suspect that it only occurs with Falcon, but I didn't report it because I couldn't repeat it. If you can repeat it, that's great!
[7 Sep 2007 18:55]
Kevin Lewis
Jim, With the Windows MYSQL.exe, the delimiter statement has to be followed by the old delimiter. That's maybe why it didn't work. This worked for me, but I am not sure I saw any increase in blob space. drop database k; create database k; use k; create table t (s1 mediumtext) engine=falcon; insert into t values (repeat('a',1000)); delimiter // ; create procedure p () begin declare v int default 0; while v < 100000 do if v mod 1000 = 0 then select v; end if; update t set s1 = repeat('a',rand()*1000); set v = v + 1; end while; end// delimiter ; // call p();
[19 Oct 2007 18:12]
Ann Harrison
The procedure modifies a record with a blob field 100,000 times, and Falcon does in fact, write 100,000 versions of the blob to disk. When the transaction commits, 999,999 of those versions are removed, and the space they use is released for reuse. The garbage collection does not occur during the execution of the transaction, but only at the end.
[22 Oct 2007 18:53]
Hakan Küçükyılmaz
Verified as described. The UPDATE happens on a single row with autocommit enabled. There should be no increase in the table space files. Before: lu0011:/data/mysql# du -h *.fts 412K falcon_master.fts 16K falcon_temporary.fts 16K falcon_user.fts After: lu0011:/data/mysql# du -h *.fts 448K falcon_master.fts 16K falcon_temporary.fts 5.0M falcon_user.fts Procedure used: SET @@autocommit = 1; delimiter // create database k// use k// create table t (s1 mediumtext) engine=falcon// insert into t values (repeat('a',1000))// create procedure p () begin declare v int default 0; while v < 10000 do if v mod 1000 = 0 then select v; end if; update t set s1 = repeat('a',rand()*1000); set v = v + 1; end while; end// call p()// After calling p() several times falcon_user.fts grows to 15M
[22 Oct 2007 19:28]
Ann Harrison
I did run the test and after the procedure completes, the next reference to that row causes the engine to clean out the old versions. Let me check that the commit is actually being seen in the autocommit transaction during a procedure.
[23 Oct 2007 19:13]
Kevin Lewis
Hakan, Each verb within a transaction is like a little SetSavepoint-ReleaseSavepoint. Usually, records that were changed within a transaction, and then changed again, are released at the end of the verb by Record::scavenge(). This in turn calls Table::garbageCollect() which does garbage collection on indexes, inversions, and blobs. If this did not work, the test case, with 10,000 updates of 1000 byte blobs, would have made a falcon_user.fts much greater than 10 Mb. But it was only 5 Mb after the test. It is not necessary for Falcon to keep tablespace growth to zero during a transaction of 10,000 updates to the same record. Falcon is not MYISAM and does not need to react the same. If it can be shown that garbage collection of BLOBs is NOT happening during a test like this, please re-open the bug. Otherwise, lets close it.
[23 Oct 2007 20:24]
Ann Harrison
As Kevin said, Falcon is not MyISAM or InnoDB. What you're seeing here is the effect of group commits. All the blobs in a group of transactions being committed at the same time must be written before the group can be committed. Once the group is committed, then the old versions can be scavenged. That's why there is growth, but the growth is less than would have occurred if blob space were not being reclaimed at all. When Jim pushes his changes causing small blobs to be logged instead of being written directly from the page cache to the database, the growth shown by this test will be less. Larger blobs (> 1M?) will still cause some growth when modified in a hard loop like this. The wasted space is recovered, but not immediately.
[24 Oct 2007 22:43]
Ann Harrison
I ran the test twelve times in one sequence and six times in another. falcon_user.fts started at 40,960. After one run of the procedure it grew to 1,097,728. The next run of the procedure caused it to grow to 5,255,168. The next run it increased to 5,271,552. It stayed that size for the next six runs, then increased slightly to 5,308,416 where it stayed for three more runs. On the second sequence, using Jim's tree rather than the team tree, the tablespace started at 16,384 then grew to 1,101,824, then 5,230,592 where it stayed steady for three runs. The next run increased the size of the tablespace slightly to 5,246,976 where it stayed for two more runs. The slight increases are consistent with the fact that blobs are stored at their actual length and the length is generated randomly. What's happening is that the blobs are being written when before the transaction commit is written. Each update is its own transaction and their commits are batched, so nothing can be released until the transactions are actually committed and written to the database. The performance is lousy because releasing a long chain of blobs is an n log n operation.