Bug #41840 | Unbounded tablespace growth during recovery | ||
---|---|---|---|
Submitted: | 3 Jan 2009 13:57 | Modified: | 15 May 2009 14:11 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) |
Version: | 6.0-falcon-team | OS: | Any |
Assigned to: | Vladislav Vaintroub | CPU Architecture: | Any |
Tags: | F_RECOVERY |
[3 Jan 2009 13:57]
Philip Stoev
[25 Jan 2009 18:59]
Philip Stoev
Here is the log of one such recovery: # 20:48:05 New table space FALCON_USER, id 1, type 0, filename falcon_user.fts # 20:48:05 New table space FALCON_TEMPORARY, id 2, type 0, filename falcon_temporary.fts # 20:48:05 Recovering database /tmp/vardir/master-data_recovery/falcon_master.fts ... # 20:48:06 /nFirst recovery block is 5696 # 20:48:06 Last recovery block is 7582 # 20:48:06 Recovery read block is 6886 # 20:48:06 Recovery phase 1... # 20:48:06 Processed: 50913 # 20:48:06 # 20:48:06 Recovery phase 2... # 20:48:06 Processed: 50913 # 20:48:06 # 20:48:06 Recovery phase 3... # 20:48:06 === Scavenge Cycle 1 - FALCON_MASTER - 0 seconds # 20:48:06 4: Activity on /tmp/vardir/master-data_recovery/falcon_master.fts: 10 fetches, 5 reads, 0 writes, 1 flushWrites # 20:48:06 4: Activity on /tmp/vardir/master-data_recovery/falcon_temporary.fts: 0 fetches, 1 reads, 0 writes, 1 flushWrites # 20:48:06 4: Activity on /tmp/vardir/master-data_recovery/falcon_user.fts: 8878 fetches, 32 reads, 0 writes, 113 flushWrites # 20:55:26 Processed: 50913 # 20:55:26 Recovery complete # 20:55:26 443: Cache flush: 106 pages, 32 writes in 0 seconds (106 pps) # 20:55:26 443: Commit System Transaction 2 # 20:55:26 443: Commit System Transaction 4 # 20:55:26 443: CommitNoUpdates transaction 6 # 20:55:26 === Scavenge Cycle 2 - FALCON_MASTER - 0 seconds # 20:55:26 443: Transactions: 3 committed, 0 rolled back, 1 active, 8 available, 2 post-commit, oldest 0 seconds # 20:55:26 443: SerialLog: 0 reads, 4 writes, 0 transactions, 0 completed, 0 stalls, 2 blocks, 1 windows # 20:55:26 443: Activity on /tmp/vardir/master-data_recovery/falcon_master.fts: 2668 fetches, 71 reads, 0 writes, 9 flushWrites # 20:55:26 443: Activity on /tmp/vardir/master-data_recovery/falcon_user.fts: 897340 fetches, 1155 reads, 0 writes, 38880 flushWrites Notice "897340 fetches, 1155 reads, 0 writes, 38880 flushWrites". Is this OK?
[25 Jan 2009 19:31]
Vladislav Vaintroub
You did not say what you have done before you killed mysqld. The answer to "is this OK?" is then maybe it is ok and it depends on your workload. If you have not killed the app but just stopped the activity of the client application, it alsocould take hours before mysqld stops writing. This happens when gopher threads lag behind the client threads, which is presumably in your case. recovery has to recover ~51000 log records, so it might take both time and space
[26 Jan 2009 16:08]
Philip Stoev
This was observed when running this RQG command: $ perl runall.pl \ --mysqld=--falcon-page-size=8K \ --vardir=/tmp/vardir \ --rows=1000 \ --threads=32 \ --mask=8 \ --queries=1000000 \ --duration=300 \ --basedir=/build/bzr/6.0-falcon-team \ --engine=Falcon \ --grammar=conf/combinations.yy \ --gendata=conf/combinations.zz \ --validator=DatabaseConsistency \ --reporter=Deadlock,ErrorLog,Backtrace,Recovery \ --mysqld=--loose-falcon-lock-wait-timeout=1 \ --mysqld=--log-output=none Note that the --mask parameter means that only DELETEs are being generated. Data directory before recovery looks like this: -rw-rw---- 1 philips philips 75492352 2009-01-03 18:29 falcon_master.fl1 -rw-rw---- 1 philips philips 86007808 2009-01-03 18:29 falcon_master.fl2 -rw-rw---- 1 philips philips 1835008 2009-01-03 18:29 falcon_master.fts -rw-rw---- 1 philips philips 65536 2009-01-03 18:29 falcon_temporary.fts -rw-rw---- 1 philips philips 144130048 2009-01-03 18:29 falcon_user.fts the tablespace is 144 mb, log files are about 80mb each. After recovery: -rw-rw---- 1 philips philips 75492352 2009-01-03 18:28 falcon_master.fl1 -rw-rw---- 1 philips philips 86007808 2009-01-03 18:28 falcon_master.fl2 -rw-rw---- 1 philips philips 1835008 2009-01-03 18:28 falcon_master.fts -rw-rw---- 1 philips philips 65536 2009-01-03 18:29 falcon_temporary.fts -rw-rw---- 1 philips philips 3103191040 2009-01-03 18:29 falcon_user.fts Log files unchanged, the falcon_user tablespace is now 3.1 gigabytes.
[26 Jan 2009 16:29]
Philip Stoev
Another result from a grammar that uses transaction, update, select, replace, delete: $ perl runall.pl \ --mysqld=--falcon-page-size=8K \ --vardir=/tmp/vardir \ --rows=100 \ --threads=16 \ --mask=59 \ --queries=1000000 \ --duration=300 \ --basedir=/build/bzr/6.0-falcon-team \ --engine=Falcon \ --grammar=conf/combinations.yy \ --gendata=conf/combinations.zz \ --reporter=Deadlock,ErrorLog,Backtrace,Recovery \ --mysqld=--loose-falcon-lock-wait-timeout=1 \ --mysqld=--log-output=none Before recovery, the tablespace is 309 mb. -rw-rw---- 1 philips philips 73779712 2009-01-03 17:15 falcon_master.fl1 -rw-rw---- 1 philips philips 70248960 2009-01-03 17:15 falcon_master.fl2 -rw-rw---- 1 philips philips 1835008 2009-01-03 17:15 falcon_master.fts -rw-rw---- 1 philips philips 65536 2009-01-03 17:15 falcon_temporary.fts -rw-rw---- 1 philips philips 309608448 2009-01-03 17:15 falcon_user.fts After recovfery, it is 2.9 gigabytes -rw-rw---- 1 philips philips 73779712 2009-01-03 17:14 falcon_master.fl1 -rw-rw---- 1 philips philips 70248960 2009-01-03 17:14 falcon_master.fl2 -rw-rw---- 1 philips philips 1835008 2009-01-03 17:14 falcon_master.fts -rw-rw---- 1 philips philips 65536 2009-01-03 17:15 falcon_temporary.fts -rw-rw---- 1 philips philips 2973466624 2009-01-03 17:15 falcon_user.fts
[20 Apr 2009 18:41]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/72525 3129 Vladislav Vaintroub 2009-04-20 Bug #41840 Unbounded tablespace growth during recovery Sometimes, after recovery the tablespaces get huge, orders of magnitude larger than original tablespace. The problem was that page overflow pages were not deleted during recovery. If a record was modified several times during recovery, the overflow pages that were created during recovery also leaked. The fix is to remove overflow pages , when it is possible. - remove all SRLOverflowPages found in course of recovery - just like during normal runtime,when record gets modified in recovery, remove overflow pages if the removal is safe. Safe means, that page referenced is a valid dataoverflow page. Validity is guaranteed, if page is either not touched by recovery, or the page was allocated during recovery and not yet freed.
[21 Apr 2009 10:30]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/72569 3131 Vladislav Vaintroub 2009-04-21 Addendum to Bug #41840 Unbounded tablespace growth during recovery Problem: blob recovery might access invalid (free) page The reason is that page allocations are not recorded during recovery and recovery might assume that the last state (incarnation) of the page was "free", even if a piece of blob was written to it. Solution is to record blob pages (data or overflow pages or written with earlyWrite flag) but do nothing in recovery except incrementing the page incarnation. This way the page cannot not be accidentially freed.
[21 Apr 2009 10:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/72572 3131 Vladislav Vaintroub 2009-04-21 Addendum to Bug #41840 Unbounded tablespace growth during recovery Problem: blob recovery might access invalid (free) page The reason is that page allocations are not recorded during recovery and recovery might assume that the last state (incarnation) of the page was "free", even if a piece of blob was written to it. Solution is to record blob pages (data or overflow pages or written with earlyWrite flag) but do nothing in recovery except incrementing the page incarnation. This way the page cannot not be accidentially freed.
[23 Apr 2009 7:22]
Bugs System
Pushed into 6.0.11-alpha (revid:alik@sun.com-20090423071213-afmyrzvolemph7mz) (version source revid:hky@sun.com-20090421195958-j33v1cuo3yer9niu) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 14:11]
MC Brown
An entry has been added to the 6.0.11 changelog: When creating an index on a Falcon table with a very large dataset, mysqld would crash.