| Bug #22082 | Slave hangs(holds mutex) on "disk full" | ||
|---|---|---|---|
| Submitted: | 7 Sep 2006 16:04 | Modified: | 18 Mar 16:14 |
| Reporter: | Jonas Oreland | ||
| Status: | Closed | ||
| Category: | Server: Replication | Severity: | S3 (Non-critical) |
| Version: | Tried on 5.0.25, 5.0.51, 5.0.67 | OS: | Any |
| Assigned to: | Zhenxing He | Target Version: | 5.0+ |
| Triage: | Triaged: D2 (Serious) / R2 (Low) / E3 (Medium) | ||
[7 Sep 2006 16:04]
Jonas Oreland
[15 Oct 2008 19:30]
Sveta Smirnova
Same behavior with current version and sync-binlog
[23 Feb 12:03]
Lars Thalmann
Talked to Zhenxing about this bug:

SUMMARIZING PROBLEMS
====================
(Same numbering as Jonas had)
1. SHOW SLAVE STATUS can't be done when the disk is full.
2. Sleeping is longer than 60s.
3. Still hangs after sleep.
4. If the slave is killed, then the relay log is corrupted.

SOLUTION
========
1. It is expected behaviour that SHOW SLAVE STATUS hangs when the disk is full. After disk space is freed, the command returns as expected.
2. Zhenxing has checked the code and confirmed that the message is reprinted at longer intervals after a while. This should be made clear in the error messages.
3. Zhenxing has confirmed that it is not hanging. After the sleep, it does continue as it should (after the 60 second wait). We have not been able to reproduce the failure that Jonas describes (provided that he did wait the 60s).
4. In 6.0 there is a recovery mechanism that will delete any corrupted relay log. In 5.0-5.1, no such feature is implemented and slaves are not fully crash-safe. If the server is killed, then the relay log may be corrupted.

ACTION
======
To close this bug, we will only improve the error message so that the timing of the sleep messages is correctly specified:
- Add something like this to the first message: "Expect 60 seconds delay for server to continue after the disk space has been freed".
- The new message will be something like: "Retrying every 60 secs. Message will be reprinted in 600 secs."

If someone can reproduce the problem with replication not continuing after the disk has been freed and one has waited over 60s, then we will re-open the bug. Zhenxing tested and can't repeat that problem.
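Point 1 under SOLUTION can be illustrated with a small sketch of why a status query blocks while a writer waits for disk space. This is not the server's code: `status_mutex`, `relay_log_writer` and `show_slave_status` are hypothetical names, and the sketch only assumes (from the bug title) that the writing thread holds a mutex the status query also needs.

```python
import threading

# Hypothetical stand-ins; none of these names come from the MySQL source.
status_mutex = threading.Lock()      # the mutex that the bug title says is held
space_freed = threading.Event()      # set when an operator frees disk space
writer_waiting = threading.Event()   # set once the writer holds the mutex and waits


def relay_log_writer():
    """Hold the mutex for the whole write, including the wait for free space."""
    with status_mutex:
        writer_waiting.set()
        # The real server retries on a timer; an Event keeps this sketch short.
        space_freed.wait()


def show_slave_status(timeout_secs):
    """Return True if the status could be read within timeout_secs."""
    if not status_mutex.acquire(timeout=timeout_secs):
        return False
    status_mutex.release()
    return True


def demo():
    writer = threading.Thread(target=relay_log_writer)
    writer.start()
    writer_waiting.wait()
    blocked = not show_slave_status(0.1)   # disk still full: the mutex is unavailable
    space_freed.set()                      # operator frees space
    writer.join()
    recovered = show_slave_status(0.1)     # mutex released: the query returns
    return blocked, recovered
```

Under these assumptions `demo()` reports that the query is blocked while the disk is full and succeeds once space has been freed, which matches the behaviour described as expected above.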
[25 Feb 5:54]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/67432

2731 He Zhenxing 2009-02-25
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[25 Feb 6:03]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/67433

2731 He Zhenxing 2009-02-25
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[25 Feb 6:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/67437

2731 He Zhenxing 2009-02-25
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[2 Mar 2:37]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/67945

2731 He Zhenxing 2009-03-02
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[2 Mar 7:47]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/67954

2731 He Zhenxing 2009-03-02
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[2 Mar 7:51]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/67955

2731 He Zhenxing 2009-03-02
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[6 Mar 10:32]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/68456

2766 He Zhenxing 2009-03-06
BUG#22082 Slave hangs(holds mutex) on "disk full"

When the disk is full, the server may wait for free space while writing the binlog, relay log, or MyISAM tables. The server continues after the user has freed some space. However, the error message was not clear about how often it is printed, or that there is a delay between the user freeing space and the server continuing. This led users to think that the server was hanging forever.

This patch fixes the problem by making the error messages clearer. The error message is split into two parts: the first part is printed only once, and the second part is printed once every 10 retries.

Message first part:
Disk is full writing '<filename>' (Errcode: <errorno>). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)

Message second part:
Retry in 60 secs, Message reprinted in 600 secs
[6 Mar 10:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/68460 2836 He Zhenxing 2009-03-06 [merge] Merge BUG#22082 from 5.0-bugteam to 5.1-bugteam
[9 Mar 9:40]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/68590 3089 He Zhenxing 2009-03-09 [merge] Auto merge BUG#22082 from 5.1-bugteam to 6.0-bugteam
[9 Mar 9:53]
Zhenxing He
pushed to 5.0/5.1/6.0-bugteam
[9 Mar 15:12]
Bugs System
Pushed into 5.0.79 (revid:joro@sun.com-20090309135922-a0di9ebkxoj4d4wv) (version source revid:zhenxing.he@sun.com-20090306093200-4u6mq0jcu8ubcmqf) (merge vers: 5.0.79) (pib:6)
[10 Mar 17:08]
Jon Stephens
Documented in the 5.0.79 changelog as follows:
When its disk becomes full, a replication slave waits while
writing the binary log, relay log or MyISAM tables, continuing
after space has been made available. The error message provided
in such cases was not clear about the frequency with which
checking for free space is done (once every 60 seconds), and how
long the server waits after space has been freed before
continuing (also 60 seconds); this caused users to think that
the server had hung.
These issues have been addressed by making the error message
clearer, and dividing it into two separate messages:
1. The error message Disk is full writing 'filename' (Errcode:
error_code). Waiting for someone to free space... (Expect up
to 60 secs delay for server to continue after freeing disk
space) is printed only once.
2. The warning Retry in 60 secs, Message reprinted in 600 secs
is printed once for every 10 times that the check for
free space is made; that is, the check is performed once each
60 seconds, but the reminder that space needs to be freed is
printed only once every 10 minutes (600 seconds).
Set status to NDI pending merges to 5.1 and 6.0 trees.
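The schedule described in the changelog entry above can be sketched as follows. This is an illustration only, not the server's implementation: the function names, the injected `try_write`, `log` and `sleep` callables, and the choice to print the reminder on the very first failure are all assumptions. Only the two message texts, the 60-second interval and the 10-check reminder cadence come from this report.

```python
import errno

RETRY_INTERVAL_SECS = 60     # from the changelog: one check every 60 seconds
RETRIES_PER_REMINDER = 10    # from the changelog: reminder once per 10 checks


def wait_for_free_space(try_write, filename, log, sleep):
    """Retry try_write() until it succeeds; return the number of failed attempts.

    try_write, log and sleep are injected so the sketch can run without
    a full disk or real 60-second sleeps (an assumption of this sketch).
    """
    failures = 0
    while not try_write():
        if failures == 0:
            # First part: printed only once.
            log("Disk is full writing '%s' (Errcode: %d). Waiting for someone "
                "to free space... (Expect up to %d secs delay for server to "
                "continue after freeing disk space)"
                % (filename, errno.ENOSPC, RETRY_INTERVAL_SECS))
        if failures % RETRIES_PER_REMINDER == 0:
            # Second part: printed once for every 10 checks (assumed to
            # include the first failure).
            log("Retry in %d secs, Message reprinted in %d secs"
                % (RETRY_INTERVAL_SECS,
                   RETRY_INTERVAL_SECS * RETRIES_PER_REMINDER))
        failures += 1
        sleep(RETRY_INTERVAL_SECS)
    return failures


def delay_after_freeing(freed_at_secs):
    """Seconds from space being freed until the next check.

    Assumes the first write failed at t = 0, so checks happen at t = 60, 120, ...
    """
    return -freed_at_secs % RETRY_INTERVAL_SECS
```

Because the server only notices free space at the next check, the delay after freeing space depends on where in the interval the space was freed, which is why the final message says "up to 60 secs". For example, under the assumptions above, `delay_after_freeing(45)` gives 15 seconds and `delay_after_freeing(61)` gives 59 seconds.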
[13 Mar 20:04]
Bugs System
Pushed into 5.1.33 (revid:joro@sun.com-20090313111355-7bsi1hgkvrg8pdds) (version source revid:zhou.li@sun.com-20090311061050-ihp0g77znonq1tuq) (merge vers: 5.1.33) (pib:6)
[15 Mar 11:30]
Jon Stephens
Fix also noted in the 5.1.33 changelog; set back to NDI status pending merge to 6.0 tree.
[18 Mar 14:18]
Bugs System
Pushed into 6.0.11-alpha (revid:joro@sun.com-20090318122208-1b5kvg6zeb4hxwp9) (version source revid:matthias.leich@sun.com-20090310140952-gwtoq87wykhji3zi) (merge vers: 6.0.11-alpha) (pib:6)
[18 Mar 16:14]
Jon Stephens
Fix also documented in 6.0.11 changelog; closed.
[9 May 18:42]
Bugs System
Pushed into 5.1.34-ndb-6.2.18 (revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (version source revid:jonas@mysql.com-20090508100057-30ote4xggi4nq14v) (merge vers: 5.1.33-ndb-6.2.18) (pib:6)
[9 May 19:39]
Bugs System
Pushed into 5.1.34-ndb-6.3.25 (revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (version source revid:jonas@mysql.com-20090508175813-s6yele2z3oh6o99z) (merge vers: 5.1.33-ndb-6.3.25) (pib:6)
[9 May 20:36]
Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (version source revid:jonas@mysql.com-20090509073226-09bljakh9eppogec) (merge vers: 5.1.33-ndb-7.0.6) (pib:6)
