| Bug #36819 | ut_usectime does not handle errors from gettimeofday | ||
|---|---|---|---|
| Submitted: | 20 May 2008 17:57 | Modified: | 15 Jan 17:24 |
| Reporter: | Mark Callaghan | ||
| Status: | Closed | ||
| Category: | Server: InnoDB | Severity: | S3 (Non-critical) |
| Version: | 5.0.37, 5.1 | OS: | Linux |
| Assigned to: | Timothy Smith | Target Version: | 5.0+ |
| Tags: | innodb, gettimeofday, ut_usectime | ||
| Triage: | Triaged: D3 (Medium) | ||
[20 May 2008 18:47]
Heikki Tuuri
Vasil, please fix this little bug. http://nixdoc.net/man-pages/Linux/gettimeofday.2.html " RETURN VALUE [Toc] [Back] gettimeofday and settimeofday return 0 for success, or -1 for failure (in which case errno is set appropriately). ERRORS [Toc] [Back] EPERM settimeofday is called by someone other than the superuser. EINVAL Timezone (or something else) is invalid. EFAULT One of tv or tz pointed outside your accessible address space. " Regards, Heikki
[20 May 2008 19:26]
Miguel Solorzano
Thank you for the bug report.
[21 May 2008 12:16]
Vasil Dimov
Mark,
would you be happy with an assert if gettimeofday(2) returns error?
I wonder what is the reason for it failing at all in the first place:
EFAULT One of tv or tz pointed outside the accessible address space.
EINVAL Timezone (or something else) is invalid.
EPERM The calling process has insufficient privilege to call settimeofday();
under Linux the CAP_SYS_TIME capability is required.
EPERM cannot be because we are not calling settimeofday(2)
EINVAL cannot be because timezone is NULL and what "else"?
EFAULT tv is in the stack of the same function and tz is NULL
We are definitely going to print an error message with something like strerror(errno), but
the question is how to proceed afterwards...
[21 May 2008 14:55]
Vasil Dimov
Mark, in 5.0 ut_usectime() is only used to update the variable innodb_row_lock_time_max in SHOW STATUS output. Is your problem related to this variable? What do you see for its value?
[21 May 2008 15:17]
Mark Callaghan
Assert would be worse. It fails intermittently for reasons that I do not understand. The rate of failure is probably a few times per server per day and there are probably millions of calls to it per server per day.
[21 May 2008 15:18]
Mark Callaghan
I noticed this while looking at the IO accounting patch from Percona. They use the function to measure IO wait time. I agree it won't have much of an impact on current code.
[21 May 2008 15:33]
Vasil Dimov
Then the question is what to do if gettimeofday(2) fails, probably something like this
makes most sense:
--- cut ---
int
ut_usectime(
/*========*/
/* out: 0 on success, -1 otherwise */
ulint* sec, /* out: seconds since the Epoch */
ulint* ms) /* out: microseconds since the Epoch+*sec */
{
struct timeval tv;
int ret;
int errno_gettimeofday;
int i;
for (i = 0; i < 10; i++) {
ret = ut_gettimeofday(&tv, NULL);
if (ret == -1) {
errno_gettimeofday = errno;
ut_print_timestamp(stderr);
fprintf(stderr, " InnoDB: gettimeofday(): %s\n",
strerror(errno_gettimeofday));
os_thread_sleep(100000); /* 0.1 sec */
errno = errno_gettimeofday;
} else {
break;
}
}
if (ret != -1) {
*sec = (ulint) tv.tv_sec;
*ms = (ulint) tv.tv_usec;
}
return(ret);
}
--- cut ---
This retries several times and, if unsuccessful, propagates the error to the caller.
[25 Jun 2008 0:07]
Calvin Sun
Merged into 6.0.6-alpha, according to Tim. But the patch has not been pushed into 5.0, 5.1 yet.
[15 Jul 2008 22:09]
Paul DuBois
Noted in 6.0.6 changelog. The code for the ut_usectime() function in InnoDB did not handle errors from the gettimeofday() system call. Now it retries gettimeofday() several times and updates the value of the Innodb_row_lock_time_max status variable only if ut_usectime() was successful. Setting report to Patch queued pending push of fix into 5.1.x or other trees.
[12 Dec 2008 23:54]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/61571 2742 Timothy Smith 2008-12-12 Apply the rest of innodb-5.0-ss2475. This fixes Bug#36819, "ut_usectime does not handle errors from gettimeofday". r2475 | vasil | 2008-05-22 19:35:30 +0300 (Thu, 22 May 2008) | 13 lines Fix by retrying gettimeofday() several times if it fails in ut_usectime(). If it fails on all calls then return error to the caller to be handled at higher level. Update the variable innodb_row_lock_time_max in SHOW STATUS output only if ut_usectime() was successful.
[6 Jan 14:57]
Bugs System
Pushed into 5.0.76 (revid:joro@sun.com-20090105160414-8q9j4bi1klkfwiup) (version source revid:azundris@mysql.com-20081230114734-nmsc37ak330zlygn) (merge vers: 5.0.76) (pib:6)
[8 Jan 22:22]
Paul DuBois
Noted in 5.0.76. Setting report to NDI pending push into 5.1.x.
[15 Jan 7:34]
Bugs System
Pushed into 5.1.31 (revid:joro@sun.com-20090115053147-tx1oapthnzgvs1ro) (version source revid:azundris@mysql.com-20081230114838-cn52tu180wcrvh0h) (merge vers: 5.1.31) (pib:6)
[15 Jan 17:24]
Paul DuBois
Noted in 5.1.31 changelog.
[19 Jan 12:33]
Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090115073240-1wanl85vlvw2she1) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 14:10]
Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 17:15]
Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)
[20 Jan 20:01]
Bugs System
Pushed into 6.0.10-alpha (revid:joro@sun.com-20090119171328-2hemf2ndc1dxl0et) (version source revid:azundris@mysql.com-20081230114916-c290n83z25wkt6e4) (merge vers: 6.0.9-alpha) (pib:6)

Description: The code for ut_usectime ignores errors returned by gettimeofday. gettimeofday fails intermittently on my servers and the man page states that it can fail. void ut_usectime( /*========*/ ulint* sec, /* out: seconds since the Epoch */ ulint* ms) /* out: microseconds since the Epoch+*sec */ { #ifdef __WIN__ SYSTEMTIME st; GetLocalTime(&st); *sec = (ulint) st.wSecond; *ms = (ulint) st.wMilliseconds; #else struct timeval tv; gettimeofday(&tv,NULL); *sec = (ulint) tv.tv_sec; *ms = (ulint) tv.tv_usec; #endif } How to repeat: Modify ut_usectime to check the error return value and log a message when an error occurs. Suggested fix: Check the return value of gettimeofday