| Bug #44276 | NdbTick assert() in Ndbif.cpp | ||
|---|---|---|---|
| Submitted: | 15 Apr 2009 0:52 | Modified: | 14 May 2009 13:32 |
| Reporter: | Stewart Smith | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
| Version: | OS: | Windows | |
| Assigned to: | jack andrews | CPU Architecture: | Any |
[15 Apr 2009 10:42]
jack andrews
<magnus> yes, 'GetSystemTimeAsFileTime' - you can look in mysys/my_getsystime.c for how it's used there.
[15 Apr 2009 12:40]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/72152 2868 jack andrews 2009-04-15 bug#44276 NdbTick assert() in Ndbif.cpp
[16 Apr 2009 11:16]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/72269 2868 jack andrews 2009-04-16 bug#44276 NdbTick assert() in Ndbif.cpp
[16 Apr 2009 11:17]
Bugs System
Pushed into 5.1.32-ndb-7.0.6 (revid:jack@sun.com-20090416111503-00o2025g2cav4s78) (version source revid:jack@sun.com-20090416103049-hn283mtm4t2ca7v3) (merge vers: 5.1.32-ndb-7.0.6) (pib:6)
[16 Apr 2009 11:40]
Magnus Blåudd
Very good!
[16 Apr 2009 19:10]
Jon Stephens
Need to know for documentation: 1. Do we support running Cluster inside a VM for production? (AFAIK we do not.) 2. Could a user encounter this issue without running Cluster in a VM? If so, how? Thanks.
[11 May 2009 12:52]
Magnus Blåudd
Changed to use a more stable function to retrieve current time
[14 May 2009 13:32]
Jon Stephens
Documented bugfix in the NDB-7.0.6 changelog as follows:
An NDB internal timing function did not work correctly on Windows and
could cause mysqld to fail on some AMD processors or when running inside
a virtual machine.

Description: I hit an assert() in mysqld today: From src/ndbapi/Ndbif.cpp: void Ndb::check_send_timeout() { Uint32 timeout = theImpl->m_transporter_facade->m_waitfor_timeout; NDB_TICKS current_time = NdbTick_CurrentMillisecond(); assert(current_time >= the_last_check_time); This assert() can be hit on Windows. This is because of our NdbTick_CurrentMillisecond() implementation there. From storage/ndb/src/common/portlib/win32/NdbTick.c: int NdbTick_CurrentMicrosecond(NDB_TICKS * secs, Uint32 * micros) { LARGE_INTEGER liCount, liFreq; QueryPerformanceCounter(&liCount); QueryPerformanceFrequency(&liFreq); *secs = liCount.QuadPart / liFreq.QuadPart; liCount.QuadPart -= *secs * liFreq.QuadPart; *micros = (Uint32)((liCount.QuadPart*1000000) / liFreq.QuadPart); return 0; } It turns out that QueryPerformance counter can: - on AMD systems, vary between CPUs - in VirtualBox (and wlad says VMWare), go backwards. So this should be solved one way or the other. the IRC chat in #wtf included below for completeness (with a bit of discussion): Apr 15 01:07:05 <stewart> anyone ever seen the QueryPerformanceCounter go backwards? Apr 15 01:07:13 <stewart> iggy, reggie: ^ ? Apr 15 01:07:17 <stewart> wlad: ? Apr 15 01:07:39 <iggy> stewart: Please explain more. Apr 15 01:07:40 <wlad> stewart, yes. I've seen that Apr 15 01:07:45 <iggy> I think it can happen. Apr 15 01:07:46 <stewart> wlad: ooerrr... cool :) Apr 15 01:07:50 <wlad> on AMD multicores Apr 15 01:07:57 <stewart> wlad: add "inside VirtualBox" to that Apr 15 01:08:15 <wlad> inside the VMWare :) Apr 15 01:08:35 <stewart> wlad: so... i think this means our NdbTick imple mentation is wrong... Apr 15 01:08:55 <wlad> stewart, rather QueryPerformanceCounter is wrong.. Apr 15 01:09:01 <stewart> wlad: or... that assert is bad.... but i wouldn't put money on it not be relied on elsewhere... Apr 15 01:09:24 <reggie> stewart, docs say that with multiple cores you can get different values when it hits different cores or cpus Apr 15 01:09:31 <stewart> darn... Apr 15 01:09:35 <reggie> this can make it seem to go backward Apr 15 01:09:43 <stewart> exactly the assert i hit Apr 15 01:09:49 <reggie> you need to use SetThreadAffinity to force it Apr 15 01:09:58 <stewart> kind of ugly :) Apr 15 01:10:12 <wlad> MS has had some articles on that. Harware vendors are tricky and sometimes use RDTSC as timer. to improve performance Apr 15 01:10:13 <reggie> SetThreadAffinityMask rather Apr 15 01:10:30 <stewart> probably either need to remove assert, or use more reliable tick.... Apr 15 01:10:30 <wlad> there was a boot parameter that could prevent it. Apr 15 01:10:59 <reggie> stewart, is this in a multi-cpu vm? Apr 15 01:11:32 <stewart> reggie: no. Apr 15 01:11:50 <stewart> afaik virtualbox only supports 1cpu Apr 15 01:13:33 <wlad> stewart: http://support.microsoft.com/kb/895980 is the KB article. The essence of their advice is to add /usepmtimer parameter to the boot.ini Apr 15 01:16:39 <wlad> stewart: more reliable but less exact counter is GetTickCount. The ugly thing about it , it wraps around in 49.7 days Apr 15 01:17:29 <stewart> hrrm... Apr 15 01:17:37 <stewart> perhaps will have to take to list Apr 15 01:17:43 <stewart> see what others thing Apr 15 01:17:47 <stewart> s/thing/think/ Apr 15 01:17:58 * stewart doesn't too much mind it going backwards a bit... Apr 15 01:18:03 <stewart> just have to audit some code Apr 15 01:18:30 <wlad> or GetSystemTimeAsFileTime , uses the same timer as GetTickCount, does not wrap How to repeat: run AMD, VirtualBox, VMWare and be "lucky" Suggested fix: change to GetSystemTimeAsFileTime