Bug #35848 UUID() returns UUIDs with the wrong time
Submitted: 4 Apr 2008 23:41 Modified: 29 Jul 2008 6:55
Reporter: David Garfield Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.0.22, 5.0.56 OS:Any
Assigned to: Tatiana Azundris Nuernberg CPU Architecture:Any
Tags: UUID()

[4 Apr 2008 23:41] David Garfield
Description:
I attempted to use the UUID() function, and decode the result.  I found that I got a time of Mon Jun 07 04:44:44 EST 1621 +6491880ms.  This was wrong.

In version 5.0.22, 5.0.56, and 6.0.4-alpha (all I have checked), there is a three line section in file(s) item_strfunc.cc reading:

/* number of 100-nanosecond intervals between
   1582-10-15 00:00:00.00 and 1970-01-01 00:00:00.00 */
#define UUID_TIME_OFFSET ((ulonglong) 141427 * 24 * 60 * 60 * 1000 * 10 )

The figure 141427 is the correct number of days, but the conversion to 100-nanosecond intervals should have another * 1000 in it.

How to repeat:
mysql> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 34c9dce8-53ef-102b-b191-xxxxxxxxxxxx | 
+--------------------------------------+

The third group of numbers should be 1133 from 2008-04-01 15:26:29.0982912 to 2009-02-21 10:04:17.2791807.  Currently, I get 102b, which is in the year 1621.
(host specific information changed to "x"s.)

Suggested fix:
change the line to read:

#define UUID_TIME_OFFSET ((ulonglong) 141427 * 24 * 60 * 60 * 1000 * 1000 * 10 )
[5 Apr 2008 7:31] Valeriy Kravchuk
Thank you for a problem report. Indeed, on 5.0.56 I've got:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P3308
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.0.56-enterprise-gpl-nt MySQL Enterprise Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 8d3b33f1-5431-102b-8634-xxxxxxxxxxxx |
+--------------------------------------+
1 row in set (0.05 sec)

mysql> exit
Bye

And nanosecond is 1E-9 seconds (http://en.wikipedia.org/wiki/Nanosecond), so 100 nanosecnds is 1E-7, and one need to do *1000*1000*10 to recalculate seconds to this unit.

Note that on 5.1.23 and 6.0.4 I've got a bit different results:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P3310
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.23-rc-community-debug MySQL Community Server - Debug (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 42114800-6f22-1000-beeb-xxxxxxxxxxxx |
+--------------------------------------+
1 row in set (0.03 sec)

mysql> exit
Bye

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P3311
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 6.0.4-alpha-community MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 42114800-6f22-1000-9029-xxxxxxxxxxxx |
+--------------------------------------+
1 row in set (0.01 sec)
[7 Apr 2008 16:02] David Garfield
42114800-6f22-1000-beeb-xxxxxxxxxxxx and 42114800-6f22-1000-9029-xxxxxxxxxxxx are much worse.

First, they are March 4, 1583.

Second, they are EXACTLY the same time, which indicates a severe failure in generating UUIDs, unless you somehow managed to eliminate your system clock.
[10 Apr 2008 13:13] Tatiana Azundris Nuernberg
Formula confirmed as off by 1000. Fixing it like so
#define UUID_TIME_OFFSET ((ulonglong) 141427 * 24 * 60 * 60 * \
                          1000 * 1000 * 10)
renders correct results:
mysql> select uuid();
44601e20-06f1-11dd-9a40-

> date --utc                                    
Thu Apr 10 11:28:48 UTC 2008

> uuid -d "44601e20-06f1-11dd-9a40-000000000000"
encode: STR:     44601e20-06f1-11dd-9a40-000000000000
        SIV:     90886575221531278365900058155196350464
decode: variant: DCE 1.1, ISO/IEC 11578:1996
        version: 1 (time and node based)
        content: time:  2008-04-10 11:28:39.501776.0 UTC
                 clock: 6720 (usually random)
                 node:  00:00:00:00:00:00 (global unicast)

http://www.ossp.org/pkg/lib/uuid/
[10 Apr 2008 13:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/45193

ChangeSet@1.2714, 2008-04-10 15:39:29+02:00, tnurnberg@white.intern.koehntopp.de +3 -0
  Bug#35848: UUID() returns UUIDs with the wrong time
  
  offset for time part in UUIDs was 1/1000 of what it
  should be. In other words, offset was off.
[11 Apr 2008 9:38] Sergei Golubchik
Tatjana, what about UUID() results on Windows ?
[28 Apr 2008 2:33] Tatiana Azundris Nuernberg
Sergei,
UUID() mostly goes through the same codepath on Win, except for the MAC-address; I presume this is what you are referring to. The test is unaffected by this, it already disregards that part of the UUID (and all randomness) and checks for the correct time-base only. As that part is broken in the same way on all platforms, the fix is also portable; so is the test -- I verified its passing on Win86-32 (and its failing without the patch) for your convenience.
[28 Apr 2008 8:34] Sergei Golubchik
Tatjana, I mean the results that Valeriy has got - where two uuids have exactly the same timestamp. Any explanation for this ?
[6 May 2008 2:50] Tatiana Azundris Nuernberg
I cannot reproduce that:

two subsequent calls without patch:

uuid -d 1174ced1-6c61-102b-918a-
        content: time:  60076-12-02 17:56:.860513.7 UTC
                 clock: 4490 (usually random)

uuid -d 1174d71b-6c61-102b-918a-
        content: time:  60076-12-02 17:56:.860725.9 UTC
                 clock: 4490 (usually random)

two calls with patch:

uuid -d f0efb1d3-1b0e-11dd-8d9e-
        content: time:  2008-05-06 01:51:27.671547.5 UTC
                 clock: 3486 (usually random)

uuid -d f0efc1f2-1b0e-11dd-8d9e-
        content: time:  2008-05-06 01:51:27.671960.2 UTC
                 clock: 3486 (usually random)

Note that the fractional time differs for each call here. ('clock' stays the same, but this is intentional -- it is setup on first call and only changed when we detect the system-time having been changed to an earlier time, daylight savings or whatever, in which case we change 'clock' to a new random value so we won't accidentally re-issue the same UUIDs (so 'clock' is like salt).)
[6 May 2008 3:13] David Garfield
I think I have the explanation for the duplicate times.  The values where 42114800-6f22-1000-beeb-xxxxxxxxxxxx and 42114800-6f22-1000-9029-xxxxxxxxxxxx because time resolution had been lost.  Sufficient resolution had been lost and the commands were run at close enough time that they did get the same timestamp.

I guess the only thing that would still worry me is that multiple calls in the same 100ns increment (or however good the system clock resolution is) get distinct values.  In short, does 
    SELECT UUID(), UUID();
return two different values?  The code to do that does seem to be present in the version I am running (5.0.22), as occasionally demonstrated by running 
    SELECT LEFT(UUID(),8) FROM temporary_table_with_10_rows;
Finding it now may be a lot harder (and nice evidence of good performance).
[6 May 2008 11:24] Tatiana Azundris Nuernberg
This seems safe.
I tested with "select uuid(); select uuid();" and, more to the point, "select uuid(),uuid();", each render different results for each call of uuid(). In fact, looking at the code reveals this:

  // get time
  ulonglong tv=my_getsystime() + UUID_TIME_OFFSET + nanoseq;

  // if time < saved_last_time, system clock was adjusted backwards, make new salt
  if (unlikely(tv < uuid_time))
    set_clock_seq_str();

  // if this_time == last_time, force them to be different
  else if (unlikely(tv == uuid_time))
  { /* special protection from low-res system clocks */
    nanoseq++;
    tv++;
  }
[6 May 2008 11:27] Tatiana Azundris Nuernberg
I should also note that that part is wrapped in

  pthread_mutex_lock(&LOCK_uuid_generator);
  ...
  pthread_mutex_unlock(&LOCK_uuid_generator);
[6 May 2008 14:03] Tatiana Azundris Nuernberg
my_getsystime() returns "time since epoch in 100 nanosec units" (this is also the resolution of UUID). The most precise time-stamp we can get is used for this, but if that's not very good, the actual resolution may be lower (getTOD, "tv.tv_usec*10")

Consider precision >= 100 nanosec in system-timer, nanoseq==0.

  ulonglong tv=my_getsystime() + UUID_TIME_OFFSET + nanoseq;
  ...
  if (unlikely(tv == uuid_time))
  {
    nanoseq++;
    tv++;
  }
  else
  {
    if (nanoseq)
    {
      tv-=nanoseq;
      nanoseq=0;
    }
    DBUG_ASSERT(tv > uuid_time);
  }
  uuid_time=tv;

So if the first and the second calls end up in the same time-slice, we get tv++ (so the second UUID != first) and remember our fudging in nanoseq. We keep adding that offset (nanoseq, our "counting into the future") into all time (and keep increasing the offset we each subsequent timestamp) until we get a timestamp from the system that's larger than the last, at which point we try to "unwind" our counting into the future.

Suppose we managed to fit 6 UUID-gens into the same "time-slice" (my_getsystime() was the same for all six, let's say 0). To get different UUIDs, we needed different times, so we fudged all 6 UUIDs into different time-slices (the first UUIDs has the "real" time, the others have the subsequent time-stamps that we get by stealing 5 ticks from the future). This means that our time is still zero, but nanoseq is now 5 (our 6 timestamps are 0, 1, 2, 3, 4, 5); uuid_time (last timestamp we made) is also 5.

If my_getsystime() next gives us a sufficiently large value (say 20), all's fine: tv ::= time(20) + nanoseq(5) = 25; when we unwind, we "give back" the 5 ticks we "stole": tv-=nanoseq (25-5 -> 20); nanoseq=0; correct tv(20) does not collide with previous timestamp(uuid_time==5).

On the other hand, if we get a my_getsystime() t | (t > (uuid_time - nanoseq)) ^ ((t - uuid_time) < nanoseq), we seem to unwind into a region we've already used up:

(t=1 > (uuid_time=5 - nanoseq=5)) ^ ((t=1 - uuid_time=5) < nanoseq=5) => t
(tv was t + nanoseq == 1 + 5 == 6, unwind (-nanoseq) => tv=1, which we already used up)

(t=20 > (uuid_time=5 - nanoseq=5)) ^ ((t=20 - uuid_time=5) < nanoseq=5) => nil
(tv was 20 + 5 = 25, unwind gives tv= 25 - 5 = 20, which is still a good timestamp)

So unless I need more coffee, that seems to indicate that if current_time - last_time < nanoseq, we should not unwind (or at least not unwind all the way):

So

    if (nanoseq)
    {
      tv-=nanoseq;
      nanoseq=0;
    }
    DBUG_ASSERT(tv > uuid_time);

s/b

    if (nanoseq)
    {
      DBUG_ASSERT((tv > uuid_time) && (nanoseq > 0));
      long delta= min(nanoseq, tv - uuid_time -1);
      // -1 so we won't make tv=uuid_time for nanoseq >= (tv - uuid_time)
      tv-= delta;
      nanoseq-= delta;
    }

so we only unwind as much as we "legally" can without rolling back into a range we've already used?

Fun read: http://www.virtualdub.org/blog/pivot/entry.php?id=106
[10 Jul 2008 2:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/49379

2644 Tatiana A. Nurnberg	2008-07-10
      Bug#35848: UUID() returns UUIDs with the wrong time
      
      offset for time part in UUIDs was 1/1000 of what it
      should be. In other words, offset was off.
      
      Also handle the case where we count into the future
      when several UUIDs are generated in one "tick", and
      then the next call is late enough for us to unwind
      some but not all of those borrowed ticks.
      
      Lastly, handle the case where we keep borrowing and
      borrowing until the tick-counter overflows by also
      changing into a new "numberspace" by creating a new
      random suffix.
[10 Jul 2008 3:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/49381

2697 Tatiana A. Nurnberg	2008-07-10 [merge]
      Bug#35848: UUID() returns UUIDs with the wrong time
      
      offset for time part in UUIDs was 1/1000 of what it
      should be. In other words, offset was off.
      
      Also handle the case where we count into the future
      when several UUIDs are generated in one "tick", and
      then the next call is late enough for us to unwind
      some but not all of those borrowed ticks.
      
      Lastly, handle the case where we keep borrowing and
      borrowing until the tick-counter overflows by also
      changing into a new "numberspace" by creating a new
      random suffix.
[10 Jul 2008 3:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/49382

2704 Tatiana A. Nurnberg	2008-07-10 [merge]
      Bug#35848: UUID() returns UUIDs with the wrong time
      
      offset for time part in UUIDs was 1/1000 of what it
      should be. In other words, offset was off.
      
      Also handle the case where we count into the future
      when several UUIDs are generated in one "tick", and
      then the next call is late enough for us to unwind
      some but not all of those borrowed ticks.
      
      Lastly, handle the case where we keep borrowing and
      borrowing until the tick-counter overflows by also
      changing into a new "numberspace" by creating a new
      random suffix.
[16 Jul 2008 4:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/49799

2719 Tatiana A. Nurnberg	2008-07-16
      Bug#35848: UUID() returns UUIDs with the wrong time
      
      offset for time part in UUIDs was 1/1000 of what it
      should be. In other words, offset was off. (35848)
      
      Also handle the case where we count into the future
      when several UUIDs are generated in one "tick", and
      then the next call is late enough for us to unwind
      some but not all of those borrowed ticks.
      
      Lastly, handle the case where we keep borrowing and
      borrowing until the tick-counter overflows by also
      changing into a new "numberspace" by creating a new
      random suffix. (35514)
      
      Fix incorrect hyphenization of UUIDs. (38160, 6.0 only)
[21 Jul 2008 4:32] Tatiana Azundris Nuernberg
pushed to 5.0.68-bugteam, 5.1.28-bugteam, 6.0.7-bugteam, 1/3 in multi-CS Bug#35848, Bug#35514, Bug#38160
[28 Jul 2008 13:47] Bugs System
Pushed into 5.0.68  (revid:kpettersson@mysql.com-20080715152926-s5kgnqhtu93b3c8v) (pib:2)
(Retry automatic marking, to ensure nothing is missed. cm01)
[28 Jul 2008 14:45] Bugs System
Pushed into 6.0.7-alpha  (revid:alik@mysql.com-20080725172155-fnc73o50e4tgl23k) (version source revid:alik@mysql.com-20080725172155-fnc73o50e4tgl23k) (pib:3)
[28 Jul 2008 16:47] Bugs System
Pushed into 5.1.28  (revid:joerg@mysql.com-20080714105031-88hmr2baz5di9xej) (version source revid:joerg@mysql.com-20080714105031-88hmr2baz5di9xej) (pib:3)
[29 Jul 2008 6:55] Jon Stephens
Documented in the 5.0.68, 5.1.28, and 6.0.7 changelogs as follows:

        The UUID() function returned UUIDs with the wrong time; this was because 
        the offset for the time part in UUIDs was miscalculated.
[13 Sep 2008 20:09] Bugs System
Pushed into 6.0.6-alpha  (revid:azundris@mysql.com-20080716042138-nznefcoy7ljfm98z) (version source revid:sergefp@mysql.com-20080611231653-nmuqmw6dedjra79i) (pib:3)