Bug #30469 ADDDATE returns wrong results
Submitted: 17 Aug 2007 10:14 Modified: 30 Oct 2007 18:15
Reporter: Pascal Fellerich Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.0, 5.1 OS:Any
Assigned to: Tatiana Azundris Nuernberg CPU Architecture:Any
Tags: date, interval

[17 Aug 2007 10:14] Pascal Fellerich
Description:
ADDDATE gets the INTERVAL wrong depending on the datatype. COnsider the following example (VMS timestamp to datetime conversion):

SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL (45943388340000000/1e7) SECOND_MICROSECOND);

returns 1858-11-17 01:16:34.338834, which is obviously wrong. The correct result can be obtained when the '1e7' in the above expression is replaced by 10000000.

How to repeat:
select ADDDATE('1858-11-17 00:00:00', INTERVAL (45943388340000000/1e7) SECOND_MICROSECOND);
    => WRONG '1858-11-17 01:16:34.338834'

select ADDDATE('1858-11-17 00:00:00', INTERVAL (45943388340000000/10000000) SECOND_MICROSECOND);
    => CORRECT '2004-06-19 05:13:54'

select ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834 SECOND_MICROSECOND);
    => WRONG '1858-11-17 01:16:34.338834'

select ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834.0 SECOND_MICROSECOND);
    => CORRECT '2004-06-19 05:13:54'
[20 Aug 2007 15:40] Hartmut Holzgraefe
mysqltest test case

Attachment: bug30469.tgz (application/x-gtar, text), 867 bytes.

[20 Sep 2007 10:40] Pascal Fellerich
More oddities regarding the microseconds field:

select ADDDATE('1858-11-17', INTERVAL 4594338834.000150  SECOND_MICROSECOND);
 => 2004-06-19 05:13:54.000150 correct

select ADDDATE('1858-11-17', INTERVAL 4594338834.0001500  SECOND_MICROSECOND);
 => 2004-06-19 05:13:54.001500 - not ok, the microseconds have been multiplied by 10 !

select ADDDATE('1858-11-17', INTERVAL 4594338834.000150 * 1.000 SECOND_MICROSECOND);
 => 2004-06-19 05:13:54.150000 - not ok, 0.15 msec have turned into 150 msec...

It looks as if the fractional part is handled separately from the integer part internally. 

regards,
Pascal
[12 Oct 2007 13:56] Konstantin Osipov
Wrong results, upgrading.
[23 Oct 2007 12:30] Tatiana Azundris Nuernberg
item_timefunc.cc:
static bool get_interval_value()

    if (!(res=args->val_str(str_value)))
      return (1);

for
 SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL (45943388340000000/1e7) SECOND_MICROSECOND)
we see 4594338834;

for
 SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL (45943388340000000/10000000) SECOND_MICROSECOND);
we see 4594338834.0000

subsequently in
  case INTERVAL_SECOND_MICROSECOND:
    if (get_interval_info(str,length,cs,2,array,1))
      return (1);
    interval->second= array[0];
    interval->second_part= array[1];
we see 4594338834 in interval->second_part in the first case, and in interval->second in the second.

PS  Pascal Fellerich as in Atari ST?
[23 Oct 2007 13:10] Pascal Fellerich
> PS  Pascal Fellerich as in Atari ST?

Yes.
[23 Oct 2007 14:40] Tatiana Azundris Nuernberg
Actually, we can break this "directly", as well, if we're too lax on the format:

SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834 SECOND_MICROSECOND);
| 1858-11-17 01:16:34.338834                                             | 

SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834. SECOND_MICROSECOND);
| 1858-11-17 01:16:34.338834                                              | 

SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834.0 SECOND_MICROSECOND);
| 2004-06-19 05:13:54                                                      | 

Please note that the docs have this to say:

- if an interval has several segments (hours, seconds, microseconds, ...), those are grouped using delimiters (':', '.', ...). The *suggested* microsecond-delimiter *looks* like a decimal point.

- if we "announce" more segments (with the interval type, SECOND_MICROSEND would announce two segments, seconds and micro-seconds), but fail to provide that many, the ones we do give are filled in from the *right*, that is, in the least significant places.

=> consequently,

select ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834 SECOND_MICROSECOND);
   '1858-11-17 01:16:34.338834'

select ADDDATE('1858-11-17 00:00:00', INTERVAL 4594338834.0 SECOND_MICROSECOND);
   '2004-06-19 05:13:54'

is "expected behaviour." Or well, at least it's documented behaviour:

ADDDATE('1858-11-17 00:00:00', INTERVAL '1.0' HOUR_MINUTE)
| 1858-11-17 01:00:00                                        | 

select ADDDATE('1858-11-17 00:00:00', INTERVAL '60' HOUR_MINUTE);
| 1858-11-17 01:00:00                                       | 

mysql> select ADDDATE('1858-11-17 00:00:00', INTERVAL '0:60' HOUR_MINUTE);
| 1858-11-17 01:00:00                                         | 

mysql> select ADDDATE('1858-11-17 00:00:00', INTERVAL '0.61' HOUR_MINUTE);
| 1858-11-17 01:01:00                                         | 

mysql> select ADDDATE('1858-11-17 00:00:00', INTERVAL '.61' HOUR_MINUTE);
| 1858-11-17 01:01:00                                        | 

(This also means that we cannot even limit microseconds to 6 places as per previous entry, as that would change existing behaviour. You can, after all, add more than 24 hours to a date, so why not more than 1,000,000 microseconds. As those intervals aren't actual decimals, each segment is currently independent and can "overflow" into the higher segments: MINUTE_SECOND '1:20' is equivalent to '0:80' or even just ':80' or '80'.)

http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_date-add

By that token, the "correct/wrong" notes should be the other way around, but all things considered, interval takes a string, anyway.

So,

- The underlying issue is that
  select 45943388340000000/1e7 => 4594338834
  select 45943388340000000/1000000 => 45943388340.0000
  This it somewhat unexpected, but not a bug per se, as the
  results are numerically equivalent. As strings however, and
  moreover as strings given to INTERVAL, the former is a
  least significant segment, while the latter is two segments
  with a delimiter that happens to a period.

- "Expected behaviour" can be forced using a cast along these lines (different examples for different expectations):

SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL CAST(40000000/1e7 AS signed) SECOND_MICROSECOND);
| 1858-11-17 00:00:00.000004

SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL CAST(40000000/1e7 AS DECIMAL(12,6)) SECOND_MICROSECOND);
| 1858-11-17 00:00:04
[26 Oct 2007 7:49] Tatiana Azundris Nuernberg
Tatjana says:

SUMMARY

  INTERVAL takes an argument that can *look* like a float (and be
  indeed the result of a division implicitly cast to string), but
  doesn't at all behave like one.  INTERVAL 6/4 HOUR_MINUTE gives
  '1.5000', which is considered '1 hour, 5000 minutes'.
  INTERVAL 1 HOUR_MINUTE gives '1', which is considered '1 minute'
  (as opposed to one hour).  So 1 != 1.0, and 1.5000 is neither
  1-1/2 nor {1;5}, it's {1;5000}.  This isn't exactly intuitive
  to the casual user.  Numeric values can't quite behave like
  string values here as they can have just one separator.  They
  also don't behave like floats because 1.500 != 1.5, and 1.0 != 1.

DATE_ADD(date,INTERVAL expr unit)

 "expr is an expression specifying the interval value to be added or
  subtracted from the starting date. expr is a STRING; it may start with
  a '-' for negative intervals."

  http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_date-add

  Furthermore, if 'unit' has more several segments (e.g. HOUR_MINUTE),
  but 'expr' does not provide as many as segments as 'unit' suggests,
  they're filled in *right-to-left* ('20' HOUR_MINUTE gives us 20
  minutes, not 20 hours).  Also '1.5' HOUR_MINUTE are 65 minutes,
  not 90, because the '.' is a separator, not a decimal point.

  To make things more complicated, intervals can have many segments:
  DAY_SECOND => 'DAYS HOURS:MINUTES:SECONDS'.

  If more segments are given than announced by 'unit', they are filled
  *left-to-right*:

    SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL '5:1' SECOND)

      => 1858-11-17 00:00:05

  It is currently possible to hand in a numeral:

    select 45943388340000000/1e7 => 4594338834

    select 45943388340000000/10000000 => 4594338834.0000

  While these are numerically equal, as an argument to HOUR_MINUTE,
  the first would mean (0 hours and) 4594338834 minutes, while the
  latter would create an interval of 4594338834 hours (and 0 minutes).
  This of course happens because we treat them as strings, which in
  the former case gives one numeric value, and in the latter "two
  numeric values separated by a delimiter that happens to be a period."

  And worse than that:

    SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL 6/2 HOUR_MINUTE)

    => 1858-11-17 03:00:00  ('3.0000' -> 3 hours, 0 minutes)

  but

    SELECT ADDDATE('1858-11-17 00:00:00', INTERVAL 6/4 HOUR_MINUTE)

    => 1858-11-20 12:20:00  ('1.5000' -> 1 hour, 5000 minutes!!)

  So what do we do?

  a)  Only use the integer part, put it into the least significant
      segment?  Then 1.5 would be equal to '1'.

  b)  Always treat as float?  Then 1 would be equal to '1.0', and
      2.5 would be equal '2.5' (in a one-segment unit, they'd be 2,
      in two-segments 2:5 -- two of one and five of the other, as
      opposed to 2-1/2.  This might confuse the casual user).
      Obviously, only up to two segments could be set this way,
      as the float will have but one period.  Also, what number of
      decimals would we assume?

  c)  Do the math.  2.5 HOUR => 150 minutes.  Is 2.5 DAY_HOUR
      still 150 minutes though?  Or 2 days, 5 hours?  Or 2.5 days?
      How about DAY_SECOND?

  d)  Throw an error on non-integers.

  e)  Throw an error on non-strings.

  f)  leave the user to it, have them CAST as needed (to SIGNED or
      DECIMAL or whatever they find intuitive).

Peter Gulutzan says:

() The manual says:
   "expr is a string".
   Inside a string, '.' does not mean "decimal point".
   It commonly is used for punctuation.
() The manual says:
   "MySQL allows any punctuation delimiter in the expr format."
   So punctuation characters are delimiters.
   And "1.5 hour_minute" must be treated the same as
   "1:5 hour_minute".

So '1.5000' is 1 hour + 5000 minutes -- correct.

Why then is 1.5 DAY_SECOND = 1 minute + 5 seconds,
and not 1 day + 5 hours + 0 minutes + 0 seconds?
Because:
() The manual says:
   "If you specify an interval value that is too short
   (does not include all the interval parts that would
   be expected from the unit keyword), MySQL assumes that
   you have left out the leftmost parts of the interval
   value."
   That's explicit. You left out the leftmost,
   not the rightmost. So it's the same as 0.0.1.5.

So MySQL behaviour is compatible with the manual.

Bug#30469 is not a bug.
[30 Oct 2007 18:15] Paul DuBois
I'll add some discussion to the DATE_ADD() function description to emphasize that expr is treated as a string and show an example how this might produce a non-intuitive result.