Bug #101678 TIMESTAMP column not matching datetime literal with explicit time zone offset
Submitted: 19 Nov 2020 11:02 Modified: 10 May 2021 4:59
Reporter: Mattias Jonsson (OCA) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:8.0.22 OS:Any
Assigned to: CPU Architecture:Any

[19 Nov 2020 11:02] Mattias Jonsson
Description:
I expect the new datetime literal with explicit time zone offset to give a deterministic timestamp (epoch or UTC date time) regardless of time_zone the session has configured (@@time_zone).

In 'CET' time_zone (from the mtr test suite) and 'Europe/Amsterdam' SYSTEM time zone the explicit time zone offset is not usable since it does not deliver expected results.

1) It does not return the matching rows
2) The warning in explain containing the generated query after parsing, does not include the explicit time zone offset, making the timestamp non-deterministic.

How to repeat:
Build MySQL 8.0.22 and start it with './mtr --start 1st' and then connect a mysql client executing the below:

drop table if exists ts;
select "Using time_zone as UTC, to be 'safe' from local time conversion (conversions may still be done but only within UTC)" as 'STATUS';
set time_zone = '+00:00';
select @@version, @@time_zone;
create table ts (id int unsigned not null auto_increment primary key, ts timestamp);

insert into ts (ts) values
('2020-10-24 23:40:00'),
('2020-10-25 00:00:00'),
('2020-10-25 00:20:00'),
('2020-10-25 00:40:00'),
('2020-10-25 01:00:00'),
('2020-10-25 01:20:00'),
('2020-10-25 01:40:00'),
('2020-10-25 02:00:00'),
('2020-10-25 02:20:00');

show create table ts;

select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts;
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
show warnings;
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
show warnings;
set time_zone = 'CET';
select @@version, @@time_zone;
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts;
select "id 4 does not match the WHERE clause!";
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
explain analyze select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
select "Where is id 5 & 6?";
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
explain analyze select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
show warnings;
select "In the warning above, the generated query has not adjusted for the timezone offset, and how could we know what epoch/time TIMESTAMP'2020-10-25 02:20:00' means?";
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
show warnings;
select "In the warning above, the generated query has not adjusted for the timezone offset, and how could we know what epoch/time TIMESTAMP'2020-10-25 02:20:00' means?";
set time_zone = '+00:00';
select @@version, @@time_zone;
alter table ts add index (ts);
show create table ts;
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts;
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
show warnings;
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
show warnings;
set time_zone = 'CET';
select @@version, @@time_zone;
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts;
select "id 4 does not match the WHERE clause!";
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
explain analyze select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
select "Where is id 5 & 6?";
select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
explain analyze select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
show warnings;
select "In the warning above, the generated query has not adjusted for the timezone offset, and how could we know what epoch/time TIMESTAMP'2020-10-25 02:20:00' means?";
explain analyze select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+01:00';
explain select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
show warnings;
explain analyze select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS DATETIME) as `utc_time` from ts where ts > '2020-10-25 02:20:00+02:00';
select "In the warning above, the generated query has not adjusted for the timezone offset, and how could we know what epoch/time TIMESTAMP'2020-10-25 02:20:00' means?";
drop table ts;
set time_zone = 'CET';

Suggested fix:
I assume that moving between epoch (or UTC time zone) internally (since the timestamp is stored as epoch) and parsed query (which only have INT or DATETIME) there are several cases where the time zone offset is lost or not taken into account. All these needs to be fixed.

Looking at the EXPLAIN ANALYZE output, it seems like both the filter on TIMESTAMP as well as index lookup fails to use the given time zone offset.

This is *not* a duplicate of bug#38455, since we now have given a deterministic time specification, and would expect the correct result (even with presentation not being deterministic for the timestamp column, which is a separate issue).
[19 Nov 2020 11:04] Mattias Jonsson
Results from how to repeat

Attachment: bug101678.result (application/octet-stream, text), 46.75 KiB.

[19 Nov 2020 11:31] MySQL Verification Team
Hello Mattias,

Thank you for the report and feedback!

regards,
Umesh
[19 Nov 2020 12:29] Mattias Jonsson
Since this is getting the wrong results, I consider it as S2 (Serious) severity.
[10 May 2021 4:59] Maheedhar Panchalamarri Venkata
set time_zone = '+00:00';
create table ts (id int unsigned not null auto_increment primary key, ts
timestamp);
insert into ts (ts) values
('2020-10-24 23:40:00'),
('2020-10-25 00:00:00'),
('2020-10-25 00:20:00'),
('2020-10-25 00:40:00'),
('2020-10-25 01:00:00'),
('2020-10-25 01:20:00'),
('2020-10-25 01:40:00'),
('2020-10-25 02:00:00'),
('2020-10-25 02:20:00');

The above inserts cover the repeat hour at the end of DST for CET.

SELECT id, ts AS CET, cast(ts AT TIME ZONE '+00:00' AS DATETIME) AS UTC
FROM ts;
+----+---------------------+---------------------+
| id | CET                 | UTC                 |
+----+---------------------+---------------------+
|  1  | 2020-10-25 01:40:00 | 2020-10-24 23:40:00 |
|  2  | 2020-10-25 02:00:00 | 2020-10-25 00:00:00 |
|  3  | 2020-10-25 02:20:00 | 2020-10-25 00:20:00 |
|  4  | 2020-10-25 02:40:00 | 2020-10-25 00:40:00 |
|  5  | 2020-10-25 02:00:00 | 2020-10-25 01:00:00 |
|  6  | 2020-10-25 02:20:00 | 2020-10-25 01:20:00 |
|  7  | 2020-10-25 02:40:00 | 2020-10-25 01:40:00 |
|  8  | 2020-10-25 03:00:00 | 2020-10-25 02:00:00 |
|  9  | 2020-10-25 03:20:00 | 2020-10-25 02:20:00 |
+----+---------------------+---------------------+

The rows 2,3,4 & 5,6,7 are the repeat hours (CET). In

1. select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS
   DATETIME) as `utc_time` from ts where ts > '2020-10-25
   02:20:00+01:00';

the datetime literal translates to '2020-10-25 01:20' UTC(row id 6)
which is then converted to the current timezone(CET) '2020-10-25 02:20'
before the comparing. This results in selecting rows 4,7,8,9. This is
the repeat hour of the DST change.

2. select *, unix_timestamp(ts), cast(ts AT TIME ZONE '+00:00' AS
   DATETIME) as `utc_time` from ts where ts > '2020-10-25
   02:20:00+02:00';

the datetime literal translates to '2020-10-25 00:20' UTC(row id 3)
which is then converted to the current timezone(CET) which also
translates to '2020-10-25 02:20'. Hence the same result of rows 4,7,8,9.

This appears to be not taking offset into account but as it falls into
the repeat hour of the DST change, it is working as expected.