Bug #105078 Queries using timestamp indexes and timezones return back the wrong data
Submitted: 29 Sep 2021 17:00 Modified: 17 Nov 2021 8:23
Reporter: Herman Lee Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.20, 8.0.26 OS:Any
Assigned to: CPU Architecture:Any

[29 Sep 2021 17:00] Herman Lee
Description:
The problem also reproduces on the latest 8.0.26. The server fails to return the correct data for queries with timezones.

How to repeat:
Set the server's system_time_zone to PDT.

CREATE TABLE IF NOT EXISTS `t1` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `data` int,
  KEY `ts` (`ts`),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO t1 (id, ts, data) VALUES (123456789, '2021-11-07 01:09:27-07:00', 1);

mysql> SELECT * FROM t1 FORCE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00';
Empty set (0.00 sec)

mysql> SELECT * FROM t1 FORCE INDEX (PRIMARY) WHERE ts >= '2021-11-07 01:09:22-07:00';
+-----------+---------------------+------+
| id        | ts                  | data |
+-----------+---------------------+------+
| 123456789 | 2021-11-07 01:09:27 |    1 |
+-----------+---------------------+------+
1 row in set (0.00 sec)
[30 Sep 2021 6:15] MySQL Verification Team
Hello Herman Lee,

Thank you for the report and test case.
Verified as described with 8.0.26 build.

regards,
Umesh
[30 Sep 2021 6:20] MySQL Verification Team
--

rm -rf 105078/
bin/mysqld --initialize-insecure --basedir=/home/umshastr/Downloads/mysql-8.0.26 --datadir=/home/umshastr/Downloads/mysql-8.0.26/105078 --log-error-verbosity=3
bin/mysqld_safe --no-defaults --basedir=/home/umshastr/Downloads/mysql-8.0.26 --datadir=/home/umshastr/Downloads/mysql-8.0.26/105078 --log-error-verbosity=3  --secure-file-priv=/tmp/ 2>&1 &

bin/mysql_tzinfo_to_sql /usr/share/zoneinfo |bin/mysql -uroot mysql

umshastr@umshastr-ubuntu:~/Downloads/mysql-8.0.26$ bin/mysql -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.26 MySQL Community Server - GPL

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show global variables like '%time_zone%';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| system_time_zone | PDT    |
| time_zone        | SYSTEM |
+------------------+--------+
2 rows in set (0.01 sec)

mysql> create database test;
Query OK, 1 row affected (0.05 sec)

mysql> use test
Database changed
mysql> CREATE TABLE IF NOT EXISTS `t1` (
    ->   `id` bigint unsigned NOT NULL AUTO_INCREMENT,
    ->   `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    ->   `data` int,
    ->   KEY `ts` (`ts`),
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.14 sec)

mysql> 
mysql> INSERT INTO t1 (id, ts, data) VALUES (123456789, '2021-11-07 01:09:27-07:00', 1);
Query OK, 1 row affected (0.02 sec)

mysql> select * from t1;
+-----------+---------------------+------+
| id        | ts                  | data |
+-----------+---------------------+------+
| 123456789 | 2021-11-07 01:09:27 |    1 |
+-----------+---------------------+------+
1 row in set (0.00 sec)

mysql> SELECT * FROM t1 FORCE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00';
Empty set (0.00 sec)

mysql> SELECT * FROM t1 FORCE INDEX (PRIMARY) WHERE ts >= '2021-11-07 01:09:22-07:00';
+-----------+---------------------+------+
| id        | ts                  | data |
+-----------+---------------------+------+
| 123456789 | 2021-11-07 01:09:27 |    1 |
+-----------+---------------------+------+
1 row in set (0.00 sec)

mysql> show global variables like '%time_zone%';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| system_time_zone | PDT    |
| time_zone        | SYSTEM |
+------------------+--------+
2 rows in set (0.00 sec)

mysql> system date +%Z
PDT
mysql> system cat docs/INFO_SRC
commit: f64af5bfff4c3a65eb44d9e513ca1c882850b030
date: 2021-07-01 07:17:38 +0200
build-date: 2021-07-01 07:54:49 +0000
short: f64af5bfff4
branch: mysql-8.0.26-release

MySQL source 8.0.26

mysql>
[17 Nov 2021 8:23] Vikram Gurram
The behavior mentioned in the bug in subject is expected and also documented. See analysis below.

mysql> show global variables like '%time_zone%';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| system_time_zone | PDT    |
| time_zone        | SYSTEM |
+------------------+--------+
2 rows in set (0.01 sec)

mysql> CREATE TABLE IF NOT EXISTS `t1` (
    ->   `id` bigint unsigned NOT NULL AUTO_INCREMENT,
    ->   `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE
    -> CURRENT_TIMESTAMP,
    ->   `data` int,
    ->   KEY `ts` (`ts`),
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO t1 (id, ts, data) VALUES (1, '2021-11-07 01:09:27-07:00', 1);
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO t1 (id, ts, data) VALUES (2, '2021-11-07 02:09:27-07:00', 1);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1;
+----+---------------------+------+
| id | ts                  | data |
+----+---------------------+------+
|  1 | 2021-11-07 01:09:27 |    1 |
|  2 | 2021-11-07 01:09:27 |    1 |
+----+---------------------+------+
2 rows in set (0.00 sec)

Unique values in PDT are mapped to same values due to DST shift.

Under those conditions, the comparison in the WHERE clause occurs in different ways for non-indexed and indexed lookups and leads to different results:

In case of indexed lookup, the optimizer performs an index scan, first converting the search value from the session time zone to UTC, then comparing the result to the UTC index entries:

Conversion from session time zone to UTC:

mysql> SELECT CONVERT_TZ('2021-11-07 01:09:22', 'SYSTEM', 'UTC');
+----------------------------------------------------+
| CONVERT_TZ('2021-11-07 01:09:22', 'SYSTEM', 'UTC') |
+----------------------------------------------------+
| 2021-11-07 09:09:22                                |
+----------------------------------------------------+
1 row in set (0.00 sec)

Indexes contain the following UTC values :

mysql> set time_zone="UTC";
mysql> SELECT id, ts FROM t1;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2021-11-07 08:09:27 |
|  2 | 2021-11-07 09:09:27 |
+----+---------------------+
2 rows in set (0.01 sec)

In this case, the (converted) search value is matched only to index entries, and because the index entries for the distinct stored UTC values are also distinct, the search value can match only one of them.

mysql> SELECT * FROM t1 FORCE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00';
+----+---------------------+------+
| id | ts                  | data |
+----+---------------------+------+
|  2 | 2021-11-07 01:09:27 |    1 |
+----+---------------------+------+
1 row in set (0.01 sec)

If there is no index or the optimizer cannot use it, comparisons occur in the session time zone. The optimizer performs a table scan in which it retrieves each ts column value, converts it from UTC to the session time zone, and compares it to the search value (also interpreted in the session time zone):

mysql> SELECT * FROM t1 FORCE INDEX (PRIMARY) WHERE ts >= '2021-11-07 01:09:22-07:00';
+----+---------------------+------+
| id | ts                  | data |
+----+---------------------+------+
|  1 | 2021-11-07 01:09:27 |    1 |
|  2 | 2021-11-07 01:09:27 |    1 |
+----+---------------------+------+
2 rows in set (0.00 sec)

Possible workarounds:
1. If it is important to return all ts values that match in the session time zone, the workaround is to suppress use of the index with an IGNORE INDEX hint:

mysql> SELECT * FROM t1 IGNORE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00';
+----+---------------------+------+
| id | ts                  | data |
+----+---------------------+------+
|  1 | 2021-11-07 01:09:27 |    1 |
|  2 | 2021-11-07 01:09:27 |    1 |
+----+---------------------+------+
2 rows in set (0.00 sec)

2. To avoid table scan, as a workaround, a functional index can be created.
Eg: 
 CREATE INDEX ts_idx ON t1 ((CONVERT_TZ(ts, 'SYSTEM', 'UTC'))); 
 SELECT * FROM t1 WHERE CONVERT_TZ(ts, 'SYSTEM', 'UTC') >=  
 CONVERT_TZ('2021-11-07 01:09:22', 'SYSTEM', 'UTC');

Ref: https://dev.mysql.com/doc/refman/8.0/en/timestamp-lookups.html

Since this is an expected behavior according the current code design, Closing it as 'not a bug'.