Bug #105078 | Queries using timestamp indexes and timezones return back the wrong data | ||
---|---|---|---|
Submitted: | 29 Sep 2021 17:00 | Modified: | 17 Nov 2021 8:23 |
Reporter: | Herman Lee | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 8.0.20, 8.0.26 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[29 Sep 2021 17:00]
Herman Lee
[30 Sep 2021 6:15]
MySQL Verification Team
Hello Herman Lee, Thank you for the report and test case. Verified as described with 8.0.26 build. regards, Umesh
[30 Sep 2021 6:20]
MySQL Verification Team
-- rm -rf 105078/ bin/mysqld --initialize-insecure --basedir=/home/umshastr/Downloads/mysql-8.0.26 --datadir=/home/umshastr/Downloads/mysql-8.0.26/105078 --log-error-verbosity=3 bin/mysqld_safe --no-defaults --basedir=/home/umshastr/Downloads/mysql-8.0.26 --datadir=/home/umshastr/Downloads/mysql-8.0.26/105078 --log-error-verbosity=3 --secure-file-priv=/tmp/ 2>&1 & bin/mysql_tzinfo_to_sql /usr/share/zoneinfo |bin/mysql -uroot mysql umshastr@umshastr-ubuntu:~/Downloads/mysql-8.0.26$ bin/mysql -uroot Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.26 MySQL Community Server - GPL Copyright (c) 2000, 2021, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> show global variables like '%time_zone%'; +------------------+--------+ | Variable_name | Value | +------------------+--------+ | system_time_zone | PDT | | time_zone | SYSTEM | +------------------+--------+ 2 rows in set (0.01 sec) mysql> create database test; Query OK, 1 row affected (0.05 sec) mysql> use test Database changed mysql> CREATE TABLE IF NOT EXISTS `t1` ( -> `id` bigint unsigned NOT NULL AUTO_INCREMENT, -> `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, -> `data` int, -> KEY `ts` (`ts`), -> PRIMARY KEY (`id`) -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1; Query OK, 0 rows affected (0.14 sec) mysql> mysql> INSERT INTO t1 (id, ts, data) VALUES (123456789, '2021-11-07 01:09:27-07:00', 1); Query OK, 1 row affected (0.02 sec) mysql> select * from t1; +-----------+---------------------+------+ | id | ts | data | +-----------+---------------------+------+ | 123456789 | 2021-11-07 01:09:27 | 1 | +-----------+---------------------+------+ 1 row in set (0.00 sec) mysql> SELECT * FROM t1 FORCE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00'; Empty set (0.00 sec) mysql> SELECT * FROM t1 FORCE INDEX (PRIMARY) WHERE ts >= '2021-11-07 01:09:22-07:00'; +-----------+---------------------+------+ | id | ts | data | +-----------+---------------------+------+ | 123456789 | 2021-11-07 01:09:27 | 1 | +-----------+---------------------+------+ 1 row in set (0.00 sec) mysql> show global variables like '%time_zone%'; +------------------+--------+ | Variable_name | Value | +------------------+--------+ | system_time_zone | PDT | | time_zone | SYSTEM | +------------------+--------+ 2 rows in set (0.00 sec) mysql> system date +%Z PDT mysql> system cat docs/INFO_SRC commit: f64af5bfff4c3a65eb44d9e513ca1c882850b030 date: 2021-07-01 07:17:38 +0200 build-date: 2021-07-01 07:54:49 +0000 short: f64af5bfff4 branch: mysql-8.0.26-release MySQL source 8.0.26 mysql>
[17 Nov 2021 8:23]
Vikram Gurram
The behavior mentioned in the bug in subject is expected and also documented. See analysis below. mysql> show global variables like '%time_zone%'; +------------------+--------+ | Variable_name | Value | +------------------+--------+ | system_time_zone | PDT | | time_zone | SYSTEM | +------------------+--------+ 2 rows in set (0.01 sec) mysql> CREATE TABLE IF NOT EXISTS `t1` ( -> `id` bigint unsigned NOT NULL AUTO_INCREMENT, -> `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE -> CURRENT_TIMESTAMP, -> `data` int, -> KEY `ts` (`ts`), -> PRIMARY KEY (`id`) -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1; Query OK, 0 rows affected (0.11 sec) mysql> INSERT INTO t1 (id, ts, data) VALUES (1, '2021-11-07 01:09:27-07:00', 1); Query OK, 1 row affected (0.01 sec) mysql> INSERT INTO t1 (id, ts, data) VALUES (2, '2021-11-07 02:09:27-07:00', 1); Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM t1; +----+---------------------+------+ | id | ts | data | +----+---------------------+------+ | 1 | 2021-11-07 01:09:27 | 1 | | 2 | 2021-11-07 01:09:27 | 1 | +----+---------------------+------+ 2 rows in set (0.00 sec) Unique values in PDT are mapped to same values due to DST shift. Under those conditions, the comparison in the WHERE clause occurs in different ways for non-indexed and indexed lookups and leads to different results: In case of indexed lookup, the optimizer performs an index scan, first converting the search value from the session time zone to UTC, then comparing the result to the UTC index entries: Conversion from session time zone to UTC: mysql> SELECT CONVERT_TZ('2021-11-07 01:09:22', 'SYSTEM', 'UTC'); +----------------------------------------------------+ | CONVERT_TZ('2021-11-07 01:09:22', 'SYSTEM', 'UTC') | +----------------------------------------------------+ | 2021-11-07 09:09:22 | +----------------------------------------------------+ 1 row in set (0.00 sec) Indexes contain the following UTC values : mysql> set time_zone="UTC"; mysql> SELECT id, ts FROM t1; +----+---------------------+ | id | ts | +----+---------------------+ | 1 | 2021-11-07 08:09:27 | | 2 | 2021-11-07 09:09:27 | +----+---------------------+ 2 rows in set (0.01 sec) In this case, the (converted) search value is matched only to index entries, and because the index entries for the distinct stored UTC values are also distinct, the search value can match only one of them. mysql> SELECT * FROM t1 FORCE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00'; +----+---------------------+------+ | id | ts | data | +----+---------------------+------+ | 2 | 2021-11-07 01:09:27 | 1 | +----+---------------------+------+ 1 row in set (0.01 sec) If there is no index or the optimizer cannot use it, comparisons occur in the session time zone. The optimizer performs a table scan in which it retrieves each ts column value, converts it from UTC to the session time zone, and compares it to the search value (also interpreted in the session time zone): mysql> SELECT * FROM t1 FORCE INDEX (PRIMARY) WHERE ts >= '2021-11-07 01:09:22-07:00'; +----+---------------------+------+ | id | ts | data | +----+---------------------+------+ | 1 | 2021-11-07 01:09:27 | 1 | | 2 | 2021-11-07 01:09:27 | 1 | +----+---------------------+------+ 2 rows in set (0.00 sec) Possible workarounds: 1. If it is important to return all ts values that match in the session time zone, the workaround is to suppress use of the index with an IGNORE INDEX hint: mysql> SELECT * FROM t1 IGNORE INDEX (ts) WHERE ts >= '2021-11-07 01:09:22-07:00'; +----+---------------------+------+ | id | ts | data | +----+---------------------+------+ | 1 | 2021-11-07 01:09:27 | 1 | | 2 | 2021-11-07 01:09:27 | 1 | +----+---------------------+------+ 2 rows in set (0.00 sec) 2. To avoid table scan, as a workaround, a functional index can be created. Eg: CREATE INDEX ts_idx ON t1 ((CONVERT_TZ(ts, 'SYSTEM', 'UTC'))); SELECT * FROM t1 WHERE CONVERT_TZ(ts, 'SYSTEM', 'UTC') >= CONVERT_TZ('2021-11-07 01:09:22', 'SYSTEM', 'UTC'); Ref: https://dev.mysql.com/doc/refman/8.0/en/timestamp-lookups.html Since this is an expected behavior according the current code design, Closing it as 'not a bug'.