Commit de7e142

Author: Arun Kuruvila
Bug#27970159: DUPLICATE ENTRY FOR KEY '<GROUP_KEY>' ERROR
Description: A query involving GROUP BY on a TIMESTAMP column fails with a duplicate entry for key (ER_DUP_ENTRY) error.

Analysis: The problem arises when timestamp values are inserted into a table under one time zone and fetched later under a different time zone, such that the inserted values fall in the DST switch hour of the new time zone. The same wall-clock time can occur twice during the DST switching hour. E.g. 2019-10-27 03:30 in the EET time zone, where DST ends on 27 October 2019 at 04:00: 03:30 occurs first during DST, and occurs again after the clock is set back from 04:00 to 03:00 on 2019-10-27.

Consider the following example:

    CREATE TABLE t1 (c1 TIMESTAMP);
    SET TIME_ZONE = '+00:00';
    INSERT INTO t1 VALUES ('2019-10-27 00:30:00');
    INSERT INTO t1 VALUES ('2019-10-27 00:30:00');
    INSERT INTO t1 VALUES ('2019-10-27 01:30:00');
    INSERT INTO t1 VALUES ('2019-10-27 01:30:00');
    SET TIME_ZONE = 'SYSTEM';

Four timestamp values are inserted into t1, and the time zone is then changed to EET (the system time zone). Once the time zone is EET, both 00:30:00 and 01:30:00 map to the same local time, 03:30:00, because of the DST switch in EET on 2019-10-27 at 04:00.

    mysql> SELECT * FROM t1;
    +---------------------+
    | c1                  |
    +---------------------+
    | 2019-10-27 03:30:00 |
    | 2019-10-27 03:30:00 |
    | 2019-10-27 03:30:00 |
    | 2019-10-27 03:30:00 |
    +---------------------+
    4 rows in set (0.00 sec)

Now consider a query with GROUP BY on the TIMESTAMP column c1:

    SELECT COUNT(*) FROM t1 GROUP BY c1;

While executing this query, the timestamp values are fetched from the table in their binary representation, where they are stored as 00:30:00 and 01:30:00; call these B1 and B2. However, all 4 values map to the same timestamp value, 03:30, in the current time zone (EET). When that value is converted back to binary format, all 4 values get the same binary representation, B2.

To group the data for GROUP BY execution, the optimizer creates a hash map keyed on the GROUP BY field values, i.e. the binary representation of the timestamp values fetched from the table. During grouping, the 1st row ('2019-10-27 00:30:00') is encountered with key B1. Since it is the 1st occurrence of B1, the record is inserted into the hash. But the record inserted is the binary representation of the timestamp value in the EET time zone, i.e. B2. Then the 2nd row ('2019-10-27 00:30:00'), with the same key B1, is encountered. The hash search with key B1 cannot find the record in the hash, since the 1st record was inserted as B2. The record is therefore inserted into the hash again, which results in a duplicate key error.

Fix: When the server is about to return the DUPLICATE ENTRY FOR KEY '<GROUP_KEY>' error and the grouping involves a TIMESTAMP field, throw this error instead: "Grouping on temporal is non-deterministic for timezones having DST. Please consider switching to UTC for this query."

Approved by: Guilhem Bichot <guilhem.bichot@oracle.com>
Approved by: Jens Even Blomsoy <jens.even.blomsoy@oracle.com>
Change-Id: Ic15f04ea9e9b3c04bcc1ec97e47172726a082f9b
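
The lookup/insert mismatch described in the analysis can be reproduced with a toy model. The sketch below is not server code: `utc_to_local`, `local_to_utc`, and `group_hits_duplicate` are hypothetical names, times are plain seconds counted from 2019-10-27 00:00:00 UTC, and the DST rule is hard-coded for the EET fall-back (UTC+3 before 01:00 UTC, UTC+2 from then on). It assumes, as the commit message states, that an ambiguous local time converts back to the later instant (B2).

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy model only. Times are seconds since 2019-10-27 00:00:00 UTC.
constexpr int64_t kHour = 3600;
constexpr int64_t kDstEndUtc = 1 * kHour;  // EET leaves DST at 01:00 UTC

// UTC instant -> local wall-clock time (UTC+3 during DST, UTC+2 after).
int64_t utc_to_local(int64_t utc) {
  return utc + (utc < kDstEndUtc ? 3 * kHour : 2 * kHour);
}

// Local wall-clock time -> UTC instant. A wall-clock time in the repeated
// hour is ambiguous; this model resolves it to the later (standard-time)
// instant, which is the assumption taken from the commit message.
int64_t local_to_utc(int64_t local) {
  const int64_t as_standard = local - 2 * kHour;
  if (as_standard >= kDstEndUtc) return as_standard;
  return local - 3 * kHour;
}

// Mimics the grouping step: search with the key as stored in the table,
// but insert the key obtained after a round trip through local time.
// Returns true if an insert collides with an existing key.
bool group_hits_duplicate(const std::vector<int64_t>& stored_keys) {
  std::unordered_map<int64_t, int> groups;  // key -> COUNT(*)
  for (int64_t stored : stored_keys) {
    auto it = groups.find(stored);  // search with B1
    if (it != groups.end()) {
      ++it->second;
      continue;
    }
    const int64_t reinserted = local_to_utc(utc_to_local(stored));  // may be B2
    // The round trip always preserves the wall-clock time, never the instant.
    assert(utc_to_local(reinserted) == utc_to_local(stored));
    if (!groups.emplace(reinserted, 1).second) {
      return true;  // plays the role of the duplicate key error
    }
  }
  return false;
}
```

With the keys {1800, 1800, 5400, 5400} (00:30 and 01:30 UTC, twice each), the first row is searched as 1800 but stored as 5400, so the second row misses the lookup and collides on insert. Keys outside the repeated hour, such as {9000, 9000}, survive the round trip unchanged and group normally.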
1 parent 729aab6 commit de7e142

4 files changed: +66 -1 lines changed

mysql-test/r/timezone.result

Lines changed: 17 additions & 0 deletions
@@ -45,3 +45,20 @@ unix_timestamp('2038-01-19 04:14:07'),
 unix_timestamp('2038-01-19 04:14:08');
 unix_timestamp('1970-01-01 01:00:00') unix_timestamp('1970-01-01 01:00:01') unix_timestamp('2038-01-19 04:14:07') unix_timestamp('2038-01-19 04:14:08')
 0 1 2147483647 0
+CREATE TABLE t1 (c1 TIMESTAMP);
+SET TIME_ZONE = '+00:00';
+INSERT INTO t1 VALUES('2019-10-27 00:47:42'), ('2019-10-27 00:47:42');
+INSERT INTO t1 VALUES('2019-10-27 01:47:42'), ('2019-10-27 01:47:42');
+SET TIME_ZONE = 'SYSTEM';
+SELECT DISTINCT c1 FROM t1;
+c1
+2019-10-27 02:47:42
+SELECT COUNT(*) FROM t1 GROUP BY c1;
+ERROR HY000: Grouping on temporal is non-deterministic for timezones having DST. Please consider switching to UTC for this query.
+SET sql_mode='';
+Warnings:
+Warning 3090 Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release.
+SELECT COUNT(*) FROM t1 GROUP BY c1;
+ERROR HY000: Grouping on temporal is non-deterministic for timezones having DST. Please consider switching to UTC for this query.
+SET sql_mode=DEFAULT;
+DROP TABLE t1;

mysql-test/t/timezone.test

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -62,3 +62,28 @@ select unix_timestamp('1970-01-01 01:00:00'),
6262
unix_timestamp('2038-01-19 04:14:08');
6363

6464
# End of 4.1 tests
65+
66+
#
67+
# Bug #27970159: DUPLICATE ENTRY FOR KEY '<GROUP_KEY>' ERROR
68+
#
69+
70+
CREATE TABLE t1 (c1 TIMESTAMP);
71+
SET TIME_ZONE = '+00:00';
72+
73+
INSERT INTO t1 VALUES('2019-10-27 00:47:42'), ('2019-10-27 00:47:42');
74+
INSERT INTO t1 VALUES('2019-10-27 01:47:42'), ('2019-10-27 01:47:42');
75+
76+
SET TIME_ZONE = 'SYSTEM';
77+
78+
SELECT DISTINCT c1 FROM t1;
79+
80+
--error ER_GROUPING_ON_TIMESTAMP_IN_DST
81+
SELECT COUNT(*) FROM t1 GROUP BY c1;
82+
83+
SET sql_mode='';
84+
--error ER_GROUPING_ON_TIMESTAMP_IN_DST
85+
SELECT COUNT(*) FROM t1 GROUP BY c1;
86+
SET sql_mode=DEFAULT;
87+
88+
#Cleanup
89+
DROP TABLE t1;

sql/share/errmsg-utf8.txt

Lines changed: 3 additions & 0 deletions
@@ -7774,6 +7774,9 @@ ER_XA_REPLICATION_FILTERS
 ER_CANT_OPEN_ERROR_LOG
   eng "Could not open file '%s' for error logging%s%s"
 
+ER_GROUPING_ON_TIMESTAMP_IN_DST
+  eng "Grouping on temporal is non-deterministic for timezones having DST. Please consider switching to UTC for this query."
+
 #
 # End of 5.7 error messages.
 #

sql/sql_executor.cc

Lines changed: 21 additions & 1 deletion
@@ -1,4 +1,4 @@
-/* Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
+/* Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -3506,6 +3506,26 @@ end_update(JOIN *join, QEP_TAB *const qep_tab, bool end_of_records)
   init_tmptable_sum_functions(join->sum_funcs);
   if ((error=table->file->ha_write_row(table->record[0])))
   {
+    /*
+      If the error is HA_ERR_FOUND_DUPP_KEY and the grouping involves a
+      TIMESTAMP field, throw a meaningfull error to user with the actual
+      reason and the workaround. I.e, "Grouping on temporal is
+      non-deterministic for timezones having DST. Please consider switching
+      to UTC for this query". This is a temporary measure until we implement
+      WL#13148 (Do all internal handling TIMESTAMP in UTC timezone), which
+      will make such problem impossible.
+    */
+    if (error == HA_ERR_FOUND_DUPP_KEY)
+    {
+      for (group=table->group ; group ; group=group->next)
+      {
+        if (group->field->type() == MYSQL_TYPE_TIMESTAMP)
+        {
+          my_error(ER_GROUPING_ON_TIMESTAMP_IN_DST, MYF(0));
+          DBUG_RETURN(NESTED_LOOP_ERROR);
+        }
+      }
+    }
     if (create_ondisk_from_heap(join->thd, table,
                                 tmp_tbl->start_recinfo,
                                 &tmp_tbl->recinfo,
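
Reduced to its decision logic, the check added to end_update() might be sketched as follows. The types `FieldType`, `GroupItem`, and `WriteError` and the function `should_report_dst_error` are hypothetical stand-ins for illustration; they are not the server's own group list, field classes, or handler error codes.

```cpp
#include <cassert>

// Hypothetical stand-ins for the server's field types and write errors.
enum FieldType { TYPE_LONG, TYPE_VARCHAR, TYPE_TIMESTAMP };
enum WriteError { WRITE_OK = 0, WRITE_DUP_KEY, WRITE_TABLE_FULL };

// One entry of the GROUP BY list, linked the way the diff walks it
// (group = group->next until null).
struct GroupItem {
  FieldType type;
  const GroupItem* next;
};

// Returns true when a failed temporary-table write should be reported with
// the DST-specific message: the failure is a duplicate key and at least one
// grouping column is a TIMESTAMP. Any other failure falls through to the
// existing handling.
bool should_report_dst_error(WriteError error, const GroupItem* group) {
  assert(error != WRITE_OK);  // only meaningful after a failed write
  if (error != WRITE_DUP_KEY) return false;
  for (const GroupItem* g = group; g != nullptr; g = g->next) {
    if (g->type == TYPE_TIMESTAMP) return true;
  }
  return false;
}
```

Note that the check keys off the column type alone, so any duplicate key failure on a grouping that includes a TIMESTAMP column is reported with the DST message, whatever its actual cause.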
