Bug #99575 insert into t (c1,...) values(%s,...) as v on duplicate key update c1=v.c1 slow
Submitted: 14 May 2020 15:26 Modified: 9 Feb 2023 20:02
Reporter: Julian Briggs Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / Python Severity:S3 (Non-critical)
Version:8.0.20, 8.0.25 OS:Ubuntu (20.04)
Assigned to: CPU Architecture:ARM

[14 May 2020 15:26] Julian Briggs
Description:
Very slow executemany("insert into t (c1,...) ... on duplicate key update ...", data)
data=[(0,1,2,3...),]
with >24 columns 
when using the new row alias syntax:
VALUES(%s,%s,%s,...) as v on duplicate key update c1=v.c1, ...
(Old syntax on duplicate key update c1=VALUES(c1) is deprecated from 8.0.20)

Run time doubles with each extra column
n cols 21
Run time: 1.0900123119354248
n cols 22
Run time: 2.0359716415405273
n cols 23
Run time: 4.018187046051025
n cols 24
Run time: 8.130314826965332
n cols 25
Run time: 15.858994007110596

I expected similar runtime (under 1 second) to executemany with statement in old syntax.

mysql-connector-python-py3 8.0.20-1ubuntu20.04 all MySQL database driver written in pure Python 3
mysql-client-8.0 8.0.20-0ubuntu0.20.04.1 amd64 MySQL database client binaries
mysql-community-server 8.0.20-1ubuntu18.04 amd64 MySQL Server

How to repeat:
statement="insert into test.mysql_on_dup_bug (id,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17,c18,c19,c20,c21,c22,c23,c24) 
values (%s ,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s) as v 
on duplicate key update c0=v.c0,c1=v.c1,c2=v.c2,c3=v.c3,c4=v.c4,c5=v.c5,c6=v.c6,c7=v.c7,c8=v.c8,c9=v.c9,c10=v.c10,c11=v.c11,
c12=v.c12,c13=v.c13,c14=v.c14,c15=v.c15,c16=v.c16,c17=v.c17,c18=v.c18,c19=v.c19,c20=v.c20,c21=v.c21,c22=v.c22,c23=v.c23,c24=v.c24",[(1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24)])

cursor.executemany(statement, data)
Run time: 15.914669752120972

Suggested fix:
Make performance with new alias syntax, eg:
c1=v.c1  

match that with old VALUES syntax, eg
c1=VALUES(c1,...)
[14 May 2020 15:32] Julian Briggs
Python script to demonstrate the bug.

Attachment: mysql_on_dup_bug.py (text/x-python), 4.86 KiB.

[15 May 2020 10:20] MySQL Verification Team
Hello Julian Briggs,

Thank you for the report and test case.

regards,
Umesh
[9 Feb 2023 20:02] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Connector/Python 8.0.33 release, and here's the proposed changelog entry from the documentation team:

Executing executemany() with insert statements that used 'on duplicate
key update' demonstrated poor performance; improved speed by fixing the
associated regular expressions that match insert statements.

Thank you for the bug report.