Bug #69067 MySQLCursor.executemany() doesn't like unicode operations
Submitted: 25 Apr 2013 12:06 Modified: 12 Mar 2014 0:52
Reporter: Dorian Kind Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / Python Severity:S3 (Non-critical)
Version:1.0.9 OS:Any
Assigned to: Peeyush Gupta CPU Architecture:Any

[25 Apr 2013 12:06] Dorian Kind
Description:
When calling MySQLCursor.executemany(), passing a unicode instance as operation parameter will cause a UnicodeDecodeError when non-ascii characters exist in seq_params.

This happens because of line 437:

  values.append(fmt % self._process_params(params))

_process_params() returns UTF8-encoded strings per default, which cause the exception when they are used to format the unicode instance fmt.

How to repeat:
>> Create a test table:

CREATE TABLE `test_table` (
  `text` varchar(127) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

>> In Python, execute the following:

from mysql.connector import MySQLConnection
from mysql.connector.cursor import MySQLCursor

conn = MySQLConnection(user='test', password='test', database='test_db')
cur = MySQLCursor(connection=conn)

cur.executemany(u'INSERT INTO test_table (text) VALUES (%(text)s)', [{'text': u'non-ascii: éüäöø©'}])

>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12: ordinal not in range(128)

Suggested fix:
Use a similar logic in executemany() as in execute(). In the latter function, the following check is applied:

try:
  if isinstance(operation, unicode):
    operation = operation.encode(self._connection.charset)
except (UnicodeDecodeError, UnicodeEncodeError), e:
  raise errors.ProgrammingError(str(e))

If this approach was adapted in executemany(), the error should no longer occur.
[6 Jun 2013 11:33] Geert Vanderkelen
Can't repeat using the below script, using 1.0.10:

# -*- coding: utf-8 -*-

import sys
sys.path.insert(0, 'python{0}/'.format(sys.version_info[0]))

import mysql.connector

cnx = mysql.connector.connect(user='root', database='test')
cur = cnx.cursor()

cur.execute("DROP TABLE IF EXISTS t1")
cur.execute("CREATE TABLE t1 (c1 VARCHAR(100)) CHARACTER SET 'utf8'")

data = [
	{ 'c1': u'database' },
	{ 'c1': u'データベース' },
    { 'c1': u'데이터베이스' },
    ]

stmt = "INSERT INTO t1 (c1) VALUES (%(c1)s)"
cur.executemany(stmt, data)
cnx.commit()

cur.execute("SELECT c1 FROM t1")
for row in cur:
    print(row[0])

cnx.close()
[26 Jun 2013 9:57] Dorian Kind
Hi Geert,

thanks for looking into this. If you change the line

> stmt = "INSERT INTO t1 (c1) VALUES (%(c1)s)"

to

> stmt = u"INSERT INTO t1 (c1) VALUES (%(c1)s)"

in your script, then the Exception will be thrown even in 1.0.10. As long as the operation is not a Unicode instance, it works.
[12 Mar 2014 0:52] Paul DuBois
Noted in 1.2.1 changelog.

MySQLCursor.executemany() caused a UnicodeDecodeError when non-ASCII
characters existed in the seq_params parameter and the operation was
a Unicode instance with Python 2. This is now corrected by encoding
the operation per the current connection character set.