Bug #82948 LookupError: unknown encoding: utf8mb4
Submitted: 12 Sep 2016 14:12
Modified: 12 Jul 2017 15:32
Reporter: Robert K.
Status: Closed
Category: Connector / Python
Severity: S2 (Serious)
Version: 2.1.3 & 2.2
OS: Any
Assigned to:
CPU Architecture: Any

[12 Sep 2016 14:12] Robert K.
Description:
When using a connection with charset "utf8mb4", inserting multiple rows into a table fails with a LookupError: unknown encoding: utf8mb4

  File "C:\Python34-32\lib\site-packages\mysql\connector\cursor.py", line 616, in executemany
    stmt = self._batch_insert(operation, seq_params)
  File "C:\Python34-32\lib\site-packages\mysql\connector\cursor.py", line 546, in _batch_insert
    fmt = matches.group(1).encode(self._connection.charset)

The main problem is that the mysql-connector passes "self._connection.charset" (the MySQL character set name) to encode() instead of "self._connection.python_charset" (the matching Python codec name).
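
The underlying mismatch can be checked without a database: "utf8mb4" is a MySQL character set name, and Python's codec registry does not know it. A minimal sketch using only the standard library (the helper name is_python_codec is made up for illustration):

```python
import codecs


def is_python_codec(name):
    """Return True if Python's codec registry recognizes the given name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# "utf8mb4" is only meaningful to MySQL; "utf8" is the Python codec,
# and it already covers 4-byte characters.
print(is_python_codec("utf8mb4"))
print(is_python_codec("utf8"))
```

Any str.encode() call that is handed the MySQL name directly therefore raises the LookupError shown in the traceback.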

How to repeat:
Connect to a MySQL database using charset=utf8mb4.

Insert multiple rows using the function MySQLCursor.executemany.

The inserted data should contain umlauts or other utf-8 characters.

Suggested fix:
Change line 546 in mysql\connector\cursor.py

fmt = matches.group(1).encode(self._connection.charset)

to 

fmt = matches.group(1).encode(self._connection.python_charset)

Note: Four lines below is a similar call, "operation.encode(self._connection.charset)". I assume this call should be changed, too.
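
The effect of the suggested change can be sketched in isolation. The class below is a hypothetical stand-in, not the connector's real connection class, and the utf8mb4-to-utf8 mapping in python_charset is an assumption about what that property does:

```python
class FakeConnection:
    """Hypothetical stand-in for a connector connection (not the real class)."""

    def __init__(self, charset):
        self.charset = charset  # MySQL character set name, e.g. "utf8mb4"

    @property
    def python_charset(self):
        # Assumed mapping: MySQL's utf8mb4 corresponds to Python's utf8 codec.
        if self.charset == "utf8mb4":
            return "utf8"
        return self.charset


def encode_fragment(connection, fragment):
    """Encode SQL text with the Python codec name, as the suggested fix does."""
    return fragment.encode(connection.python_charset)


conn = FakeConnection("utf8mb4")
encoded = encode_fragment(conn, "(%s, 'ü')")
print(encoded)
```

Encoding the same fragment with conn.charset instead reproduces the LookupError from this report.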
[13 Sep 2016 12:15] Chiranjeevi Battula
Hello Robert,

Thank you for the bug report.
Verified this behavior on MySQL Connector/Python 2.1.3 and 2.2.0.

Thanks,
Chiranjeevi.
[13 Sep 2016 12:15] Chiranjeevi Battula
output:

Traceback (most recent call last):
  File "D:/Python/82948.py", line 8, in <module>
    [("test data inserting", 5, 1, 8, 7.95 ),("test data inserting", 3, 2, 0, 3.95 ),("test data inserting", 0, 4, 3, 5.95 )
  File "C:\Python27\lib\site-packages\mysql\connector\cursor.py", line 616, in executemany
    stmt = self._batch_insert(operation, seq_params)
  File "C:\Python27\lib\site-packages\mysql\connector\cursor.py", line 546, in _batch_insert
    fmt = matches.group(1).encode(self._connection.charset)
LookupError: unknown encoding: utf8mb4
[12 Jul 2017 15:32] Paul DuBois
Posted by developer:
 
Fixed in 2.1.7.

With a connection character set of utf8mb4, multiple-row insert
operations failed with an error of "LookupError: unknown encoding:
utf8mb4".