Bug #70596 ProgrammingError: Character set 'utf8mb4' unsupported.
Submitted: 10 Oct 2013 19:01 Modified: 18 Dec 2013 17:40
Reporter: Florian Nierhaus
Status: Closed
Category:Connector / Python Severity:S3 (Non-critical)
Version:1.0.10 OS:Any
Assigned to: Geert Vanderkelen CPU Architecture:Any
Tags: utf8mb4 utf8

[10 Oct 2013 19:01] Florian Nierhaus
Description:
The utf8mb4_* collations are missing from mysql.connector.constants.CharacterSet.desc.

utf8mb4 is supported in MySQL 5.5.3+ and is required for 4-byte UTF-8 support.

The hard-coded constants copied from the MySQL tables are out of date and need to be updated.

How to repeat:
Create a table with some VARCHAR columns using CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci.

Read and write emoticons (characters that need 4 bytes in UTF-8).

Suggested fix:
Run
SELECT id, collation_name FROM information_schema.collations ORDER BY id;
when the first connection to MySQL is made, build the CharacterSet data dynamically from the result, and then cache it.
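
A rough sketch of the dynamic approach, in case it helps. build_desc and get_desc are hypothetical helper names, not part of Connector/Python, and the query is extended with the character_set_name and is_default columns of information_schema.collations so that the same (charset, collation, is_default) tuples as in the hard-coded list can be produced. It assumes a DB-API cursor that returns the columns as plain strings.

```python
# Sketch only: build a CharacterSet.desc-style table from server metadata.
# build_desc and get_desc are hypothetical helpers, not Connector/Python API.

QUERY = (
    "SELECT id, character_set_name, collation_name, is_default "
    "FROM information_schema.collations ORDER BY id"
)

_desc_cache = None  # filled on first use, then reused


def build_desc(rows):
    """Return a list where index == collation ID; unused IDs hold None."""
    rows = list(rows)
    if not rows:
        return []
    desc = [None] * (max(row[0] for row in rows) + 1)
    for coll_id, charset, collation, is_default in rows:
        # information_schema reports IS_DEFAULT as 'Yes' or ''
        desc[coll_id] = (charset, collation, is_default == "Yes")
    return desc


def get_desc(cursor):
    """Query the server once and cache the result for later calls."""
    global _desc_cache
    if _desc_cache is None:
        cursor.execute(QUERY)
        _desc_cache = build_desc(cursor.fetchall())
    return _desc_cache
```

The module-level cache is the simplest option; a per-connection cache would be safer if one process talks to servers of different versions.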

A quick-and-dirty fix is to update the hard-coded values (see the patch below). While you are at it, you may want to add other missing values as well.

--- a/python2/mysql/connector/constants.py
+++ b/python2/mysql/connector/constants.py
@@ -672,6 +672,39 @@ class CharacterSet(_constants):
       ("utf8","utf8_persian_ci",False), # 208
       ("utf8","utf8_esperanto_ci",False), # 209
       ("utf8","utf8_hungarian_ci",False), # 210
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      None,
+      ("utf8mb4","utf8mb4_unicode_ci",False), # 224 
+      ("utf8mb4","utf8mb4_icelandic_ci",False), # 225 
+      ("utf8mb4","utf8mb4_latvian_ci",False), # 226 
+      ("utf8mb4","utf8mb4_romanian_ci",False), # 227 
+      ("utf8mb4","utf8mb4_slovenian_ci",False), # 228 
+      ("utf8mb4","utf8mb4_polish_ci",False), # 229 
+      ("utf8mb4","utf8mb4_estonian_ci",False), # 230 
+      ("utf8mb4","utf8mb4_spanish_ci",False), # 231 
+      ("utf8mb4","utf8mb4_swedish_ci",False), # 232 
+      ("utf8mb4","utf8mb4_turkish_ci",False), # 233 
+      ("utf8mb4","utf8mb4_czech_ci",False), # 234 
+      ("utf8mb4","utf8mb4_danish_ci",False), # 235 
+      ("utf8mb4","utf8mb4_lithuanian_ci",False), # 236 
+      ("utf8mb4","utf8mb4_slovak_ci",False), # 237 
+      ("utf8mb4","utf8mb4_spanish2_ci",False), # 238 
+      ("utf8mb4","utf8mb4_roman_ci",False), # 239 
+      ("utf8mb4","utf8mb4_persian_ci",False), # 240 
+      ("utf8mb4","utf8mb4_esperanto_ci",False), # 241 
+      ("utf8mb4","utf8mb4_hungarian_ci",False), # 242 
+      ("utf8mb4","utf8mb4_sinhala_ci",False), # 243 
     ]
 
     @classmethod
[17 Oct 2013 22:24] Florian Nierhaus
I forgot two lines:

--- a/python2/mysql/connector/constants.py
+++ b/python2/mysql/connector/constants.py
@@ -506,8 +506,8 @@ class CharacterSet(_constants):
       ("latin7","latin7_general_cs",False), # 42
       ("macce","macce_bin",False), # 43
       ("cp1250","cp1250_croatian_ci",False), # 44
-      None,
-      None,
+      ("utf8mb4","utf8mb4_general_ci",True), # 45
+      ("utf8mb4","utf8mb4_bin",False), # 46
       ("latin1","latin1_bin",False), # 47
       ("latin1","latin1_general_ci",False), # 48
       ("latin1","latin1_general_cs",False), # 49
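
Since both patches depend on each entry sitting at the list position equal to its collation ID, a small check against the server's own list can catch gaps like the two lines above. This is only a sketch: find_mismatches is a hypothetical helper, and the rows are assumed to be (id, character_set_name, collation_name, is_default) tuples as returned by a query on information_schema.collations. The sample data is made up to mirror the unpatched state around IDs 44 to 46.

```python
# Sketch only: compare a hard-coded collation table with rows from the server.
# find_mismatches is a hypothetical helper, not Connector/Python API.


def find_mismatches(desc, rows):
    """Yield (collation_id, hard_coded, from_server) for each differing entry."""
    for coll_id, charset, collation, is_default in rows:
        server_entry = (charset, collation, is_default == "Yes")
        # IDs beyond the end of the hard-coded list count as missing
        local_entry = desc[coll_id] if coll_id < len(desc) else None
        if local_entry != server_entry:
            yield coll_id, local_entry, server_entry


# Illustrative data: positions 45 and 46 are still None before the patch
hard_coded = [None] * 47
hard_coded[44] = ("cp1250", "cp1250_croatian_ci", False)

server_rows = [
    (44, "cp1250", "cp1250_croatian_ci", ""),
    (45, "utf8mb4", "utf8mb4_general_ci", "Yes"),
    (46, "utf8mb4", "utf8mb4_bin", ""),
]

for coll_id, local, server in find_mismatches(hard_coded, server_rows):
    print(coll_id, local, server)
```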
[18 Dec 2013 17:40] Paul DuBois
Noted in 1.1.5 changelog.

utf8mb4 was not recognized as a valid character set.