Bug #90585 Cursor dictionary returning byte literal keys
Submitted: 24 Apr 2018 0:00 Modified: 10 May 2022 21:15
Reporter: Russell Sayers Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / Python Severity:S2 (Serious)
Version:8.0.11 OS:Any
Assigned to: CPU Architecture:Any

[24 Apr 2018 0:00] Russell Sayers
Description:
I can run this code on mysql-connector-pytho version 8.0.6, and I do not see the byte literal keys.  

When run on version 8.0.11 I see the byte literals.  In the code below the dictionary key is b'created_datetime' - where the b represents a byte literal in python 3.  I don't see this happening in other columns

I am running Python 3.6.2.

The table I am working with looks like this:

mysql> describe Photos.photo;
+------------------+--------------+------+-----+-------------------+-------+
| Field            | Type         | Null | Key | Default           | Extra |
+------------------+--------------+------+-----+-------------------+-------+
| object_key       | varchar(80)  | NO   | PRI | NULL              |       |
| labels           | varchar(200) | YES  |     | NULL              |       |
| description      | varchar(200) | YES  |     | NULL              |       |
| cognito_username | varchar(150) | YES  |     | NULL              |       |
| created_datetime | datetime     | YES  |     | CURRENT_TIMESTAMP |       |
+------------------+--------------+------+-----+-------------------+-------+

How to repeat:
## mysql-connector-python==8.0.6

>>> import mysql.connector
>>> conn = mysql.connector.connect(user="master", password="--snip--", host="--snip--", database="Photos") 
>>> cursor = conn.cursor(dictionary=True)
>>> cursor.execute("SELECT created_datetime from photo")
>>> cursor.fetchall()
[{'created_datetime': datetime.datetime(2018, 4, 23, 22, 50, 26)}, {'created_datetime': datetime.datetime(2018, 4, 23, 21, 16, 44)}, {'created_datetime': datetime.datetime(2018, 4, 23, 20, 51, 47)}, {'created_datetime': datetime.datetime(2018, 4, 23, 22, 55, 54)}, {'created_datetime': datetime.datetime(2018, 4, 23, 20, 41, 34)}]

## mysql-connector-python==8.0.11

>>> import mysql.connector
>>> conn = mysql.connector.connect(user="master", password="--snip--", host="--snip--", database="Photos") 
>>> cursor = conn.cursor(dictionary=True)
>>> cursor.execute("SELECT created_datetime from photo")
>>> cursor.fetchall()
[{b'created_datetime': datetime.datetime(2018, 4, 23, 22, 50, 26)}, {b'created_datetime': datetime.datetime(2018, 4, 23, 21, 16, 44)}, {b'created_datetime': datetime.datetime(2018, 4, 23, 20, 51, 47)}, {b'created_datetime': datetime.datetime(2018, 4, 23, 22, 55, 54)}, {b'created_datetime': datetime.datetime(2018, 4, 23, 20, 41, 34)}]

Suggested fix:
Return to the plain string keys for the returned dictionary. This is a regression that will impact a lot of users.
[26 Apr 2018 19:55] Russell Sayers
Hello Chiranjeevi, let me know if you need any help reproducing the issue.  I could try reproduce the issue in a docker container and make it available for you.
[27 Apr 2018 11:22] Chiranjeevi Battula
Hello Russell Sayers,

Thank you for the bug report and test case.
Verified this behavior on MySQL Connector/Python 8.0.11.

Thanks,
Chiranjeevi.
[27 Apr 2018 11:23] Chiranjeevi Battula
Results:
## mysql-connector-python==8.0.6

>>> import mysql.connector
>>> conn = mysql.connector.connect(user="root", password="chiran", host="localhost", database="test", port =3306)
>>> cursor = conn.cursor(dictionary=True)
>>> cursor.execute("SELECT created_datetime from photo")
>>> cursor.fetchall()
[{'created_datetime': datetime.datetime(2018, 4, 27, 15, 51, 42)}, {'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 30)}, {'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 32)}, {'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 34)}, {'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 36)}]
>>> 

## mysql-connector-python==8.0.11

>>> 
>>> import mysql.connector
>>> conn = mysql.connector.connect(user="root", password="chiran", host="localhost", database="test", port =3306)
>>> cursor = conn.cursor(dictionary=True)
>>> cursor.execute("SELECT created_datetime from photo")
>>> cursor.fetchall()
[{b'created_datetime': datetime.datetime(2018, 4, 27, 15, 51, 42)}, {b'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 30)}, {b'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 32)}, {b'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 34)}, {b'created_datetime': datetime.datetime(2018, 4, 27, 15, 57, 36)}]
>>>
[27 May 2018 12:00] Arek M
C extension is causing this.

If you use pure python implementation (use_pure=False in connect()) then you get str. If use_pure=True (default now in 8.0.11) you get bytes.

All versions seem to be affected, including 2.1.3 release if use_pure is set to False.

Bug was introduced by:

commit 1a3e3c1d9dbc0829a335e3d30cde8d4cbe849699
Author: Geert Vanderkelen <geert.vanderkelen@oracle.com>
Date:   Sun Aug 23 14:02:44 2015 +0200

    BUG21535573: Fix character decoding of identifiers using CExtension
    
    We fix in CExtension the decoding of column names, table name and
    database name in field information of result sets. When the decoding
    fails, an error is now raised (instead of a segmentation fault).
    
    A test case was added for BUG21535573 which is testing GBK, SJIS and
    BIG5 characters in column names.
    We also update a few other tests cases.
    (cherry picked from commit 6e4649c)
[27 May 2018 12:02] Arek M
Obviously I meant:

If you use pure python implementation (use_pure=True in connect()) then you get str. If use_pure=False (default now in 8.0.11) you get bytes.
[27 May 2018 15:08] Russell Sayers
Thanks Arkadiusz - ahh, I see use_pure false is now the default. For now I'll switch my code to use_pure=true.
[27 May 2018 15:13] Arek M
mytopy_string() is doing that:

#ifndef PY3
        return PyStringFromStringAndSize(data, length);
#else
        return PyBytes_FromStringAndSize(data, length);
#endif

and mytopy_string is used for conversion after commit 1a3e3c1d9dbc0829a335e3d30cde8d4cbe849699
[28 May 2018 6:11] Arek M
From other project:

https://github.com/sequelpro/sequelpro/issues/2700

"The BINARY_FLAG in MySQL indicates that a field is using a _bin collation for the server side column type.
Most notably, it has NO relation to

whether a field contains binary or string data
whether the representation of the data we received from the server (which goes through charset conversion) is some kind of binary data or not
The only thing that decides if the representation of the data should be treated as NSData * or NSString * is charsetnr == MAGIC_BINARY_CHARSET_NR (keep in mind: The value of charsetnr is the one of the charset we received in the result, not the actual charset of the column on the server!)

BINARY_FLAG has no actual worth to us."

so mytopy_string() is incorrectly using BINARY_FLAG.

https://dev.mysql.com/doc/refman/5.7/en/c-api-data-structures.html

"To distinguish between binary and nonbinary data for string data types, check whether the charsetnr value is 63. If so, the character set is binary, which indicates binary rather than nonbinary data. This enables you to distinguish BINARY from CHAR, VARBINARY from VARCHAR, and the BLOB types from the TEXT types."
[28 May 2018 6:58] Arek M
checking charset is enough and using BINARY_FLAG is incorrect in this context

Attachment: binary-bug-90585.patch (text/x-patch), 856 bytes.

[28 May 2018 8:06] Arek M
Use charsetnr for detecting binary field but assume metadata is always in utf8.

Attachment: 0001-Use-charsetnr-for-detecting-binary-results-in-mytopy.patch (text/x-patch), 6.23 KiB.

[28 May 2018 8:08] Arek M
doc updated

Attachment: 0001-Use-charsetnr-for-detecting-binary-results-in-mytopy.patch (text/x-patch), 6.74 KiB.

[14 Aug 2018 3:35] Jesper wisborg Krogh
Posted by developer:
 
I can reproduce the issue in 8.0.11 (but not in 8.0.12) under the following conditions:

* Microsoft Windows 10 (I don't expect this to be important)
* Python 3.6.6 (I suspect all 3.x versions will reproduce)
* MySQL Connector/Python 8.0.11 (likely exists in older releases as well)
* The C Extension is enabled (this is the default in 8.0.11+)
* A temporal column is selected.

Then using the following test program:

import mysql.connector

cnx = mysql.connector.connect(
    user="root",
    password="password",
    host="127.0.0.1",
    port=3306,
    # use_pure=True,
)

cursor = cnx.cursor(named_tuple=True)
cursor.execute("SELECT NOW() AS Now")
print(cursor.fetchone())
cursor.close()

cnx.close()

This fails like:

PS C:\> python test.py
Traceback (most recent call last):
  File "build\wininst\Lib\site-packages\mysql\connector\cursor_cext.py", line 763, in _handle_resultset
  File "C:\Users\myuser\AppData\Local\Programs\Python\Python36\lib\weakref.py", line 137, in __getitem__
    o = self.data[key]()
KeyError: (b'Now',)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 12, in <module>
    cursor.execute("SELECT NOW() AS Now")
  File "build\wininst\Lib\site-packages\mysql\connector\cursor_cext.py", line 270, in execute
  File "build\wininst\Lib\site-packages\mysql\connector\cursor_cext.py", line 163, in _handle_result
  File "build\wininst\Lib\site-packages\mysql\connector\cursor_cext.py", line 765, in _handle_resultset
  File "C:\Users\myuser\AppData\Local\Programs\Python\Python36\lib\collections\__init__.py", line 401, in namedtuple
    'identifiers: %r' % name)
ValueError: Type names and field names must be valid identifiers: "b'Now'"

With 8.0.12 the test program works:

PS C:\> python test.py
Row(Now=datetime.datetime(2018, 8, 14, 13, 25, 46))

So I suggest upgrading to Connector/Python 8.0.12.
[4 May 2022 14:26] Nuno Mariz
Posted by developer:
 
This bug was fixed in Connector/Python 8.0.12.
[10 May 2022 21:15] Philip Olson
This was fixed in 8.0.12 via WL #11951 and a host of other bug reports, such as MySQL Bug #90799 -- thank you for the bug report! Setting status to closed.