Bug #99866 Wrong cast from Python unicode to std::string
Submitted: 14 Jun 2020 5:37 Modified: 12 Feb 2021 21:42
Reporter: Naoki Inada
Status: Closed
Category: Connector / Python Severity: S3 (Non-critical)
Version: 8.0.20 OS: Any
Assigned to: CPU Architecture: Any

[14 Jun 2020 5:37] Naoki Inada
See this code:


#define PyString_CheckExact PyUnicode_CheckExact
#define PyString_AsString PyUnicode_AsUTF8
#define PyString_FromString PyUnicode_FromString
#define PyString_Size PyUnicode_GetSize


std::string python_cast<std::string>(PyObject* obj) {
  if (PyString_CheckExact(obj)) {
    return std::string(PyString_AsString(obj), PyString_Size(obj));

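PyString_AsString() (that is, PyUnicode_AsUTF8) returns a UTF-8 encoded buffer, but PyString_Size() (that is, PyUnicode_GetSize) returns the length of the wchar_t string, not the number of UTF-8 bytes. For non-ASCII text the two lengths differ, so the returned std::string will be cut off in the wrong place.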
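
To illustrate the mismatch, here is a minimal standalone sketch that does not use the Python C API. The helper names count_code_points and buggy_copy are hypothetical and only mimic the pattern above: a UTF-8 buffer paired with a length counted in characters rather than bytes. It assumes the character count equals the number of code points, which is only an approximation of what PyUnicode_GetSize reports.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 byte sequence by skipping
// continuation bytes (bytes of the form 10xxxxxx).
std::size_t count_code_points(const std::string& utf8) {
  std::size_t n = 0;
  for (unsigned char c : utf8) {
    if ((c & 0xC0) != 0x80) {
      ++n;
    }
  }
  return n;
}

// Mimics the buggy cast: copies UTF-8 bytes, but uses a character
// count as the byte length.
std::string buggy_copy(const std::string& utf8) {
  return std::string(utf8.data(), count_code_points(utf8));
}
```

For the input "caf\xC3\xA9" (the word "café": 4 characters, 5 UTF-8 bytes), buggy_copy returns only the first 4 bytes, so the final character is split in the middle. Pure ASCII input is copied correctly, which is probably why a bug like this can go unnoticed.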

Additionally, PyUnicode_GetSize is deprecated.
Please use PyUnicode_AsUTF8AndSize instead.


How to repeat:
I don't know how python_cast is used. I found this bug just by grepping for deprecated APIs.

[15 Jun 2020 6:12] MySQL Verification Team
Hello Naoki,

Thank you for the report and feedback.

[12 Feb 2021 21:42] Philip Olson
Posted by developer:
Fixed as of the upcoming MySQL Connector/Python 8.0.24 release, and here's the proposed changelog entry from the documentation team:

Replaced the deprecated PyUnicode_GetSize with PyUnicode_GET_LENGTH to
fix the casting of Python's unicode to std::string.

Thank you for the bug report.