Bug #99866 Wrong cast from Python unicode to std::string
Submitted: 14 Jun 2020 5:37
Modified: 12 Feb 21:42
Reporter: Naoki Inada
Status: Closed
Category: Connector / Python
Severity: S3 (Non-critical)
Version: 8.0.20
OS: Any
Assigned to:
CPU Architecture: Any

[14 Jun 2020 5:37] Naoki Inada
Description:
See the following code.

https://github.com/mysql/mysql-connector-python/blob/e424cbf2ba6093caaa96bda1db5dbdfec2e60...

#define PyString_CheckExact PyUnicode_CheckExact
#define PyString_AsString PyUnicode_AsUTF8
#define PyString_FromString PyUnicode_FromString
#define PyString_Size PyUnicode_GetSize

https://github.com/mysql/mysql-connector-python/blob/master/src/mysqlxpb/python_cast.h#L12...

std::string python_cast<std::string>(PyObject* obj) {
  if (PyString_CheckExact(obj)) {
    return std::string(PyString_AsString(obj), PyString_Size(obj));
  }

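PyString_AsString() (that is, PyUnicode_AsUTF8) returns a UTF-8 encoded buffer, but PyString_Size() (that is, PyUnicode_GetSize) returns the length of the wchar_t representation in code units, not the length of the UTF-8 buffer in bytes. For any string containing non-ASCII characters the UTF-8 byte length is larger, so the returned std::string is truncated in the wrong place.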
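
The mismatch can be illustrated without the Python C API. This is only a minimal sketch, not Connector/Python code: count_code_points and buggy_copy are hypothetical helpers, and count_code_points stands in for the length PyString_Size would report on a build with a 4-byte wchar_t.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the length reported by PyString_Size:
// counts Unicode code points in a UTF-8 buffer by skipping
// continuation bytes (those of the form 10xxxxxx).
std::size_t count_code_points(const std::string& utf8) {
  std::size_t n = 0;
  for (unsigned char c : utf8) {
    if ((c & 0xC0) != 0x80) ++n;
  }
  return n;
}

// Mirrors the pattern in python_cast: a UTF-8 byte pointer is paired
// with a length that is not measured in bytes.
std::string buggy_copy(const std::string& utf8) {
  return std::string(utf8.c_str(), count_code_points(utf8));
}
```

For the 6-byte UTF-8 string "caf\xC3\xA9!" (5 code points), buggy_copy keeps only the first 5 bytes and drops the trailing "!". A pure ASCII string is copied intact, which is why the problem is easy to miss.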

Additionally, PyUnicode_GetSize is deprecated.
Please use PyUnicode_AsUTF8AndSize instead; it returns the UTF-8 buffer together with its length in bytes.

https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_AsUTF8AndSize

How to repeat:
I don't know how python_cast is used. I found this bug just by grepping for deprecated APIs.

[15 Jun 2020 6:12] MySQL Verification Team
Hello Naoki,

Thank you for the report and feedback.

Thanks,
Umesh

[12 Feb 21:42] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Connector/Python 8.0.24 release, and here's the proposed changelog entry from the documentation team:

Replaced the deprecated PyUnicode_GetSize with PyUnicode_GET_LENGTH to
fix the casting of Python's unicode to std::string.

Thank you for the bug report.