Bug #781 four-byte UTF8 characters
Submitted: 2 Jul 2003 7:57
Modified: 2 Jul 2003 10:54
Reporter: Matthias Urlichs
Status: Closed
Category: MySQL Server: Documentation
Severity: S3 (Non-critical)
Version: 4.1
OS:
Assigned to: Paul DuBois
CPU Architecture: Any

[2 Jul 2003 7:57] Matthias Urlichs
The three-byte rule is no longer true. Unicode now defines characters that require four UTF-8 bytes (or two UTF-16 code units, since a single UCS-2 word cannot represent them) to store.
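
A minimal Python sketch of the point above, assuming only the standard library: code points above U+FFFF encode to four bytes in UTF-8 and to two 16-bit units in UTF-16. The helper names (utf8_length, utf16_units, has_four_byte_chars) are hypothetical and chosen for illustration; nothing here exercises MySQL itself.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes needed to store one character in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to store one character in UTF-16."""
    # "utf-16-be" avoids the 2-byte byte-order mark that plain "utf-16" prepends.
    return len(ch.encode("utf-16-be")) // 2


def has_four_byte_chars(text: str) -> bool:
    """True if any character lies outside the Basic Multilingual Plane."""
    return any(ord(c) > 0xFFFF for c in text)


if __name__ == "__main__":
    # U+0041 (ASCII), U+20AC (euro sign, in the BMP), U+10348 (Gothic letter, outside the BMP)
    for ch in ("\u0041", "\u20ac", "\U00010348"):
        print(f"U+{ord(ch):04X}: {utf8_length(ch)} UTF-8 bytes, "
              f"{utf16_units(ch)} UTF-16 units")
```

A check like has_four_byte_chars could be run on input before storing it in a column whose character set is limited to three-byte sequences.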

I haven't yet tested whether MySQL handles these correctly.

How to repeat:
Ummm... it's a documentation bug (hopefully ;-).
[2 Jul 2003 7:58] Matthias Urlichs
used the wrong Documentation menu entry
[2 Jul 2003 10:54] Paul DuBois
The documentation doesn't refer to any three-byte rule.
However, MySQL does not currently support four-byte
UTF-8 sequences. I've added a note to that effect to the
relevant page: