Bug #57104 STRCMP()
Submitted: 29 Sep 2010 15:16 Modified: 14 Oct 2010 16:19
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:any 4.1+ OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: qc

[29 Sep 2010 15:16] Peter Laursen
Description:
http://dev.mysql.com/doc/refman/5.1/en/string-comparison-functions.html#function_strcmp

"STRCMP() uses the current character set when performing comparisons. This makes the default comparison behavior case insensitive unless one or both of the operands are binary strings."

1)
I think this is not updated reflecting introduction of COLLATIONS in MySQL.  If 'current charset + current collation' is a case sensitive collation this makes the statement " ..unless one or both of the operands are binary strings." incomplete. It will also be case sensitive with a case sensitive collation (what is not listed - but OK, there are not many case sensitive collations)

2)
And STRCMP() cannot compare strings with different collations at all) as shown below

How to repeat:
SELECT STRCMP((SELECT 'a' COLLATE utf8_unicode_ci), (SELECT 'a' COLLATE utf8_general_ci));

/* returns

Error Code : 1267
Illegal mix of collations (utf8_unicode_ci,EXPLICIT) and (utf8_general_ci,EXPLICIT) for operation 'strcmp' 
*/

Suggested fix:
update docs!
[29 Sep 2010 15:18] Peter Laursen
ooppss .. category was wrong!
[29 Sep 2010 18:42] Peter Laursen
I think the quoted passage is an old thing that went through unchanged from 3.x/4.0 docs.

I suggest a *generic* page titled "How MySQL compares strings" that other pages can refer to. 

For instance it is pretty much confusing that you can compare different charsets but not different collations with =, <>, LIKE, STRCMP etc. (what I have before reported as a bug, and even though this report was closed a *not a bug* I personally still think it *is* a bug!).  But such generic page could sort out this *opaque* thing (term borrowed from Sveta's presentation UC 2008 - with thanks!).
[29 Sep 2010 21:45] MySQL Verification Team
Thank you for the bug report.
[4 Oct 2010 11:30] Peter Laursen
illustration:

set character_set_connection = latin1;

select strcmp((select 'a' collate latin1_general_cs),(select 'a' collate latin1_general_cs)); -- returns 0
select strcmp((select 'A' collate latin1_general_cs),(select 'a' collate latin1_general_cs)); -- returns -1
select strcmp((select 'a' collate latin1_general_cs),(select 'A' collate latin1_general_cs)); -- returns 1
[14 Oct 2010 15:36] Paul DuBois
You are correct. STRCMP() uses the collation of the arguments as the basis of the comparison.

The language you see was correct before 4.1 when better character set support was introduced. I'll update this.
[14 Oct 2010 16:19] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.