Bug #5168 Chinese Character Search Problem
Submitted: 23 Aug 2004 13:18 Modified: 23 Aug 2004 17:38
Reporter: Thomas Chan Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:4.0.17, 4.0.20 OS:Linux (Linux)
Assigned to: CPU Architecture:Any

[23 Aug 2004 13:18] Thomas Chan
Description:
We experienced the unexpected results when we tried to keyin one Chinese Character for searching (Not All). One of Chinese Character 'Ä_' is entered and its hex code should be 'c45f'.  When we performed the record searching, some records with only hex code 'c4' also come out on screen. Our SQL syntax are as followings for your reference :-

  -"select * from bvidetail where bvicname like '%".trim($pcomp).."%'";
  -The parameter of $pcomp is entered as 'Ä_' -hex code should be :c45f

We expected the result will only have 5 records, but we found that an additional 9 records are come out with the binary code 'c4' only (supposing the chinese character should have 2 bytes).

Our MySQL version is using 4.0.17 but we have been tried to upgrade it to 4.0.20 but with the same result come out.

Could anyone advise how can ensure the select statement will only retreive the record with binary code ='c45f' ?

How to repeat:
1) Create a table as follows :-

CREATE DATABASE `mysqlbvi`;
USE mysqlbvi;
#
# Table structure for table `bvidetail`
#
CREATE TABLE `bvidetail` (
  `indexseq` int(10) NOT NULL default '0',
  `bvijurisdic` varchar(10) NOT NULL default '',
  `bvidesc` varchar(80) NOT NULL default '',
  `bvikeyword` varchar(20) NOT NULL default '',
  `bviwhloc` varchar(15) NOT NULL default '',
  `bviename` text NOT NULL,
  `bvicname` blob NOT NULL,
  `bviregdate` date NOT NULL default '0000-00-00',
  KEY `indexseq` (`indexseq`)
) TYPE=MyISAM;

2) Insert data as follows :-
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (64, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered & Bearer Sh', 'ENTERPRISES', 'Hong Kong', 'BRIGHT FORTUNE GROUP LIMITED', 0xc4a3b7e7b6b0b9cea6b3adada4bda571, '2004-03-29');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (23, 'BVI', 'BVI - Registered Shares Only with English & Chinese Name', 'INTERNATIONAL', 'Hong Kong', 'BROADYIELD INTERNATIONAL LIMITED', 0xb3d5c4a3b0eabbdaa6b3adada4bda571, '2004-08-09');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (23, 'BVI', 'BVI - Registered Shares Only with English & Chinese Name', 'INVESTMENTS', 'Hong Kong', 'CHARMPRO INVESTMENTS LIMITED', 0xb3d0c45fa7ebb8eaa6b3adada4bda571, '2004-08-09');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (67, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered & Bearer Sh', 'CORPORATION', 'Hong Kong', 'DOUBLE CONSULTANTS LIMITED', 0xb16fc45fc555b0dda6b3adada4bda571, '2004-04-23');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (44, 'Hong Kong', 'HK - English & Chinese Name (Share Capital of HK$10,000)', 'LIMITED', 'Hong Kong', 'FINE ALLIED LIMITED', 0xc041c4d6a6b3adada4bda571, '2004-05-10');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (73, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered & Bearer Sh', 'LIMITED', 'Hong Kong', 'GOLD ORCHID HOLDINGS LIMITED', 0xaaf7c4f5b1b1aad1a6b3adada4bda571, '2003-03-20');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (30, 'BVI', 'BVI - Registered Shares Only with English & Chinese Name', 'TRADING', 'Hong Kong', 'GRAND YIELD TRADING LIMITED', 0xb771c4a3b654a9f6a6b3adada4bda571, '2004-07-01');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (73, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered & Bearer Sh', 'ENTERPRISES', 'Hong Kong', 'GREATPLAN ENTERPRISES LIMITED', 0xc450b9cfa5f8b77ea6b3adada4bda571, '2004-04-28');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (79, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered & Bearer Sh', 'INTERNATIONAL', 'Hong Kong', 'POLESTAR INTERNATIONAL LIMITED', 0xc45fac50b0eabbdaa6b3adada4bda571, '2004-02-20');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (81, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered & Bearer Sh', 'INC.', 'Hong Kong', 'SPLENDID RESULT INC.', 0xa8ceaec4a6b3adada4bda571, '2004-04-28');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (32, 'BVI', 'BVI - Registered Shares Only with English & Chinese Name', 'GROUP', 'Hong Kong', 'TOP YIELD GROUP LIMITED', 0xa9ddc4a3b6b0b9cea6b3adada4bda571, '2004-08-09');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (83, 'Samoa', 'SAM - English & Chinese Name (M&A in English & Chinese) - Registered Shares Only', 'INVESTMENTS', 'Hong Kong', 'TREASURE JADE INVESTMENTS LIMITED', 0xc45fa5c9a7ebb8eaa6b3adada4bda571, '2004-04-23');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (32, 'BVI', 'BVI - Registered Shares Only with English & Chinese Name', 'CONSULTANTS', 'Hong Kong', 'TRIPRO CONSULTANTS LIMITED', 0xa454c45fc555b0dda6b3adada4bda571, '2004-07-28');
INSERT INTO `bvidetail` (`indexseq`, `bvijurisdic`, `bvidesc`, `bvikeyword`, `bviwhloc`, `bviename`, `bvicname`, `bviregdate`) VALUES (32, 'BVI', 'BVI - Registered Shares Only with English & Chinese Name', 'ENTERPRISES', 'Hong Kong', 'WELL PLAN ENTERPRISES LIMITED', 0xb0b6c450a5f8b77ea6b3adada4bda571, '2004-07-15');

- The proper search record should only with hex code 'c45f' - 5 records only !!
[23 Aug 2004 17:38] Hartmut Holzgraefe
This happens as '_' is the LIKE metacharacter for 'any single character' 
like '%' is 'any number of characters'. You can escape the search 
metacharacters by prepending a backslash '\'.

From your code snippet i assume that you are using PHP.
There you could do the following:

   "select * from bvidetail where bvicname like '%".addcslashes(trim($pcomp),"\\_%")."%' ";