Bug #64775 Left join on federated.varchar(15) = innodb.char(25) gives bad query plan, crash
Submitted: 27 Mar 2012 9:53 Modified: 28 Apr 2012 14:20
Reporter: Desmond Coertzen Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Federated storage engine Severity:S2 (Serious)
Version:5.1.61 OS:Linux (open suse 11.4)
Assigned to: CPU Architecture:Any

[27 Mar 2012 9:53] Desmond Coertzen
Description:
*** Problem:

Extract from report SQL:

/*db-almost-link to contractor zone*/
/*jhb*/
left join fed_jhb_gen_Main gm_jhb on gm_jhb.mIDNumber = ls.IDNumber
/*durban*/
left join fed_dbnr_gen_Main gm_dbn on gm_dbn.mIDNumber = ls.IDNumber
/*cape town*/
left join fed_cptr_gen_Main gm_cpt on gm_cpt.mIDNumber = ls.IDNumber

fed_*_gen_Main.mIDNumber field on federated tables is varchar(15).
loan_Staff.IDNumber field on local innodb table is char(25).

fed_*_gen_Main are replicas of the real tables, which are using MyISAM storage engines.

When I run this query, mysqld is killed by oom on a server with 32GB physical RAM.

Query plan shows sequential scan for fed_*_gen_Main:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	gm_jhb	ALL	NULL	NULL	NULL	NULL	147258
1	PRIMARY	gm_dbn	ALL	NULL	NULL	NULL	NULL	159479
1	PRIMARY	gm_cpt	ALL	NULL	NULL	NULL	NULL	90326

When doing a TCP dump between "master" containing federated and "slave" containing physical tables, the master seems to do "select [...all fields...] from gen_Main" without a where-clause. It does this repeatedly. I suppose the other conditions in the entire query is causing this. But it seems that the entire table is being cached over and over which may explain where my 32GB RAM + 32GB swap is disappearing to very quickly.

*** 1st workaround:

If i modify the the join clause to cast the localInnodb.IDNumber field to char, the planner seems to work and the query executes without problems:

/*db-almost-link to contractor zone*/
/*jhb*/
left join fed_jhb_gen_Main gm_jhb on gm_jhb.mIDNumber = cast(ls.IDNumber as char)
/*durban*/
left join fed_dbnr_gen_Main gm_dbn on gm_dbn.mIDNumber = cast(ls.IDNumber as char)
/*cape town*/
left join fed_cptr_gen_Main gm_cpt on gm_cpt.mIDNumber = cast(ls.IDNumber as char)

The resulting query plan looks way better:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	gm_jhb	eq_ref	mIDNumber	mIDNumber	27	func	1
1	PRIMARY	gm_dbn	eq_ref	mIDNumber	mIDNumber	27	func	1
1	PRIMARY	gm_cpt	eq_ref	mIDNumber	mIDNumber	27	func	1

Howevery, when the report SQL is executed in a prepared statement, the workaround no longer works and the query plan looks like the bad one at the top.

*** 2nd workaround:

Invent benign function that takes a varchar as a parameter and return it as is:

delimiter ;;
create function sp_stringpass(astring varchar(30)) returns varchar(30)
deterministic
begin
  return astring;
end;;
delimiter ;

Then modify the left joins to use the function on the localInnoDB.IDNumber fields:

/*db-almost-link to contractor zone*/
/*jhb*/
left join fed_jhb_gen_Main gm_jhb on gm_jhb.mIDNumber = sp_stringpass(ls.IDNumber)
/*durban*/
left join fed_dbnr_gen_Main gm_dbn on gm_dbn.mIDNumber = sp_stringpass(ls.IDNumber)
/*cape town*/
left join fed_cptr_gen_Main gm_cpt on gm_cpt.mIDNumber = sp_stringpass(ls.IDNumber)

The resulting query plan looks the same as the first workaround, but, the workaround is valid in a prepared statement as well.

We implemented the sp_stringpass function on all our reports that does left join to federated table on varchar, and they all work without crashing the server.

*** The DDL for the tables:

fed_*_gen_Main:

CREATE TABLE `fed_dbnr_gen_Main` (
  `WKey` int(11) NOT NULL,
  `ANumber` int(11) DEFAULT NULL,
  `UID` int(11) DEFAULT NULL,
  `mTitle` varchar(6) DEFAULT NULL,
  `mName` varchar(100) NOT NULL DEFAULT '',
  `mSurname` varchar(100) NOT NULL DEFAULT '',
  `mFullName` varchar(200) NOT NULL DEFAULT '',
  `mPrefName` varchar(50) DEFAULT NULL,
  `mBirthDate` varchar(11) DEFAULT NULL,
  `mIDNumber` varchar(25) NOT NULL DEFAULT '',
  `mMemNo` varchar(11) NOT NULL DEFAULT '',
  `mMartStatus` int(2) DEFAULT NULL,
  `mNationality` varchar(20) DEFAULT NULL,
  `mRace` varchar(15) DEFAULT NULL,
  `mHomeLang` varchar(20) DEFAULT NULL,
  `mOtherLang` varchar(100) DEFAULT NULL,
  `mStatus` int(2) NOT NULL DEFAULT '0',
  `mEmpType` int(2) NOT NULL DEFAULT '0',
  `mCaptureDate` varchar(11) DEFAULT NULL,
  `ClientNo` int(11) DEFAULT NULL,
  `mDependants` int(2) DEFAULT NULL,
  `mPWD` varchar(15) DEFAULT NULL,
  `mPermExp` varchar(11) DEFAULT NULL,
  `TempCareKey` int(15) NOT NULL DEFAULT '0',
  `mPassportNo` char(16) DEFAULT '',
  `PrefEmp` int(11) DEFAULT '0',
  `CVRef` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`WKey`),
  UNIQUE KEY `mMemNo` (`mMemNo`),
  UNIQUE KEY `mIDNumber` (`mIDNumber`),
  KEY `ClientNo` (`ClientNo`),
  KEY `TempCareKey` (`TempCareKey`)
) ENGINE=FEDERATED DEFAULT CHARSET=latin1 CONNECTION='mysql://user@host/DbnGlobal/gen_Main'

Actual table on remote host:

CREATE TABLE `gen_Main` (
  `WKey` int(11) NOT NULL AUTO_INCREMENT,
  `ANumber` int(11) DEFAULT NULL,
  `UID` int(11) DEFAULT NULL,
  `mTitle` varchar(6) DEFAULT NULL,
  `mName` varchar(100) NOT NULL DEFAULT '',
  `mSurname` varchar(100) NOT NULL DEFAULT '',
  `mFullName` varchar(200) NOT NULL DEFAULT '',
  `mPrefName` varchar(50) DEFAULT NULL,
  `mBirthDate` varchar(11) DEFAULT NULL,
  `mIDNumber` varchar(15) NOT NULL DEFAULT '',
  `mMemNo` varchar(11) NOT NULL DEFAULT '',
  `mMartStatus` int(2) DEFAULT NULL,
  `mNationality` varchar(20) DEFAULT NULL,
  `mRace` varchar(15) DEFAULT NULL,
  `mHomeLang` varchar(20) DEFAULT NULL,
  `mOtherLang` varchar(100) DEFAULT NULL,
  `mStatus` int(2) NOT NULL DEFAULT '0',
  `mEmpType` int(2) NOT NULL DEFAULT '0',
  `mCaptureDate` varchar(11) DEFAULT NULL,
  `ClientNo` int(11) DEFAULT NULL,
  `mDependants` int(2) DEFAULT NULL,
  `mPWD` varchar(15) DEFAULT NULL,
  `mPermExp` varchar(11) DEFAULT NULL,
  `TempCareKey` int(15) NOT NULL DEFAULT '0',
  `mPassportNo` char(16) DEFAULT '',
  `PrefEmp` int(11) DEFAULT '0',
  `CVRef` varchar(50) DEFAULT NULL,
  `Internal` int(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`WKey`),
  UNIQUE KEY `mMemNo` (`mMemNo`),
  UNIQUE KEY `mIDNumber` (`mIDNumber`),
  KEY `ClientNo` (`ClientNo`),
  KEY `TempCareKey` (`TempCareKey`),
  KEY `Internal` (`Internal`)
) ENGINE=MyISAM AUTO_INCREMENT=160041 DEFAULT CHARSET=latin1

How to repeat:
See description
[27 Mar 2012 10:22] Valeriy Kravchuk
What exactly you consider a bug in this case?
[27 Mar 2012 10:36] Desmond Coertzen
This query plan:
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	gm_jhb	ALL	NULL	NULL	NULL	NULL	147258
1	PRIMARY	gm_dbn	ALL	NULL	NULL	NULL	NULL	159479
1	PRIMARY	gm_cpt	ALL	NULL	NULL	NULL	NULL	90326

... and the resulting select * from remote_table_pointed_to_by_federated over and over until mysqld crashes on oom.

Why did we need to do the weird casting to make the planner use the index?
Why did this workaround by casting not work for the same query when executed as a prepared query?

Seems like bugs, or undefined results, in the query planner when it comes to federated tables.
[27 Mar 2012 17:30] Valeriy Kravchuk
Please, send the output of 

EXPLAIN EXTENDED <your exact query>;
SHOW WARNINGS\G
[28 Mar 2012 4:49] Desmond Coertzen
Output as requested for bad query plan:

*** explain extended:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	lc	ref	StaffKey,indx_lc_staffloanforgranted,indx_lc_granted	indx_lc_granted	5	const	973	100.00	Using where; Using temporary; Using filesort
1	PRIMARY	g	eq_ref	PRIMARY	PRIMARY	4	micro.lc.LoanFor	1	100.00	
1	PRIMARY	lsc	eq_ref	PRIMARY	PRIMARY	4	micro.lc.Client	1	100.00	
1	PRIMARY	a	eq_ref	PRIMARY	PRIMARY	4	micro.lc.AgKey	1	100.00	
1	PRIMARY	ls	eq_ref	PRIMARY	PRIMARY	4	micro.lc.StaffKey	1	100.00	
1	PRIMARY	lsd	ref	unq_lsd_staffkey,StaffKey	unq_lsd_staffkey	5	micro.lc.StaffKey	1	100.00	
1	PRIMARY	gm_jhb	ALL	NULL	NULL	NULL	NULL	147285	100.00	
1	PRIMARY	conb_jhb	ref	WKey	WKey	5	micro.gm_jhb.WKey	11	106.06	
1	PRIMARY	lpd_jhb	eq_ref	PRIMARY	PRIMARY	4	micro.conb_jhb.DepID	1	100.00	
1	PRIMARY	gm_dbn	ALL	NULL	NULL	NULL	NULL	159559	100.00	
1	PRIMARY	conb_dbn	ref	WKey	WKey	5	micro.gm_dbn.WKey	11	106.07	
1	PRIMARY	lpd_dbn	eq_ref	PRIMARY	PRIMARY	4	micro.conb_dbn.DepID	1	100.00	
1	PRIMARY	gm_cpt	ALL	NULL	NULL	NULL	NULL	90429	100.00	
1	PRIMARY	conb_cpt	ref	WKey	WKey	5	micro.gm_cpt.WKey	11	106.06	
1	PRIMARY	lpd_cpt	eq_ref	PRIMARY	PRIMARY	4	micro.conb_cpt.DepID	1	100.00	
1	PRIMARY	<derived2>	ALL	NULL	NULL	NULL	NULL	1184	100.00	
1	PRIMARY	lcds	ref	fk_lcds_application	fk_lcds_application	4	micro.lc.LoanKey	1	100.00	
1	PRIMARY	cds	eq_ref	PRIMARY	PRIMARY	4	micro.lcds.contract_delivery_status_id	1	100.00	
1	PRIMARY	u_lcds	eq_ref	PRIMARY	PRIMARY	4	micro.lcds.created_by_bbrk_user_id	1	100.00	
2	DERIVED	lcds	index	NULL	fk_lcds_application	4	NULL	2279	100.00	

*** warnings:

Level	Code	Message
Note	1003	select `micro`.`lsc`.`client_name` AS `Business Source`,`micro`.`lc`.`ContractNo` AS `Contract No`,`micro`.`lc`.`LoanDate` AS `Loan Date`,`micro`.`g`.`Description` AS `Product`,`micro`.`g`.`CellCap` AS `Amount`,`micro`.`ls`.`Name` AS `Name`,`micro`.`ls`.`Surname` AS `Surname`,`micro`.`ls`.`IDNumber` AS `ID Number`,coalesce(`micro`.`lsd`.`HomeCellNo`,coalesce(`micro`.`lsd`.`HomeTelephone`,`micro`.`lsd`.`Telephone`)) AS `Phone Number`,`micro`.`lsd`.`CompContr` AS `Site`,`micro`.`lsd`.`HomeAddr3` AS `Area`,`micro`.`a`.`AgentName` AS `Agent Name`,(case when (`micro`.`lc`.`Client` in (1,2,4,5,6,11,15,18,19)) then `micro`.`lpd_jhb`.`DepName` when (`micro`.`lc`.`Client` in (7,8,17)) then `micro`.`lpd_dbn`.`DepName` when (`micro`.`lc`.`Client` in (10,12,14)) then `micro`.`lpd_cpt`.`DepName` when (`micro`.`lc`.`Client` in (16,21,22,100,101)) then 'N/A' when (`micro`.`lc`.`Client` = 20) then 'N/A' else 'Unknown' end) AS `Workfroce Branch`,(case when (isnull(`micro`.`lcds`.`custom_status`) or (`micro`.`lcds`.`custom_status` = '')) then `micro`.`cds`.`description` else concat(`micro`.`cds`.`description`,' : ',`micro`.`lcds`.`custom_status`) end) AS `Last status comment`,`micro`.`u_lcds`.`display_name` AS `Last status comment user`,`micro`.`lcds`.`create_ts` AS `Last status comment date` from `micro`.`loan_Contract` `lc` join `micro`.`loan_Staff` `ls` join `micro`.`gen_contdef` `g` left join `micro`.`loan_staff_client` `lsc` on((`micro`.`lsc`.`loan_staff_client_id` = `micro`.`lc`.`Client`)) left join `micro`.`gen_Agent` `a` on((`micro`.`a`.`AKey` = `micro`.`lc`.`AgKey`)) left join `micro`.`loan_StaffData` `lsd` on(((`micro`.`ls`.`SKey` = `micro`.`lc`.`StaffKey`) and (`micro`.`lsd`.`StaffKey` = `micro`.`lc`.`StaffKey`))) left join `micro`.`fed_jhb_gen_Main` `gm_jhb` on((convert(`micro`.`gm_jhb`.`mIDNumber` using utf8) = `micro`.`ls`.`IDNumber`)) left join `micro`.`fed_jhb_con_Branch` `conb_jhb` on((`micro`.`conb_jhb`.`WKey` = `micro`.`gm_jhb`.`WKey`)) left join `micro`.`fed_jhb_lp_Department` `lpd_jhb` on((`micro`.`lpd_jhb`.`DepID` = `micro`.`conb_jhb`.`DepID`)) left join `micro`.`fed_dbnr_gen_Main` `gm_dbn` on((convert(`micro`.`gm_dbn`.`mIDNumber` using utf8) = `micro`.`ls`.`IDNumber`)) left join `micro`.`fed_dbnr_con_Branch` `conb_dbn` on((`micro`.`conb_dbn`.`WKey` = `micro`.`gm_dbn`.`WKey`)) left join `micro`.`fed_dbnr_lp_Department` `lpd_dbn` on((`micro`.`lpd_dbn`.`DepID` = `micro`.`conb_dbn`.`DepID`)) left join `micro`.`fed_cptr_gen_Main` `gm_cpt` on((convert(`micro`.`gm_cpt`.`mIDNumber` using utf8) = `micro`.`ls`.`IDNumber`)) left join `micro`.`fed_cptr_con_Branch` `conb_cpt` on((`micro`.`conb_cpt`.`WKey` = `micro`.`gm_cpt`.`WKey`)) left join `micro`.`fed_cptr_lp_Department` `lpd_cpt` on((`micro`.`lpd_cpt`.`DepID` = `micro`.`conb_cpt`.`DepID`)) left join (select `micro`.`lcds`.`loan_application_id` AS `loan_application_id`,max(`micro`.`lcds`.`create_ts`) AS `last_update` from `micro`.`loan_contract_delivery_status` `lcds` group by `micro`.`lcds`.`loan_application_id`) `last_lcds` on((`last_lcds`.`loan_application_id` = `micro`.`lc`.`LoanKey`)) left join `micro`.`loan_contract_delivery_status` `lcds` on(((`micro`.`lcds`.`create_ts` = `last_lcds`.`last_update`) and (`micro`.`lcds`.`loan_application_id` = `micro`.`lc`.`LoanKey`))) left join `micro`.`contract_delivery_status` `cds` on((`micro`.`cds`.`contract_delivery_status_id` = `micro`.`lcds`.`contract_delivery_status_id`)) left join `micro`.`bbrk_user` `u_lcds` on((`micro`.`u_lcds`.`bbrk_user_id` = `micro`.`lcds`.`created_by_bbrk_user_id`)) where ((`micro`.`ls`.`SKey` = `micro`.`lc`.`StaffKey`) and (`micro`.`g`.`SKey` = `micro`.`lc`.`LoanFor`) and (`micro`.`lc`.`Granted` = 9) and (`micro`.`lc`.`LoanFor` > 10)) order by `micro`.`lc`.`ContractNo`
[28 Mar 2012 6:23] Valeriy Kravchuk
Looks like this:

...
on((convert(`micro`.`gm_jhb`.`mIDNumber`
using utf8) = `micro`.`ls`.`IDNumber`))
...

is the reason. Your tables accessed via Federated have latin1 charset, while "main" tables use utf8, so to compare server has to convert strings to proper collation and thus can not use indexes.
[28 Mar 2012 7:50] Desmond Coertzen
I see your point.

How is that we are able to get around the charset problem with a cast to char or compare by function result?

Output of variables on server where query runs:

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
[28 Mar 2012 8:43] Valeriy Kravchuk
Please, send the output of

show create table ls\G

Cast and function workarounds help as they both change character set of value to compare to I assume...
[28 Mar 2012 9:43] Desmond Coertzen
ls is an alias for loan_Staff:

mysql> show create table loan_Staff\G;
*************************** 1. row ***************************
       Table: loan_Staff
Create Table: CREATE TABLE `loan_Staff` (
  `SKey` int(11) NOT NULL AUTO_INCREMENT,
  `Surname` char(100) DEFAULT NULL,
  `Name` char(100) DEFAULT NULL,
  `IDNumber` varchar(30) NOT NULL,
  `EmailAdd` char(128) DEFAULT NULL,
  `TelNo` char(20) DEFAULT NULL,
  `FaxNo` char(20) DEFAULT NULL,
  `Gender` int(2) DEFAULT NULL,
  `PassNo` tinyint(4) NOT NULL DEFAULT '0',
  `DateBirth` char(11) DEFAULT NULL,
  `BlackList` int(2) DEFAULT NULL,
  `Date_DW_Submitted` char(11) DEFAULT '11/11/1111',
  `Date_DW_Imported` char(11) DEFAULT '11/11/1111',
  `DataWash_Status` int(2) DEFAULT '0',
  `DO_BankName` varchar(200) DEFAULT '-1',
  `DO_BankAccNo` varchar(200) DEFAULT '-1',
  `DO_BankBranchCode` varchar(200) DEFAULT '-1',
  `DO_BankAccType` varchar(200) DEFAULT '-1',
  `Date_BW_Submitted` char(11) DEFAULT '11/11/1111',
  `Date_BW_Imported` char(11) DEFAULT '11/11/1111',
  `BankWash_Status` int(2) DEFAULT '0',
  `emergency_stop_all` int(11) NOT NULL DEFAULT '0',
  `death_date` datetime DEFAULT NULL,
  `death_certificate_number` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`SKey`),
  UNIQUE KEY `unq_ls_idnumber` (`IDNumber`),
  KEY `indx_ls_dw` (`DataWash_Status`),
  KEY `indx_ls_dwbwstat` (`DataWash_Status`,`BankWash_Status`),
  KEY `indx_ls_stop` (`emergency_stop_all`),
  KEY `fk_ls_gender` (`Gender`),
  KEY `indx_lsd_deathdate` (`death_date`),
  KEY `indx_lsd_deathcert` (`death_certificate_number`),
  CONSTRAINT `fk_ls_gender` FOREIGN KEY (`Gender`) REFERENCES `gender` (`gender_id`)
) ENGINE=InnoDB AUTO_INCREMENT=412459 DEFAULT CHARSET=utf8
[28 Mar 2012 11:55] Valeriy Kravchuk
So, ... CHARSET=utf8 as I expected. You have either to use the same charset for all tables involved or use explicit cast on the proper side. Index on column can not be used if some function (convert() in this case) is applied to column in the query.
[28 Mar 2012 13:59] Desmond Coertzen
Can we leave this report status on "Need Feedback" ?

We are going to go through the process of porting all db's, tables and columns to UTF-8 and test this again. It will take some time though.
[28 Mar 2012 14:20] Valeriy Kravchuk
Surely. Take as much time as you need to prove that my understanding of the root cause of the problem is correct (or totally wrong).
[29 Apr 2012 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".