Bug #65508 getCharsetNameForIndex() should be faster
Submitted: 4 Jun 2012 11:28 Modified: 2 Aug 2012 21:22
Reporter: RENAULT Frédéric
Status: Closed
Category: Connector / J Severity: S5 (Performance)
Version: 5.1.20 OS: Any
Assigned to: Alexander Soklakov CPU Architecture:Any

[4 Jun 2012 11:28] RENAULT Frédéric
Description:
Hello,

While profiling my application, I noticed that the function com.mysql.jdbc.getCharsetNameForIndex() sometimes takes 5% of the execution time.

It can be less, but it can also be worse if there are many columns with a charset in the result set (it is called for each column).

This function does a lookup in a HashMap and two equalsIgnoreCase() calls.

For me (and probably for most of us), this function always returns the same value ("UTF-8" in my case).

From my point of view, a function that almost always returns the same result should not take so much time.
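
To illustrate the kind of shortcut meant here, below is a minimal sketch of caching the resolved name per charset index, so that the HashMap lookup and the equalsIgnoreCase() calls run only once per index. This is not the Connector/J code and not necessarily how the fix was made: the class name CharsetNameCache, the sample index table, the name mapping and the cache size of 256 are all assumptions made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; NOT the Connector/J implementation. */
class CharsetNameCache {
    // Hypothetical stand-in for the driver's index-to-charset table
    // (only three sample entries, chosen for the example).
    private static final Map<Integer, String> MYSQL_CHARSET_BY_INDEX = new HashMap<>();
    static {
        MYSQL_CHARSET_BY_INDEX.put(8, "latin1");
        MYSQL_CHARSET_BY_INDEX.put(33, "utf8");
        MYSQL_CHARSET_BY_INDEX.put(63, "binary");
    }

    // Assumed upper bound on cached indexes; larger indexes always take the slow path.
    private final String[] cache = new String[256];
    private int slowLookups = 0;

    /** Slow path: one map lookup plus up to two case-insensitive comparisons. */
    private String slowLookup(int index) {
        slowLookups++;
        String mysqlName = MYSQL_CHARSET_BY_INDEX.get(index);
        if (mysqlName == null) {
            return null;
        }
        // Illustrative MySQL-to-Java name mapping, assumed for this sketch.
        if ("utf8".equalsIgnoreCase(mysqlName)) {
            return "UTF-8";
        }
        if ("latin1".equalsIgnoreCase(mysqlName)) {
            return "Cp1252";
        }
        return mysqlName;
    }

    /** Fast path: a plain array read once the index has been resolved. */
    String getCharsetNameForIndex(int index) {
        boolean cacheable = index >= 0 && index < cache.length;
        if (cacheable && cache[index] != null) {
            return cache[index];
        }
        String name = slowLookup(index);
        if (cacheable && name != null) {
            cache[index] = name;
        }
        return name;
    }

    /** Number of times the slow path ran; exposed only to show the effect. */
    int slowLookupCount() {
        return slowLookups;
    }

    public static void main(String[] args) {
        CharsetNameCache cache = new CharsetNameCache();
        // Simulate 50 columns that all use the same charset index.
        for (int column = 0; column < 50; column++) {
            cache.getCharsetNameForIndex(33);
        }
        System.out.println("slow lookups: " + cache.slowLookupCount());
    }
}
```

The sketch is not thread-safe; it assumes one cache instance per connection, used by one thread at a time.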

Fred.

How to repeat:
I ran some tests against a table with 50 varchar columns:

CREATE TABLE `test` (
  `idTest` bigint(20) NOT NULL AUTO_INCREMENT,
  `varchar01` varchar(10) NOT NULL DEFAULT '',
  `varchar02` varchar(10) NOT NULL DEFAULT '',
  `varchar03` varchar(10) NOT NULL DEFAULT '',
  `varchar04` varchar(10) NOT NULL DEFAULT '',
  ...
  `varchar49` varchar(10) NOT NULL DEFAULT '',
  `varchar50` varchar(10) NOT NULL DEFAULT '',
  PRIMARY KEY (`idTest`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

I inserted one row and ran "SELECT * FROM test;" in a loop for 100 seconds.

If I hardcode "UTF-8" as the result of getCharsetNameForIndex(), I am able to run 4.1% more SELECTs in the same time.
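
For anyone who wants to repeat the measurement, here is a rough sketch of the counting loop and of the percentage calculation. It is not the harness used for the numbers above: the class name ThroughputProbe and both method names are made up for the example, and the task is left as a plain Runnable so the block runs without a database. In the real test the task would execute the SELECT over JDBC and read every column.

```java
/** Illustrative sketch only; names are hypothetical. */
class ThroughputProbe {

    /** Runs the task repeatedly until the duration has elapsed; returns the number of completed runs. */
    static long countRuns(Runnable task, long durationNanos) {
        long start = System.nanoTime();
        long runs = 0;
        while (System.nanoTime() - start < durationNanos) {
            task.run();
            runs++;
        }
        return runs;
    }

    /** Relative gain of the patched run count over the baseline, in percent. */
    static double percentGain(long baselineRuns, long patchedRuns) {
        return 100.0 * (patchedRuns - baselineRuns) / baselineRuns;
    }

    public static void main(String[] args) {
        // Placeholder task; replace with the JDBC query against the test table.
        Runnable placeholder = () -> Math.sqrt(12345.678);
        long runs = countRuns(placeholder, 100_000_000L); // 100 ms instead of 100 s
        System.out.println("completed runs: " + runs);
    }
}
```

Run it once with the stock driver and once with the patched one, then pass the two counts to percentGain(). A short warm-up before counting helps keep JIT compilation out of the comparison.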
[2 Aug 2012 21:22] John Russell
Added to changelog for 5.1.22: 

The com.mysql.jdbc.getCharsetNameForIndex() method was made more
efficient, resulting in better performance for queries against tables
containing many columns with string data types.