Bug #54756 Cannot retrieve data from ResultSet by column name from a Sphinx daemon
Submitted: 23 Jun 2010 23:10 Modified: 14 Oct 2010 6:05
Reporter: Miguel Silva Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / J Severity:S3 (Non-critical)
Version:5.1.12 OS:Any (Sphinx searchd daemon supports MySQL binary network protocol)
Assigned to: CPU Architecture:Any
Tags: ResultSet, Sphinx

[23 Jun 2010 23:10] Miguel Silva
Description:
I'm connecting to a Sphinx search daemon using the listen protocol 'mysql41' (MySQL protocol used since 4.1 upto at least 5.1). This daemon answers to MySQL like search queries (SphinxQL) using the MySQL network protocol. Therefor it's possible to query the server using MySQL connector's driver. The program runs a simple SphinxQL query and gets the ResultSet data by the column name. An exception is thrown saying: "Column 'search_id' not found". 
 
The programs runs this simple query (SphinxQL):  
 
select @id as search_id, @weight as search_weight from main WHERE MATCH('hello') 
 
Then it tries to get the search_id from the ResultSet: 
 
int id = result.getInt("search_id"); 
 
On that call the following exception is thrown: 
 
Caused by: java.sql.SQLException: Column 'search_id' not found. 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:910) 
at com.mysql.jdbc.ResultSet.findColumn(ResultSet.java:987) 
at com.mysql.jdbc.ResultSet.getInt(ResultSet.java:2749) 
at org.apache.tomcat.dbcp.dbcp.DelegatingResultSet.getInt(DelegatingResultSet.java:237) 
... 
 
Other methods of getting the column names or the columns indexes' by its names have also failed. The only way to make it work is to get the values by the column indexes directly. 
 
Querying the same Sphinx daemon using the mysql CLI results in perfect result tables with column names. That indicates that the column names information is being returned by the server.

How to repeat:
To repeat that error one needs to install sphinx, configure an index, start the daemon and query it from a java program.

Suggested fix:
	
I took a look into the connector code. During one of my debug sessions I've noticed a fact in MysqlIO class. The boolean fields use41Extensions and has41NewNewProt were set to false. If during the debug I change those flags to true, the above code runs well (without exceptions). The value change must be before calling the method:  
 
Field unpackField(Buffer packet, boolean extractDefaultValues). 
 
I understood that those flags where set during the handshake with the server. Their values change based on what version that the client (driver) understands the server is. IMHO the problem is that the driver uses the server name version (server_version in Handshake Packet[1]) to determine the server capabilities. Seams to me that server_capabilities field would be the correct information to use. On server_version sphinx' Handshake Packet has '0.9.9-release', which is the search engine's version. But looking into Sphinx code I´ve noticed that it signalizes the New 4.1 protocol on server_capabilities. 
 
CLIENT_PROTOCOL_41        512        /* New 4.1 protocol */ 
 
Here is the cpp code of the sphinx' Handshake Packet: 
 
HandleClientMySQL method of searchd.cpp (line 5655): 
    char sHandshake[] = 
        "\x00\x00\x00" // packet length 
        "\x00" // packet id 
        "\x0A" // protocol version; v.10 
        SPHINX_VERSION "\x00" // server version 
        "\x01\x00\x00\x00" // thread id 
        "\x01\x02\x03\x04\x05\x06\x07\x08" // scramble buffer (for auth) 
        "\x00" // filler 
        "\x08\x02" // server capabilities; CLIENT_PROTOCOL_41 | CLIENT_CONNECT_WITH_DB 
 
So, if the use41Extensions and has41NewNewProt flags were set to TRUE based on the server_capabilities (CLIENT_PROTOCOL_41) and not on server_version ('0.9.9-release'), retrieving the ResultSet data using column names would work fine.
[25 Jun 2010 5:51] Sveta Smirnova
Thank you for the report.

Verified as described.

Testsuite fails in SetUp, therefore here is standalone test:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;

public class bug54756 { 

	public static void main(String[] args) throws SQLException {
		Connection con = null;
		String url = "jdbc:mysql://127.0.0.1:9312/";
		String db = "def";
		String driver = "com.mysql.jdbc.Driver";
		String user = "root";
		String pass = "";
		ResultSet rs = null;
		ResultSetMetaData rsmd = null;

		try {
			Class.forName(driver);
		} catch (Exception e) {
			e.printStackTrace();
		}
		con = DriverManager.getConnection(url + db + "?characterEncoding=utf8&maxAllowedPacket=512000", user, pass);
		Statement s = con.createStatement();
		s.execute("select @id as search_id, @weight as search_weight from test1 WHERE MATCH('test')");
		rs = s.getResultSet();
		rsmd = rs.getMetaData();
		int cc = rsmd.getColumnCount();
		for (int i = 1; i <= cc; i ++)
			System.out.println(rsmd.getColumnName(i));
		
		while (rs.next()) {
			System.out.println("Id: " + rs.getInt(1) + ", Weight: " + rs.getInt(2));
		}
		rs.close();

	}
}

C code which works fine:

#include <stdio.h>
#include <mysql.h>
#include <assert.h>
#include <string.h>
#include <time.h>

int main(int argc, char **argv)
{
	MYSQL conn;
	MYSQL_RES *result;
	MYSQL_FIELD *field;
	int OK;
	int count = 0;
	int timeout = 10;
	
	struct tm *current;
	time_t now;

	const char* query1= "select @id as search_id, @weight as search_weight from test1 WHERE MATCH('test')";
	int query1len = strlen(query1);

	mysql_init(&conn);

	if (!mysql_real_connect(&conn, "127.0.0.1", "root", "", "test", 9312, NULL,0)) {
		printf("Error: %s\n", mysql_error(&conn));
		exit(1);
	}

	OK = mysql_real_query (&conn, query1, query1len);
	assert(0 == OK);
	result = mysql_store_result(&conn);
	
	unsigned int num_fields;
	unsigned int i;
	MYSQL_FIELD *fields;
	
	num_fields = mysql_num_fields(result);
	fields = mysql_fetch_fields(result);
	for(i = 0; i < num_fields; i++)
	{
	   printf("Field %u is %s\n", i, fields[i].name);
	}
}
[25 Jun 2010 5:54] Sveta Smirnova
Workaround:

use numeric indexes to access data:

while (rs.next()) {
			System.out.println("Id: " + rs.getInt(1) + ", Weight: " + rs.getInt(2));
}
[28 Jun 2010 7:35] Sveta Smirnova
Same result if use useColumnNamesInFindColumn=true
[28 Jun 2010 9:27] Sveta Smirnova
I modified unpackField so it does not check for use41Extensions and has41NewNewProt and got correct results:

id
weight
search_id
search_weight
Id: 1, Weight: 2421
Id: 2, Weight: 2421
Id: 4, Weight: 1442
[28 Jun 2010 9:30] Tonci Grgin
Miguel, I agree with you and Sveta but there are several things that you should know...

First of all, test1 is not a table, thus Original_table field in metedata returned is not filled which leads to different execution path in c/J (treating rs as coming from function or such). Further more, c/J is highly optimized trying to get most of every server version. As major updates do not just happen in major server releases (check for example utf8mb4 support changes from MySQL server version 5.5.2 to 5.5.3) I do not think c/J will ever depend on cli protocol version.

However, patch is sane and correct so I'll consult Mark.
[28 Jun 2010 9:44] Tonci Grgin
Sveta was kind and checked metadata returned. As I guessed, most relevant fields are empty:
mysql> select @id as search_id, @weight as search_weight from test1 WHERE MATCH('test');
Field   1:  `id`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       DECIMAL
Collation:  latin1_swedish_ci (8)
Length:     20
Max_length: 1
Decimals:   0
Flags:      NUM
[29 Jun 2010 19:25] Andrew Aksyonoff
Tonci, would that be enough to start returning Table and Org_table from Sphinx side to prevent c/J (and perhaps other clients) from confusing?
[30 Jun 2010 8:53] Tonci Grgin
Andrew no, even if that would be possible at all... I am looking into this right now.
[30 Jun 2010 9:10] Andrew Aksyonoff
Tonci, I can make any reasonable changes to Sphinx, just need to know which ones specifically.

Introducing Sphinx daemon as some specific version of MySQL doesn't seem the right thing to do. Fixing this or that subtlety such as Org_table however seems ok. So please let me know.
[30 Jun 2010 9:32] Tonci Grgin
Miguel, Andrew, although Miguel's proposal (So, if the use41Extensions and has41NewNewProt flags were set to TRUE based on the server_capabilities (CLIENT_PROTOCOL_41) and not on server_version ('0.9.9-release')) seems tempting, the protocol version changes very rarely and does not help distinguishing various server capabilities. So it appears to me I am bound to server_version and the only thing that can be done is to just append Sphynx version to MySQL server version. Thus, I suggest that Sphynx devs should change SPHINX_VERSION "\x00" // server version 
to
APPEND(server_version, SPHINX_VERSION) "\x00" // server version 
and things will work.

Please do note that we can not change the MySQL SW to match some 3rd party SW even if it was possible.
[30 Jun 2010 9:48] Andrew Aksyonoff
Tonci, making Sphinx pretend that it's a particular version of MySQL is definitely not the *only* way around this.

For instance, why not check the version string for presence of "MySQL" substring and only enable all the MySQL version based guesswork in case it's present?

So that when the server is not MySQL, you would just behave according to the protocol documentation.
[30 Jun 2010 10:00] Tonci Grgin
Andrew, true but then I would be making changes in our SW to adopt to 3rd party one we do not depend on. This is never a good policy and I think it's obvious why.

However, don't worry, I am not dismissing this yet.
[30 Jun 2010 10:05] Andrew Aksyonoff
Tonci, IMO "conforming to a 3rd party" would only happen if you checked for "Sphinx" (or whatever else) in version string. And I would be the first against *that* as I would rather fix protocol conformance bugs in Sphinx.

However, it seems to me Sphinx is already conforming to your protocol. CLI MySQL client and PHP connectors have no issue talking to it, for instance. That's not proof, of course, but some evidence.

So an extra check for "MySQL" is rather conforming to your own protocol and enabling undocumented hacks only for your own SW, Sphinx per se is outside of this scope.
[30 Jun 2010 10:19] Tonci Grgin
Noted although not sure if I can fix this without causing many code paths to break...
[30 Jun 2010 13:14] Mark Matthews
What's missing in this discussion is that there are protocol-level changes that aren't reflected by flags, and the only way to detect them is through version strings. Support the protocol before 4.1 and after 4.1, and you'll learn about them. Some of them have nothing to do with behavior, but how things are encoded on the wire.
[30 Jun 2010 14:16] Tonci Grgin
Mark, noted. Patch will cover setting of
protected int maxThreeBytes = 255 * 255 * 255;
private boolean use41Extensions = false;
private boolean has41NewNewProt = false;
according to protocol version as we discussed.
[1 Jul 2010 11:08] Tonci Grgin
Preliminary patch done, running tests.
[1 Jul 2010 14:06] Tonci Grgin
Diff file

Attachment: IOdiff.txt (text/plain), 3.82 KiB.

[2 Jul 2010 6:59] Tonci Grgin
"The value change must be before calling the method:   
Field unpackField(Buffer packet, boolean extractDefaultValues)."

I think it should read before "checkErrorPacket" in doHandshake and the patch does this. Miguel, however, I do not see how you got connected at all. Are you using 4.1.0 server or what?
[5 Jul 2010 21:38] Miguel Silva
Tonci, I don't know if I understood your question. I'm using sphinx server. As I told the server's version is 0.9.9-release. This is sphinx' latest stable release. The server is not a MySQL of any version.
[7 Jul 2010 6:08] Tonci Grgin
Thanks for info Miguel. It is obvious to me Sphinx takes a different execution path than MySQL server thus Sveta will put up an instance for me to test with.
[7 Jul 2010 10:02] Tonci Grgin
The patch is perfectly fine but there is a problem in Sphinx response. According to http://forge.mysql.com/wiki/MySQL_Internals_ClientServer_Protocol, both
  CLIENT_PROTOCOL_41	512	/* New 4.1 protocol */
and
  CLIENT_RESERVED         16384   /* Old flag for 4.1 protocol  */
should be set if new 4.1 protocol is in use (as can be observed by sniffing the traffic from real MySQL server).
Sphinx just does not set CLIENT_RESERVED flag thus bypassing my patch. This is important part of what I'm doing as this flag helps me distinguish between 4.1.0 and down and 4.1.1 and up without using server_version.

Although I can code around this problem, it is my opinion that Sphix should behave as close to how MySQL server behaves, especially in handshake, so Sveta will file this as a bug to Sphinx and, after it's fixed, I'll retest my patch.
[7 Jul 2010 10:21] Andrew Aksyonoff
Tonci, to clarify, I should specify both PROTOCOL_41 and RESERVED flags in Sphinx at the same time, correct? Or otherwise it seems to connectors that Sphinx is behaving as some pre-4.1.0 version of MySQL?

And, to be pedantic, nothing at all is mentioned about that RESERVED flag in the wiki except that it's an "Old flag for 4.1 protocol", perhaps the wiki should be fixed as well.
[7 Jul 2010 10:31] Tonci Grgin
Andrew, correct and correct. I just sniffed the traffic and found out what MySQL server returns.
This is just a beginning of a much more comprehensive patch and I do not want to start adding various && and || where not necessary...

MySQL Protocol
  Packet Length: 56
  Packet Number: 0
  Server Greeting
    Protocol: 10
    Version: 5.1.31-log
    Thread ID: 2
    Salt: !84;j(&$
    Server Capabilities: 0xF7FF
      .... .... .... ...1 = Long Password: Set
      .... .... .... ..1. = Found Rows: Set
      .... .... .... .1.. = Long Column Flags: Set
      .... .... .... 1... = Connect With Database: Set
      .... .... ...1 .... = Don't Allow database.table.column: Set
      .... .... ..1. .... = Can use compression protocol: Set
      .... .... .1.. .... = ODBC Client: Set
      .... .... 1... .... = Can Use LOAD DATA LOCAL: Set
      .... ...1 .... .... = Ignore Spaces before '(': Set
      .... ..1. .... .... = Speaks 4.1 protocol (new flag): Set
      .... .1.. .... .... = Interactive Client: Set
      .... 0... .... .... = Switch to SSL after handshake: Not set
      ...1 .... .... .... = Ignore sigpipes: Set
      ..1. .... .... .... = Knows about transactions: Set
      .1.. .... .... .... = Speaks 4.1 protocol (old flag): Set
      1... .... .... .... = Can do 4.1 authentication: Set
  Charset: utf8 COLLATE utf8_general_ci (33)
and so on.
[14 Oct 2010 6:05] Tonci Grgin
Final patch has been pushed as rev. 985. 

Mind you, as Mark said, some values are hardcoded with no reflections in flags or version so, somewhere around line 1167 in MySQLIO there will not be changes to
  if (versionMeetsMinimum(4, 0, 8)) {
and
  this.colDecimalNeedsBump = versionMeetsMinimum(3, 23, 0);
  this.colDecimalNeedsBump = !versionMeetsMinimum(3, 23, 15);
  this.useNewUpdateCounts = versionMeetsMinimum(3, 22, 5);
as the change in "max three bytes" means large packets won't work with MySQLd < 4.1.0.

I hope the patch will prove stable enough for us to keep it.