Bug #72673 SUBSTRING_INDEX with negative index sometimes returns uninitialized memory
Submitted: 16 May 2014 22:38 Modified: 2 Jun 2014 11:29
Reporter: Arthur O'Dwyer Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:5.5.31, 5.5.34 OS:Any
Assigned to: Tor Didriksen CPU Architecture:Any
Tags: SUBSTRING_INDEX

[16 May 2014 22:38] Arthur O'Dwyer
Description:
Should be self-explanatory...

select hay, needle, i, substring_index(hay, needle, i) from t;

    +----------------+--------+------+---------------------------------+
    | hay            | needle | i    | substring_index(hay, needle, i) |
    +----------------+--------+------+---------------------------------+
    | banananananana | ana    |   -3 | nananana                        |
    | xyz            | xyz    |   -3 | naxyzana                        |
    +----------------+--------+------+---------------------------------+

How to repeat:
drop database if exists db1;
create database db1;
use db1;
create table t (hay blob, needle blob, i int);
insert into t values ('banananananana', 'ana', -3);
insert into t values ('xyz', 'xyz', -3);
select hay, needle, i, substring_index(hay, needle, i) from t;

    +----------------+--------+------+---------------------------------+
    | hay            | needle | i    | substring_index(hay, needle, i) |
    +----------------+--------+------+---------------------------------+
    | banananananana | ana    |   -3 | nananana                        |
    | xyz            | xyz    |   -3 | naxyzana                        |
    +----------------+--------+------+---------------------------------+

The problem is that MySQL's Item_func_substr_index::val_str() sometimes returns this->tmp_value without initializing it; it ends up holding garbage from the previous call.

Suggested fix:
Initialize tmp_value at the top of Item_func_substr_index::val_str().
Alternatively, you could probably get rid of it altogether and just use a local variable inside the function. I don't know why it was originally a member variable persisted across calls; seems to me like that's just asking for trouble.
[17 May 2014 6:01] MySQL Verification Team
Repeatable on 5.5.  Not repeatable on 5.6 or 5.7.

Version: '5.5.37'  socket: 'sock'  port: 3306  MySQL Community Server (GPL)
Invalid read of size 2
at: __GI_memmove (mc_replace_strmem.c:1071)
by: Protocol::net_store_data (protocol.cc:59)
by: Item::send (item.cc:5950)
by: Protocol::send_result_set_row (protocol.cc:847)
by: select_send::send_data (sql_class.cc:2190)
by: end_send (sql_select.cc:12763)
by: evaluate_join_record (sql_select.cc:11944)
by: sub_select (sql_select.cc:11797)
by: do_select (sql_select.cc:11562)
by: JOIN::exec (sql_select.cc:2385)
by: mysql_select (sql_select.cc:2604)
by: handle_select (sql_select.cc:297)
by: execute_sqlcom_select(sql_parse.cc:4625)
by: mysql_execute_command (sql_parse.cc:2176)
by: mysql_parse (sql_parse.cc:5662)
by: dispatch_command (sql_parse.cc:1038)
by: do_handle_one_connection (sql_connect.cc:862)
by: handle_one_connection (sql_connect.cc:781)
by: start_thread (pthread_create.c:309)
by: clone (clone.S:115)
 Address 0x142386fe is 126 bytes inside a block of size 16,504 free'd
at: free (vg_replace_malloc.c:468)
by: mem_heap_block_free (mem0mem.c:517)
by: mem_heap_free_func (mem0mem.ic:500)
by: row_sel_store_mysql_rec (row0sel.c:2728)
by: row_search_for_mysql (row0sel.c:4586)
by: ha_innobase::general_fetch (ha_innodb.cc:6193)
by: rr_sequential (records.cc:461)
by: sub_select (sql_select.cc:11801)
by: do_select (sql_select.cc:11562)
by: JOIN::exec() (sql_select.cc:2385)
by: mysql_select (sql_select.cc:2604)
by: handle_select (sql_select.cc:297)
by: execute_sqlcom_select (sql_parse.cc:4625)
by: mysql_execute_command (sql_parse.cc:2176)
by: mysql_parse (sql_parse.cc:5662)
by: dispatch_command (sql_parse.cc:1038)
by: do_handle_one_connection (sql_connect.cc:862)
by: handle_one_connection (sql_connect.cc:781)
by: start_thread (pthread_create.c:309)
by: clone (clone.S:115)

mysql> select hay, needle, i, substring_index(hay, needle, i) from t;
+----------------+--------+------+---------------------------------+
| hay            | needle | i    | substring_index(hay, needle, i) |
+----------------+--------+------+---------------------------------+
| banananananana | ana    |   -3 | nananana                        |
| xyz            | xyz    |   -3 | nananana                        |
+----------------+--------+------+---------------------------------+
2 rows in set (4 min 20.94 sec)
[2 Jun 2014 11:23] Tor Didriksen
Posted by developer:
 
Fixed in 5.6.5 by the patch for:
Bug#11829861 - SUBSTRING_INDEX() RESULTS "OMIT" CHARACTER WHEN USED INSIDE LOWER()
Bug#60166 SUBSTRING_INDEX() results "omit" character when used inside LOWER()