Bug #11104 | substring_index() will see two delimiters as one under certain conditions | ||
---|---|---|---|
Submitted: | 5 Jun 2005 11:31 | Modified: | 8 Aug 2005 20:53 |
Reporter: | Bob Stein (Candidate Quality Contributor) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 4.1.10, 4.1.12 | OS: | Windows (Windows, Linux) |
Assigned to: | Bugs System | CPU Architecture: | Any |
[5 Jun 2005 11:31]
Bob Stein
[5 Jun 2005 13:26]
Jorge del Conde
Thank you for your bug report. I was able to verify this bug using 5.0.X from bk: mysql> SELECT SUBSTRING_INDEX('the king of the the hill','the',-2) +-------------------------------------------------------+ | SUBSTRING_INDEX('the king of the the hill','the',-2) | +-------------------------------------------------------+ | the hill | +-------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT SUBSTRING_INDEX('the king of the the hill', -> 'the',-2) -> ; +-------------------------------------------------------+ | SUBSTRING_INDEX('the king of the the hill', 'the',-2) | +-------------------------------------------------------+ | king of the the hill | +-------------------------------------------------------+ 1 row in set (0.00 sec) mysql>
[27 Jul 2005 17:14]
Patrick Galbraith
Investigating the 'else' in Item_func_substr_index which handles negative indexes. I suspect String.strstr might be called incorrectly.
[27 Jul 2005 19:20]
Patrick Galbraith
In investigating: Query: mysql> select substring_index('aaaa1', 'aa', -2); +------------------------------------+ | substring_index('aaaa1', 'aa', -2) | +------------------------------------+ | aaaa1 | +------------------------------------+ Item_func_substr_index::val_str: res->Ptr = "aaaa1" res->str_length = 5 delimeter->Ptr = "aa" delimeter->str_length= 2 count= -2 offset= 5 (this means 'aaaa1' pointing to '1') delimiter_length= 2 call to offset= res->strstr("aa", 5) offset= 2 (this means pointing to second 'a' in 'aaaa1') delimeter_length= 2 if(!++count) yields count= -1 offset -= delimiter_length-1 = 2-1= 1 ??? (this means pointing to very first 'a' in 'aaaa1') Why is this being decremented after strstr has already 'found' the location for us? I'm wondering if we need to be checking/counting how many times we've 'found' the search string, so we know that we _don't_ want to decrement this?
[27 Jul 2005 22:12]
Patrick Galbraith
The more I analyse this I see that we decrement by too much - in the case of "the kind of the the hill", we start out with an offset of 24. We call strstr, it finds the first reverse substring of "the", returns an offset of 16, which would be pointing at "the king of the* the hill", we then decrement by delimiter length -1 to 14. It's now pointing at "the king of t*he the hill", we then call strstr again, but it cannot find the second 'the' (in reverse) because we've decremented the offset by too much - there's no way to find it with "the king of th", and the only 'the' it can find is at the beginning!
[27 Jul 2005 22:53]
Patrick Galbraith
Potential fix results: mysql> select substring_index('the king of the the hill', 'the', -2); +--------------------------------------------------------+ | substring_index('the king of the the hill', 'the', -2) | +--------------------------------------------------------+ | the hill | +--------------------------------------------------------+ 1 row in set (0.01 sec) mysql> select substring_index('the king of the.the hill', 'the', -2); +--------------------------------------------------------+ | substring_index('the king of the.the hill', 'the', -2) | +--------------------------------------------------------+ | .the hill | +--------------------------------------------------------+ 1 row in set (0.00 sec) mysql> select substring_index('the king of theorthe hill', 'the', -2); +---------------------------------------------------------+ | substring_index('the king of theorthe hill', 'the', -2) | +---------------------------------------------------------+ | orthe hill | +---------------------------------------------------------+ 1 row in set (0.01 sec) mysql> select substring_index('the king of the the hill', 'the', -2); +---------------------------------------------------------+ | substring_index('the king of the the hill', 'the', -2) | +---------------------------------------------------------+ | the hill | +---------------------------------------------------------+ 1 row in set (0.01 sec) mysql> select substring_index('the king of the the hill', 'the', -2); +----------------------------------------------------------+ | substring_index('the king of the the hill', 'the', -2) | +----------------------------------------------------------+ | the hill | +----------------------------------------------------------+ 1 row in set (0.01 sec) mysql> select substring_index('the king of thethethe hill', 'the', -2); +-----------------------------------------------------------+ | substring_index('the king of thethethe hill', 'the', -2) | +-----------------------------------------------------------+ | the hill | +-----------------------------------------------------------+ 1 row in set (0.02 sec) mysql> select substring_index('the king of thethethe hill', 'the', -3); +-----------------------------------------------------------+ | substring_index('the king of thethethe hill', 'the', -3) | +-----------------------------------------------------------+ | thethe hill | +-----------------------------------------------------------+ 1 row in set (0.01 sec) mysql> select substring_index('the king of thethethe hill', 'the', -4); +-----------------------------------------------------------+ | substring_index('the king of thethethe hill', 'the', -4) | +-----------------------------------------------------------+ | king of thethethe hill | +-----------------------------------------------------------+ 1 row in set (0.00 sec) mysql> select substring_index('the king of thethethe hill', 'the', -1); +-----------------------------------------------------------+ | substring_index('the king of thethethe hill', 'the', -1) | +-----------------------------------------------------------+ | hill | +-----------------------------------------------------------+ 1 row in set (0.00 sec)
[28 Jul 2005 1:06]
Patrick Galbraith
committed changeset 1.1891
[8 Aug 2005 20:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/28034
[8 Aug 2005 20:53]
Patrick Galbraith
patch approved and pushed.