| Bug #79400 | Uncompress function could be much better | ||
|---|---|---|---|
| Submitted: | 24 Nov 2015 20:15 | Modified: | 17 Dec 2018 13:29 |
| Reporter: | Kyle Lahnakoski | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: DML | Severity: | S3 (Non-critical) |
| Version: | 14.14 Distrib 5.6.27 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[30 Nov 2015 8:58]
MySQL Verification Team
Hello Kyle, Thank yo for the report. Could you please provide exact repeatable test case(schema and subset of data)? You may want to mark it as private after uploading here. If you can provide more information, feel free to add it to this bug and change the status back to 'Open'. Thank you for your interest in MySQL. Thanks, Umesh
[5 Dec 2015 17:38]
Kyle Lahnakoski
Sorry for the delay.
For the breaking test: I expect
SELECT
hex(uncompress(concat(unhex('0000000'), substring(compress(unhex('65792F81F95AD61F1A25')), 5))))
to return '65792F81F95AD61F1A25', not null. This test is compressing a string, and replacing the first four bytes with 0x0000, then decompressing.
For some reason this works (notice the '09...')
SELECT
hex(uncompress(concat(unhex('0900000'), substring(compress(unhex('65792F81F95AD61F1A25')), 5))))
Here is a more comprehensive, but random, test I use to find particular problem sequences:
drop table if exists test_numbers;
create table test_numbers (
value integer
);
INSERT INTO TEST_NUMBERS VALUES (0);
INSERT INTO TEST_NUMBERS VALUES (1);
INSERT INTO TEST_NUMBERS VALUES (2);
INSERT INTO TEST_NUMBERS VALUES (3);
INSERT INTO TEST_NUMBERS VALUES (4);
INSERT INTO TEST_NUMBERS VALUES (5);
INSERT INTO TEST_NUMBERS VALUES (6);
INSERT INTO TEST_NUMBERS VALUES (7);
INSERT INTO TEST_NUMBERS VALUES (8);
INSERT INTO TEST_NUMBERS VALUES (9);
SELECT
substring(a.data, 1) expected,
uncompress(concat(unhex('0000000'), substring(compress(data), 5))) result
FROM
(
select
group_concat(char(round(rand()*256)) separator '') data
from
test_numbers
) a
[13 Nov 2018 20:04]
MySQL Verification Team
miguel@luz:~/dbs $ ./57c
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.25 Source distribution BUILD: 2018-NOV-11
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql 5.7 > CREATE DATABASE e;
Query OK, 1 row affected (0,03 sec)
mysql 5.7 > USE e
Database changed
mysql 5.7 > create table TEST_NUMBERS (
-> value integer
-> );
Query OK, 0 rows affected (0,86 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (1);
Query OK, 1 row affected (0,13 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (2);
Query OK, 1 row affected (0,39 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (3);
Query OK, 1 row affected (0,13 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (4);
Query OK, 1 row affected (0,13 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (5);
Query OK, 1 row affected (0,14 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (6);
Query OK, 1 row affected (0,13 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (7);
Query OK, 1 row affected (0,13 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (8);
Query OK, 1 row affected (0,14 sec)
mysql 5.7 > INSERT INTO TEST_NUMBERS VALUES (9);
Query OK, 1 row affected (0,11 sec)
mysql 5.7 > SELECT
-> hex(uncompress(concat(unhex('0000000'), substring(compress(unhex('65792F81F95AD61F1A25')), 5))));
+--------------------------------------------------------------------------------------------------+
| hex(uncompress(concat(unhex('0000000'), substring(compress(unhex('65792F81F95AD61F1A25')), 5)))) |
+--------------------------------------------------------------------------------------------------+
| NULL |
+--------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0,00 sec)
mysql 5.7 > SHOW WARNINGS;
+---------+------+----------------------------+
| Level | Code | Message |
+---------+------+----------------------------+
| Warning | 1259 | ZLIB: Input data corrupted |
+---------+------+----------------------------+
1 row in set (0,00 sec)
mysql 5.7 > SELECT
-> hex(uncompress(concat(unhex('0900000'), substring(compress(unhex('65792F81F95AD61F1A25')), 5))));
+--------------------------------------------------------------------------------------------------+
| hex(uncompress(concat(unhex('0900000'), substring(compress(unhex('65792F81F95AD61F1A25')), 5)))) |
+--------------------------------------------------------------------------------------------------+
| 65792F81F95AD61F1A25 |
+--------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)
[14 Nov 2018 12:38]
MySQL Verification Team
Hi, The length prefix is required for the uncompressing function to know how much memory is required for the uncompressed data. That is how it functions. All that we can do is to document that fact.
[15 Dec 2018 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".

Description: The UNCOMPRESS function is perfectly capable of unzipping blobs of data put into a table by Python's zlib.compress() function, except for those first four bytes containing the little-endian `length` of the uncompressed string. I do not know what role the `length` plays while decompressing, but it is preventing me from accessing the uncompressed values! How to repeat: For some reason, this works on every example I have so far: select id, length(`blob`) `length`, coalesce( uncompress(concat(unhex('0900000'), `blob`)) ) `bytes` from job_artifact Mind you, I only have 34K examples, and none very big. Suggested fix: Please have UNCOMPRESS ignore those first four bytes so magic number 9 is not needed.