Bug #27746 Use Compression is 2x slower
Submitted: 11 Apr 2007 6:19 Modified: 18 Apr 2007 0:31
Reporter: Jared Sullivan (Silver Quality Contributor)
Status: Can't repeat
Category: Connector/Net Severity: S3 (Non-critical)
Version: 5.0.5 OS: Microsoft Windows (Vista)
Assigned to: Tonci Grgin Target Version:
Tags: Connection, remote, use compression

[11 Apr 2007 6:19] Jared Sullivan
Description:
Hi, after benchmarking my software over a remote connection I have found that setting
"use compression=true" in the connection string makes general DB hits twice as slow.

I personally don't understand why this is, given the power of CPUs on the market today.

Maybe Tonci could post a comment here to help me understand why turning on compression
hurts DB hits.

How to repeat:
1. Set up a remote DB
2. Hit 20/30 tables returning small results

RESULT : performance is 2x worse than with compression turned off!
[11 Apr 2007 11:03] Tonci Grgin
Hi Jared and thanks for your report. Unfortunately I can't give it a priority right now
but will try to test as soon as possible.
[15 Apr 2007 16:06] Tonci Grgin
Jared, this could be true for small datasets where there's nothing to compress... But x2
seems too much so I'll need to do some serious benchmarking tomorrow.
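
The point about small datasets can be illustrated outside the connector. The following is a minimal sketch, assuming .NET's DeflateStream as a stand-in for the zlib compression used on the wire (CompressionDemo and CompressedSize are hypothetical names, not part of Connector/NET). A payload of a few bytes should come out larger after compression, while a large repetitive one shrinks considerably.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class CompressionDemo
{
    // Hypothetical helper: returns the size in bytes of 'payload' after raw deflate.
    public static int CompressedSize(byte[] payload)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            // leaveOpen = true so the buffer can still be inspected after the flush.
            using (DeflateStream deflate = new DeflateStream(buffer, CompressionMode.Compress, true))
            {
                deflate.Write(payload, 0, payload.Length);
            }
            return (int)buffer.Length;
        }
    }

    static void Main()
    {
        byte[] small = Encoding.ASCII.GetBytes("42");                  // typical tiny result value
        byte[] large = Encoding.ASCII.GetBytes(new string('a', 10000)); // highly repetitive data

        Console.WriteLine("small: " + small.Length + " -> " + CompressedSize(small) + " bytes");
        Console.WriteLine("large: " + large.Length + " -> " + CompressedSize(large) + " bytes");
    }
}
```

This only shows the size effect; it says nothing about where Connector/NET spends its time.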
[16 Apr 2007 13:06] Tonci Grgin
Jared, I did a few tests against my 4.1.23BK server on Linux from a WinXP Pro SP2 client over
25 different tables with business data, each with at least a few thousand rows. I used the
latest c/NET 5 sources. I can't really verify your findings as my results are not that
different. As I don't have Vista 32-bit, I tested in a Vista VM. Results are the same (maybe a
few milliseconds in favor of "use compression=true" on small result sets).

First I checked the network:
 - During all tests, network load did not go above 10%
 - Using "use compression=true" was 10 times more efficient than "use compression=false"
in terms of traffic.
 - For a difference to show, the mysql command-line client with --compress needs at least
1000 rows fetched. Even then, the difference is hardly measurable.

Then I checked the processor (P4 2.4, 1GB RAM):
 - The processor hardly feels the difference.
 - Testing the same queries, I've found that times are comparable, but when increasing
the number of records returned, "use compression=false" is winning! I measured each option
twice, restarting after each change of the compression option. This shows the processor could
be the problem when streams get bigger. So you have a choice: if the network is the problem
then use compression; if the client processor is, then don't.

10 records from tables with at least a few thousand rows, up to ~1/4 million rows. MyISAM
Test compr. 1st pass:00:00:02.9375000
Test compr. 2nd pass:00:00:02.8593750

Test no-compr. 1st pass:00:00:02.1250000
Test no-compr. 2nd pass:00:00:02.0937500
---
50 records from the same tables
Test compr. 1st pass:00:00:02.9375000
Test compr. 2nd pass:00:00:02.9062500

Test no-compr. 1st pass:00:00:02.1406250
Test no-compr. 2nd pass:00:00:02.1093750
---
250 records from the same tables
Test compr. 1st pass:00:00:03.3281250
Test compr. 2nd pass:00:00:03.1250000

Test no-compr. 1st pass:00:00:02.3281250
Test no-compr. 2nd pass:00:00:02.2968750
---
300 records from the same tables
Test compr. 1st pass:00:00:03.3593750
Test compr. 2nd pass:00:00:03.2812500

Test no-compr. 1st pass:00:00:02.4843750
Test no-compr. 2nd pass:00:00:02.4687500

Sample test code:
    MySqlConnection conn = new MySqlConnection();
    conn.ConnectionString =
        "DataSource=munja;Database=solusd2;UserID=root;Password=;PORT=3307;use compression=true";
    conn.Open();
    MySqlCommand cmd = new MySqlCommand();
    cmd.Connection = conn;
    cmd.CommandTimeout = 0;
    cmd.CommandType = CommandType.Text;
    System.DateTime started = DateTime.Now;
    Console.WriteLine("Test compr. start:" + started.ToString());
    //1
    cmd.CommandText = "SELECT * FROM dnevnik ORDER BY Konto LIMIT 500";
    MySqlDataReader dr = cmd.ExecuteReader();
    try
    {
        while (dr.Read())
        {
        }
    }
    finally
    {
        dr.Close();
    }
--<cut>--
    System.DateTime ended = DateTime.Now;
    TimeSpan diff = ended - started;
    Console.WriteLine("Test compr. 1st pass:" + diff.ToString());
    Console.WriteLine("Restarting");
//-------------------------------------

As I don't have a profiler, I can't really tell which part of the code is not optimized...
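
All timings listed above are multiples of 1/64 s (15.625 ms), which suggests they are limited by the granularity of DateTime.Now on Windows. A minimal sketch of a finer-grained harness follows, assuming System.Diagnostics.Stopwatch and the non-generic Action delegate are available (.NET 3.5 or later). TimingSketch and Measure are hypothetical names, and the Thread.Sleep call only stands in for the query loop.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class TimingSketch
{
    // Hypothetical helper: runs 'body' once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        body();
        watch.Stop();
        return watch.Elapsed;
    }

    static void Main()
    {
        // Placeholder workload; in a real test this would execute the
        // queries and drain each reader, as in the sample code above.
        TimeSpan elapsed = Measure(() => Thread.Sleep(50));
        Console.WriteLine("Test pass: " + elapsed);
    }
}
```

Wrapping each pass in Measure keeps connection setup and console output out of the timed region if only the query loop is passed in.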
[17 Apr 2007 1:35] Jared Sullivan
I will log another bug which should show a 250+ ms compression delay using MySQL's official
'World' database on my private server.
[17 Apr 2007 9:03] Tonci Grgin
Jared, you know the drill :) If I missed something, please reopen this report and tell me,
don't open another one.

I've been reading this over and found that I might have misinterpreted the meaning of
"remote"... I tested on LAN. The closest thing to "remote" for me is either a flaky internet
connection to BugsDB or a 1- or 2-line ISDN connection. Do you think I should try that? What
about my test case? Is it similar to what you used? What's your client computer's processor?
What's your line info? Did you try using --compress with the mysql command-line client?

In my experience, when you use a "remote" ISDN connection with header compression turned on
you'll get better results than with "use compression=true" on small result sets (data etc.
that is not compressed before sending). Check out RFC 2507 (IP Header Compression):
http://www.ietf.org/rfc/rfc2507.txt
[18 Apr 2007 0:31] Jared Sullivan
I CANNOT re-open this bug; it appears you have a bug in your bug system.  This performance
drop I am reporting has NOTHING to do with connection speed or pipe width.  Therefore you
should be able to repeat this bug on your own computer connected to your localhost, just by
setting use compression = true.  Yes - I did connect to the DB using Query Browser and
with the 'compression checkbox' checked there was no noticeable performance drop, which means
this has to be an issue with the NET connector.  Processors are P4/C2; 250 ms per hit seems
like too much overhead for simple byte stream decompression.

Please see my new bug http://bugs.mysql.com/bug.php?id=27865.
[18 Apr 2007 7:51] Tonci Grgin
Continued in Bug#27865.