Bug #27746 Use Compression is 2x slower
Submitted: 11 Apr 2007 4:19 Modified: 17 Apr 2007 22:31
Reporter: Jared S (Silver Quality Contributor) Email Updates:
Status: Can't repeat Impact on me: None
Category:Connector / NET Severity:S3 (Non-critical)
Version:5.0.5 OS:Windows (Vista)
Assigned to: CPU Architecture:Any
Tags: Connection, remote, use compression

[11 Apr 2007 4:19] Jared S
Description:
Hi, after benchmarking my software over a remote connection I have found that setting "use compression=true" in the connection string makes general DB hits twice as slow.

I personally don't understand why this is, given the power of CPUs on the market today.

Maybe Tonci could blog a comment here to help me understand why turning on compression hurts DB hits.

How to repeat:
1. Setup remote DB
2. Hit 20/30 tables returning small results

RESULT: performance is 2x worse than with compression turned off!
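The repeat steps above could be sketched with Connector/NET roughly as follows. This is a hypothetical benchmark, not the reporter's actual code: the server, database, credentials, and table names are placeholders, and only the `Use Compression` flag differs between the two runs.

```csharp
// Hypothetical sketch: time the same batch of small queries with and
// without "Use Compression". All names below are placeholder assumptions.
using System;
using System.Diagnostics;
using MySql.Data.MySqlClient;   // Connector/NET

class CompressionBench
{
    static TimeSpan TimeQueries(bool compress)
    {
        string cs = "Server=remotehost;Database=test;Uid=user;Pwd=secret;" +
                    "Use Compression=" + (compress ? "true" : "false") + ";";
        Stopwatch sw = Stopwatch.StartNew();
        using (MySqlConnection conn = new MySqlConnection(cs))
        {
            conn.Open();
            for (int i = 0; i < 20; i++)   // "hit 20/30 tables" with small results
            {
                MySqlCommand cmd = new MySqlCommand(
                    "SELECT * FROM t LIMIT 50", conn);
                using (MySqlDataReader dr = cmd.ExecuteReader())
                {
                    while (dr.Read()) { }  // drain the small resultset
                }
            }
        }
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        Console.WriteLine("compressed:   " + TimeQueries(true));
        Console.WriteLine("uncompressed: " + TimeQueries(false));
    }
}
```

Running both passes in the same process against the same remote server should make a systematic 2x gap visible if it exists.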
[11 Apr 2007 9:03] Tonci Grgin
Hi Jared and thanks for your report. Unfortunately I can't give it a priority right now but will try to test as soon as possible.
[15 Apr 2007 14:06] Tonci Grgin
Jared, this could be true for small datasets where there's nothing to compress... But 2x seems too much, so I'll need to do some serious benchmarking tomorrow.
[16 Apr 2007 11:06] Tonci Grgin
Jared, I did a few tests against my Linux 4.1.23BK server from a WinXP Pro SP2 client over 25 different tables with business data, each with at least a few thousand rows. I used the latest c/NET 5 sources. I can't really verify your findings as my results are not that different. As I don't have Vista 32bit, I tested in a Vista VM. Results are the same (maybe a few milliseconds in favor of "use compression=true" on small resultsets).

First I checked network:
 - During all tests, network load did not go above 10%
 - Using "use compression=true" was 10 times more efficient than "use compression=false" in terms of traffic.
 - For the difference to show, the mysql cl client with --compress needs at least 1000 rows fetched. Even then, the difference is hardly measurable.
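The command-line comparison above can be reproduced roughly like this; host, credentials, and table name are placeholders, not values from this report:

```shell
# Same query with and without the compressed client/server protocol.
# Compare wall time and network traffic between the two invocations.
mysql --compress -h remotehost -u user -p -e "SELECT * FROM t LIMIT 1000" > /dev/null
mysql            -h remotehost -u user -p -e "SELECT * FROM t LIMIT 1000" > /dev/null
```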

Then I checked processor (P4 2.4, 1GB RAM):
 - Processor hardly feels the difference.
 - Testing the same queries, I've found that times are comparable, but as the number of records returned increases, "use compression=false" is winning! I measured each option twice, restarting between changes of the compression option. This shows the processor could be the problem when streams get bigger. So you have a choice: if the network is the problem, use compression; if the client processor is, don't.

10 records from tables with at least a few thousand rows, up to ~1/4 million rows. MyISAM.
Test compr. 1st pass:00:00:02.9375000
Test compr. 2nd pass:00:00:02.8593750

Test no-compr. 1st pass:00:00:02.1250000
Test no-compr. 2nd pass:00:00:02.0937500
---
50 records from the same tables
Test compr. 1st pass:00:00:02.9375000
Test compr. 2nd pass:00:00:02.9062500

Test no-compr. 1st pass:00:00:02.1406250
Test no-compr. 2nd pass:00:00:02.1093750
---
250 records from the same tables
Test compr. 1st pass:00:00:03.3281250
Test compr. 2nd pass:00:00:03.1250000

Test no-compr. 1st pass:00:00:02.3281250
Test no-compr. 2nd pass:00:00:02.2968750
---
300 records from the same tables
Test compr. 1st pass:00:00:03.3593750
Test compr. 2nd pass:00:00:03.2812500

Test no-compr. 1st pass:00:00:02.4843750
Test no-compr. 2nd pass:00:00:02.4687500

Sample test code:
    // Requires Connector/NET: using MySql.Data.MySqlClient; using System.Data;
    MySqlConnection conn = new MySqlConnection();
    conn.ConnectionString = "DataSource=munja;Database=solusd2;UserID=root;Password=;PORT=3307;use compression=true";
    conn.Open();
    MySqlCommand cmd = new MySqlCommand();
    cmd.Connection = conn;
    cmd.CommandTimeout = 0;              // no command timeout
    cmd.CommandType = CommandType.Text;
    System.DateTime started = DateTime.Now;
    Console.WriteLine("Test compr. start:" + started.ToString());
    //1
    cmd.CommandText = "SELECT * FROM dnevnik ORDER BY Konto LIMIT 500";
    MySqlDataReader dr = cmd.ExecuteReader();
    try
    {
        // Drain the resultset; only fetch/decompression time is measured.
        while (dr.Read())
        {
        }
    }
    finally
    {
        dr.Close();
    }
--<cut>--
    System.DateTime ended = DateTime.Now;
    TimeSpan diff = ended - started;
    Console.WriteLine("Test compr. 1st pass:" + diff.ToString());
    Console.WriteLine("Restarting");
//-------------------------------------

As I don't have a profiler, I can't really tell which part of the code is not optimized...
[16 Apr 2007 23:35] Jared S
I will log another bug which should show a 250+ ms compression delay using the official MySQL 'World' database on my private server.
[17 Apr 2007 7:03] Tonci Grgin
Jared, you know the drill :) If I missed something, please reopen this report and tell me, don't open another one.

I've been reading this over and found that I might have misinterpreted the meaning of "remote"... I tested on a LAN. The closest thing to "remote" for me is either a flaky internet connection to BugsDB or a 1- or 2-line ISDN connection. Do you think I should try that? What about my test case, is it similar to what you used? What's your client computer's processor? What's your line info? Did you try using --compress with the mysql cl client?

In my experience, when you use a "remote" ISDN connection with header compression turned on, you'll get better results than with "use compression=true" on small resultsets (data etc. that is not compressed before sending). Check out this discussion: http://www.ietf.org/rfc/rfc2507.txt
[17 Apr 2007 22:31] Jared S
I CAN NOT re-open this bug; it appears you have a bug in your bug system. The performance drop I am reporting has NOTHING to do with connection speed or pipe width. Therefore you should be able to repeat this bug on your own computer connected to your localhost, simply by flagging use compression = true. Yes - I did connect to the DB using Query Browser with the 'compression checkbox' checked, and there was no noticeable performance drop, which means this has to be an issue with the NET connector. Processors are P4/C2; 250 ms per hit seems like too much overhead for simple byte stream decompression.

Please see my new bug http://bugs.mysql.com/bug.php?id=27865.
[18 Apr 2007 5:51] Tonci Grgin
Continued in Bug#27865.