MySQL Bugs: #66457: Query Hangs Forever With "Copying to tmp table" After Large LOAD DATA INFILE

Bug #66457	Query Hangs Forever With "Copying to tmp table" After Large LOAD DATA INFILE
Submitted:	19 Aug 2012 16:48	Modified:	5 Dec 2013 19:02
Reporter:	Rafael Salomon	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	5.5.23	OS:	Windows (Server 2003 R2)
Assigned to:	Sveta Smirnova	CPU Architecture:	Any

Description:
We have a database that stores vehicle valuation data.  Once a week we truncate all the tables and import new data with LOAD DATA LOCAL INFILE.  The largest table has around 8 million records.

When we run queries against the newly imported data too soon after the import (within 5 minutes or so), they tend to hang forever.  Looking at the processlist or engine innodb status, they will say "Copying to tmp table".  The longest I have waited to see if the query will finish is around 12 hours (at which point I killed the connection).  Generally killing the query or the connection will fix the problem.  Sometimes the connection needs to be killed multiple times.  Restarting mysql always fixes the problem.

The queries normally run in miliseconds.

How to repeat:
I will be happy to send someone at oracle the schema and sample data as well as more detailed instructions on how to repeat the problem, but I can't post the details in a public forum.

Please, send EXPLAIN results for any such slow query. Had you tried to run ANALYZE TABLE after LOAD? I wonder is old statistics is the reason for non-optimal execution plan used by the optimizer.

Here's the EXPLAIN output from the most recent query that's stalled:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,vo,ref,"PRIMARY,VehicleId,VehicleIDOptionTypeID,VehicleIdSortOrder,VehicleIdVehicleOptionId",VehicleId,4,const,54,"Using where; Using index"
1,SIMPLE,orpa,ref,"PRIMARY,PriceTypeId",PRIMARY,4,kbb-dev.vo.VehicleOptionId,7,
1,SIMPLE,vr,eq_ref,"PRIMARY,RegionId",PRIMARY,4,kbb-dev.orpa.VehicleTypeRegionId,1,
1,SIMPLE,r,eq_ref,"PRIMARY,RegionIdRegionTypeId",PRIMARY,4,kbb-dev.vr.RegionId,1,"Using where"
1,SIMPLE,pt,eq_ref,PRIMARY,PRIMARY,4,kbb-dev.orpa.PriceTypeId,1,

The query is still stalled on the server, so I'd be happy to do anything else that might be helpful.

Also, I think you're right about the ANALYZE TABLE part.  I'll have to do more testing, but it seems that if I run an ANALYZE TABLE on the largest table right after the LOAD INFILE, the problem does not occur.

The query with this EXPLAIN output can not hand in "Copying to tmp table" status... Anyway, if you'll see stalled query again, please, get the output of:

show engine innodb status\G

also.

Here's the innodb status that I captured the last time it happened:

=====================================
120816 21:22:05 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 46 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 76290 1_second, 76290 sleeps, 7344 10_second, 3758 background, 3757 flush
srv_master_thread log flush and writes: 76487
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 25937, signal count 83014
Mutex spin waits 322633, rounds 708203, OS waits 3238
RW-shared spins 41482, rounds 599908, OS waits 16887
RW-excl spins 21189, rounds 289261, OS waits 5404
Spin rounds per wait: 2.20 mutex, 14.46 RW-shared, 13.65 RW-excl
------------
TRANSACTIONS
------------
Trx id counter 1E99B776
Purge done for trx's n:o < 1E99B73D undo n:o < 0
History list length 2488
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 292, OS thread handle 0x1c18, query id 11135102 localhost 127.0.0.1 root
SHOW ENGINE INNODB STATUS
---TRANSACTION 1E99B5F2, not started
MySQL thread id 285, OS thread handle 0x19d0, query id 10244844 outgoing-1.dealerslink.com 130.94.19.113 cf-dealersites
---TRANSACTION 1E99B706, not started
MySQL thread id 272, OS thread handle 0xc70, query id 11132690 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B621, not started
MySQL thread id 271, OS thread handle 0x1dbc, query id 10245035 outgoing-1.dealerslink.com 130.94.19.113 cf-dealersites
---TRANSACTION 1E99B769, not started
MySQL thread id 270, OS thread handle 0xc2c, query id 11135081 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B6BA, not started
MySQL thread id 240, OS thread handle 0x1d5c, query id 11089480 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B75E, not started
MySQL thread id 232, OS thread handle 0x1978, query id 11135101 outgoing-1.dealerslink.com 130.94.19.113 cf-dealersites
---TRANSACTION 1E99B6D0, not started
MySQL thread id 222, OS thread handle 0x1560, query id 11128546 outgoing-1.dealerslink.com 130.94.19.113 cf-dealersites
---TRANSACTION 1E99B747, not started
MySQL thread id 94, OS thread handle 0x1e64, query id 11132880 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B75B, not started
MySQL thread id 92, OS thread handle 0x1be4, query id 11135097 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B768, not started
MySQL thread id 80, OS thread handle 0x1ea4, query id 11135087 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B475, not started
MySQL thread id 50, OS thread handle 0x19c4, query id 10244018 outgoing-1.dealerslink.com 130.94.19.113 coldfusion
---TRANSACTION 1E99B736, not started
MySQL thread id 25, OS thread handle 0x1c5c, query id 11132817 localhost 127.0.0.1 coldfusion
---TRANSACTION 1E99B775, ACTIVE 14 sec fetching rows, thread declared inside InnoDB 281
mysql tables in use 5, locked 0
MySQL thread id 162, OS thread handle 0x1aa4, query id 11135095 localhost 127.0.0.1 coldfusion Copying to tmp table
SELECT vo.VehicleOptionId
			, orpa.PriceAdjustment, orpa.PriceTypeId
			, pt.PriceTypeId, pt.DisplayName AS PriceType
		FROM vehicleoption vo
			INNER JOIN optionregionpriceadjustment orpa ON (vo.VehicleOptionId = orpa.VehicleOptionId)
			INNER JOIN pricetype pt ON (orpa.PriceTypeId = pt.PriceTypeId)
			INNER JOIN vehicleregion vr ON (orpa.VehicleTypeRegionId = vr.VehicleTypeRegionId)
			INNER JOIN region r ON (vr.RegionId = r.RegionId)
		WHERE 0=0
			AND vo.VehicleId = 353523
			AND r.RegionTypeId = 5
		ORDER BY vo.VehicleOptionId
Trx read view will not see trx with id >= 1E99B776, sees < 1E99B776
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (read thread)
I/O thread 4 state: wait Windows aio (read thread)
I/O thread 5 state: wait Windows aio (read thread)
I/O thread 6 state: wait Windows aio (write thread)
I/O thread 7 state: wait Windows aio (write thread)
I/O thread 8 state: wait Windows aio (write thread)
I/O thread 9 state: wait Windows aio (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
295860 OS file reads, 1216590 OS file writes, 350934 OS fsyncs
106.26 reads/s, 16384 avg bytes/read, 2.37 writes/s, 0.50 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 1785, seg size 1787, 40144 merges
merged operations:
 insert 166405, delete mark 294407, delete 160974
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 2668781, node heap has 2282 buffer(s)
1691.64 hash searches/s, 2230.39 non-hash searches/s
---
LOG
---
Log sequence number 297459251961
Log flushed up to   297459251961
Last checkpoint at  297459251961
0 pending log writes, 0 pending chkp writes
318892 log i/o's done, 0.33 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 683130880; in additional pool allocated 0
Dictionary memory allocated 2205667
Buffer pool size   41152
Free buffers       0
Database pages     38870
Old database pages 14328
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 581815, not young 0
148.78 youngs/s, 0.00 non-youngs/s
Pages read 295844, created 114007, written 874796
106.26 reads/s, 0.02 creates/s, 1.96 writes/s
Buffer pool hit rate 992 / 1000, young-making rate 12 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.04/s, Random read ahead 0.00/s
LRU len: 38870, unzip_LRU len: 0
I/O sum[4613]:cur[387], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
1 queries inside InnoDB, 0 queries in queue
2 read views open inside InnoDB
Main thread id 2664, state: waiting for server activity
Number of rows inserted 13355439, updated 171143, deleted 150565, read 321943255
0.13 inserts/s, 0.02 updates/s, 0.00 deletes/s, 3903.39 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

I'm not sure if this helps, but from my experience it's only the very first time the query runs after the LOAD INFILE that it hangs.  After that it will behave normally.

Thank you for the feedback.

I think fragmentation can matter. Please send us all information you promised in hidden comment.

I uploaded bug-data-66457.zip to your FTP server.  Let me know if you need anything else.  I can reproduce the bug pretty consistently.

Thank you for the feedback.

I can not find the file. Have you uploaded it to ftp://ftp.oracle.com/support/incoming/?

Here are the instructions that I followed for uploading the file:
"If the data you need to attach is more than 500KB, you should create a compressed archive of the data and a README file that describes the data with a filename that includes the bug number (example: bug-data-66457.zip), and use FTP (login with the userid anonymous and your email address) to upload the archive to ftp://ftp.oracle.com/support/incoming/. Once you have uploaded the file, add a comment to this bug to notify us about it. Note: This directory is unlistable, which means that once you have uploaded your file, you will not be able to see it. By default, all files will be deleted after 21 days with 2 advanced email warnings."

Thank you for the feedback.

I can not find this file for some reason. Please upload second time. Please also send us your configuration file.

i uploaded bug-data-66457.zip again

I got the file, thank you!

Thank you for the feedback.

I can not repeat described behavior. Please try with current version 5.5.27 and if problem still exists try to check if this is not fragmentation issue, because I saw you don't use option --innodb_file_per_table and with large ibdata this can happen. So fro test:

1. Start second instance of MySQL server on your machine, but with separate datadir, port and other options which must be unique.
2. Other options, such as InnoDB log file size, etc. leave same
3. Run test you sent to us

If problem still exists indicate exact MySQL package you use (file name you downloaded).

Not sure if this helps, but we experienced the same problem with the previous version we were running (5.5.17 I think), so the problem is not just with 5.5.23.  Unfortunately I can't run the test you're suggesting since the machine is our production machine.  Also, since the workaround of running 'ANALYZE TABLE' after the import fixes the problem, the issue is no longer urgent for my employer.  I'll be happy to get you any information from our existing installation, if that helps any.

Thank you for the feedback.

We don't backport bugfixes to already released versions, therefore it is interesting if problem is repeatable with newer version. Also, if "ANALYZE TABLE" fixes this, I think my guess about issue happens, because fragmentation, can be correct.

So, please, check when you occasionally upgrade to 5.5.27 or some future version and update the report.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

I have experienced this problem too and found out that certain firewalls and/or firewall settings (outbound/inbound protection) might contribute to this problem.
Disable your firewall completely (if possible) and when this problem disappears, the firewalll is the cause.
I got this problem after the (standard) installation of ZoneAlarm. After correctly configuring this firewall, the problem disappeared.

John,

thank you for the feedback. In your case it was not result of a bug in MySQL code. I will close the report as "Can't repeat".