Bug #14682 | big LOAD DATA causes too many connections | ||
---|---|---|---|
Submitted: | 6 Nov 2005 18:12 | Modified: | 4 Oct 2006 23:24 |
Reporter: | Rick James | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server | Severity: | S5 (Performance) |
Version: | 4.0.23-4.0.26 | OS: | FreeBSD (FreeBSD 4.8) |
Assigned to: | CPU Architecture: | Any |
[6 Nov 2005 18:12]
Rick James
[6 Nov 2005 20:59]
Rick James
The same happens just when DELETE on that big table commences before the LOAD even started. So basicly ending of any massive operation on that huge table (5GB) triggers that. It looks like db hangs on the "cleanup after the big operation" stage.
[7 Nov 2005 7:18]
Valeriy Kravchuk
Thank you for a problem report. Please, send the results (or, better, upload them as file) of the SHOW PROCESSLIST command executed when you have this hang. Send the content of your my.cnf file too. Have you tried to work with newer versions (4.0.26)? Have you seen similar problems on other operating systems? If you built the server yourself from sources, please, inform about the configure options used.
[7 Nov 2005 19:07]
Rick James
Have not tried any newer versions. Do you know of any possible fixes? Can I send processlist some other way? It may contain sensitive info. Here is my.cnf: # -*- sh -*- [mysqld] port = 3306 socket = /tmp/mysql.sock user = mysql skip-locking set-variable = max_connect_errors=999999999 basedir = /home/y datadir = /home/y/var/mysql/data log-error = /home/y/logs/mysql/mysqld.err tmpdir = /tmp/ pid-file = /home/y/var/mysql/mysqld.pid skip-bdb set-variable = thread_cache=100 ## Used in full joins (when indexes are not invoved). This should be ## used only when adding an index is impractical. set-variable = join_buffer=1M ## key_buffer is ised to cache index blocks (MyISAM only). ## This is the best use of memory other than the OS filesystem cache ## This helps out sequential scans (such as a full table scan in an ## query which does not (or can not) use an index. set-variable = record_buffer=1M ## Each thread that needs to do a sort will allocate a buffer of this ## size. Used in ORDER BY queries. set-variable = sort_buffer=1M ## What's the largest a temporary table may be? Used in GROUP BY ## queries. set-variable = tmp_table_size=32M ## table_cache limits how many tables mysqld will keep open. ## Each table requires two file desciptors. ## wait_timeout prevents folks from holding idle connections for very long. ## They can reconnect if they need. ## Note: Raising wait_timeout or max_connections may require ## raising the other. ## avoid FreeBSD DNS problems skip-name-resolve ## IP vs Hostname (also skip-host-cache) ## avoid "niced" "threads" on FreeBSD skip-thread-priority ## max_connections is the total number of open connections we're ## willing to handle at a given time. ## If we set this too high, we could run out of file descriptiors. ## max_user_connectionslimits the number of connections that any one ## user can have to the server at a given time. ## Prevents one runaway from slamming the ## server and using up all available connections. ## replication -- how a slave sees the master: ##master-host = db.finance.yahoo.com ##master-user = repl ##master-password = C0pyIT ##master-connect-retry = 15 set-variable = slave_net_timeout=60 ## Don't bother replicating these: binlog-ignore-db=CapIQ binlog-ignore-db=Log #binlog-ignore-table=IDX.Dirty #binlog-ignore-table=IDX.Dist #binlog-ignore-table=IDX.Q4Quotes ## (used to include Moreover) # Master sends data to slave's IO_thread, which writes it to relay-log: # These are needed for Backup Master and Relay slave: relay-log = /home/y/var/mysql/repl/relay-log relay-log-index = /home/y/var/mysql/repl/relay-index relay-log-info-file = /home/y/var/mysql/repl/relay-log.info # Relay slave records some databases into the binary log ## RELAY: Acting as intl-east.db.finance, we need these: ####binlog-do-db=Yoda ####binlog-do-db=Live ####binlog-do-db=Datascope ####binlog-do-db=Chart ####binlog-do-db=CoreData log-bin = /home/y/var/mysql/repl/binary-log log-bin-index = /home/y/var/mysql/repl/index log-slave-updates ## End RELAY ## Tuned InnoDB settings for Finance Feeds innodb_data_file_path = ibdata1:4000M;ibdata2:4000M;ibdata3:4000M;ibdata4:4000M;ibdata5:4000M;ibdata6:100M:autoextend innodb_data_home_dir = /home/y/var/mysql/ibdata/ innodb_log_group_home_dir = /home/y/var/mysql/iblog/ innodb_log_arch_dir = /home/y/var/mysql/iblog/ # Probably optimal for RAID: innodb_flush_log_at_trx_commit = 1 innodb_log_archive = 0 set-variable = innodb_mirrored_log_groups=1 set-variable = innodb_log_files_in_group=3 set-variable = innodb_log_file_size=16M set-variable = innodb_log_buffer_size=8M set-variable = innodb_buffer_pool_size=800M set-variable = innodb_additional_mem_pool_size=4M set-variable = innodb_file_io_threads=4 set-variable = innodb_lock_wait_timeout=50 ## Tuned parameters for Finance Feeds set-variable = key_buffer=700M set-variable = max_allowed_packet=16M set-variable = table_cache=600 set-variable = sort_buffer=1M set-variable = net_buffer_length=8K set-variable = max_connections=500 set-variable = max_user_connections=150 set-variable = myisam_sort_buffer_size=32M set-variable = wait_timeout=40 set-variable = thread_concurrency=6 set-variable = long_query_time=1 log-slow-queries query_cache_size = 4M query_cache_type = 2 # This is set in /home/y/var/mysql/my.cnf server-id = 999 ## End Tuned. ## SSL Support ## Wired off until fixed (not critical to Finance) #ssl-ca=/home/y/etc/mysql/cacert.pem #ssl-cert=/home/y/etc/mysql/server-cert.pem #ssl-key=/home/y/etc/mysql/server-key.pem [myisamchk] set-variable = key_buffer=200M set-variable = sort_buffer=100M set-variable = read_buffer=10M set-variable = write_buffer=10M # In case things tacked on... [mysqld] server-id = 126
[8 Nov 2005 11:57]
Valeriy Kravchuk
Thank you for the additional information. I will check for any possible fixes in a newer versions. You may upload your SHOW PROCESSLIST results as a private file using the File tab. It will be visible to MySQL developers only. Please, try.
[8 Nov 2005 17:32]
Rick James
can you make the last submission private?
[8 Nov 2005 17:44]
Rick James
I have dozens more, but most are longer than 200K. While I complain about most processes being in "closing" or "opening", sometimes there is a large number of "Lock" cases. But note that in most cases, the "time" is many seconds even though very few of the commands should take even one second. Actions on CapIQ tables is very long because its tables are very large.
[8 Nov 2005 18:15]
Valeriy Kravchuk
Thank you for the additional information. All files you submitted are private now. Let me analyze this for some time.
[13 Nov 2005 13:05]
Valeriy Kravchuk
Thank you for the additional information. I have a couple of questions and one suggestion for you. What version of FreeBSD do you use (I had to ask it earlier...)? Have you compiled 4.0.23 yourself or used MySQL supplied binaries? And, please, try to upgrade to 4.0.26. I was no able to find a change that may influent your case directly, but the fix of bug #10600 in 4.0.26 makes this upgrade desirable, I think.
[17 Nov 2005 18:01]
Rick James
This morning's hiccup came at the end of a long (3 hour) InnoDB DELETE: Shortly before the huccup, PROCESSLIST had this for the dying item: 25132825 yahoo 216.109.124.67:46680 CapIQ Query 13859 end4 DELETE FROM FinancialData \n USING FinancialData,FinancialPeriod__\n WHERE \n FinancialData.financialPeriodID\n =\n FinancialPeriod__.financialPeriodID There was also a LOAD DATA running thru the time. Longest running processes: SUMMARY of processes running on Master: Seconds ID (kill) Host Action Table(s) ------- --------- ----------- ------- -------- 14084 25132825 proc20..dcn end4 FROM FinancialData 290 25214036 biz11..scd closing REPLACE INTO AutoReport 290 25214040 biz10..scd closing REPLACE INTO AutoReport 290 25195865 proc12..scd closing FROM Headlines H, S2H X 290 25214050 proc20..dcn INTO TABLE CapIQ.FinancialPeriod 290 25195188 proc21..dcn statisti FROM NewsFlash 290 25196620 proc13..scd closing REPLACE INTO IDX.HeadlineTransaction 290 25212698 proc20..dcn Copying FROM MultexDocToTicker,MultexReports 290 25212404 db26..dcn Updating UPDATE Data 290 25212879 proc13..scd closing REPLACE INTO IDX.HeadlineTransaction 289 25195892 proc14..scd statisti FROM NewsFlash 289 25195934 proc10..scd update REPLACE INTO IDX.HeadlineTransaction 289 25102916 biz23..dcn update into BizRedirects values (1132247198, 1, 'fpc 288 25214052 mbl-gtw2.oa. closing REPLACE INTO AutoReport 288 13463095 proc10..scd Locked LOCKED FROM Data 288 24806527 db26..dcn statisti FROM Data 288 24806359 db26..dcn statisti FROM Data 288 25214051 biz26..dcn update INTO NotFound (url,ref,hostname) 288 25214053 mbl-gtw4.oa. update REPLACE INTO AutoReport 288 25214054 biz12..scd Locked LOCKED INTO NotFound (url,ref,hostname) 288 25214063 proc21..dcn Sending FROM ArticleHtmlFiles 287 24806642 db26..dcn closing FROM Data 287 25214064 biz20..dcn Locked LOCKED REPLACE INTO AutoReport 285 24806549 db26..dcn statisti FROM Data 285 25214066 biz13..scd Locked LOCKED INTO NotFound (url,ref,hostname) 283 25135189 biz30..mud Locked LOCKED into BizRedirects values (1132247201, 283 25214068 feed5.news.d Locked LOCKED REPLACE INTO AutoReport 283 25214074 proc12..scd closing FROM Locks 283 25214076 proc14..scd Locked LOCKED REPLACE INTO AutoReport 282 25083073 biz33..mud Locked LOCKED into BizRedirects values (1132247205, 282 25214077 biz33..mud Locked LOCKED INTO NotFound (url,ref,hostname) 281 25210240 proc12..scd closing INTO Headlines 281 25214078 biz32..mud closing FROM FeedInfo ... 267 25213363 tempest.corp Opening INTO AdminAccess.UsageLog ( 267 25214303 biz12..scd Opening REPLACE INTO AutoReport 266 13818888 beta3..scd Opening from tstamp 266 25214304 biz21..dcn Opening REPLACE INTO AutoReport 266 21373849 ichart11..dc Opening replace into UpdatedSymbols (Symbol, Source) 266 25185209 biz10..scd Opening into BizRedirects values (1132247220, 10, 'se 265 25212788 wire3.nuq Opening INTO FeedData.RawData 265 25214305 biz12..scd Opening FROM FeedInfo ... Locks: Database Table In_use Name_locked --------------- ---------------- -- --- Finance Locks 175 0 DbDist AutoReport 16 0 IDX NewsFlash 2 0 CapIQ FinancialPeriod 1 0 Log NotFound 11 0 DbDist Data 12 0 IDX S2H 1 0 IDX Dist 1 0 Reports MultexReports 1 0 IDX Headlines 2 0 IDX HeadlineTransaction 3 0 Yoda Data 3 0 IDX Dirty 1 0 CapIQ FinancialPeriod__ 1 0 IDX ArticleHtmlFiles 1 0 FeedData Item 42 0 Finance BizRedirects 6 0 Reports MultexDocToTicker 1 0 FeedData RawData 1 0 IDX Nab 3 0 Chart UpdatedSymbols 1 0 IDX FeedInfo 6 0 MySQL_Admin Heartbeat 3 0 CapIQ FinancialData 1 0 (738 others) 0 0 [I have not upgraded server yet -- not something to do casually in a 24/7 master.]
[18 Nov 2005 16:59]
Valeriy Kravchuk
Thank you for the additional information. I am not able to analyze all that really soon, but I still had not got the answer to the simple question in the previous comment: What version of FreeBSD do you use? I want to figure out are there any possible problems with threads implementation...
[18 Nov 2005 21:33]
Rick James
FreeBSD tempest.corp.yahoo.com 4.8-YAHOO-20030628 FreeBSD 4.8-YAHOO-20030628 #1: Mon Mar 29 16:26:49 PST 2004 i386 Note: linuxthreads is used. (Jeremy Cole jcole@yahoo-inc.com may be able to explain further.)
[15 Dec 2005 17:22]
Rick James
Does "end4" give you any clues?... SUMMARY of processes running on Master: Seconds ID (kill) Host Action Table(s) ------- --------- ----------- ------- -------- 14257 7665799 proc21..dcn end4 FROM FinancialData 338 1151588 proc10..scd closing UPDATE Data 338 7300405 db26..dcn closing FROM Data 338 7750843 proc20..dcn closing REPLACE INTO Data 338 7751030 proc21..dcn INTO TABLE CapIQ.FinancialPeriod 338 7748066 db26..dcn closing UPDATE Data ...
[16 Dec 2005 19:03]
Rick James
Or maybe Reopen: SUMMARY of processes running on Master: Seconds ID (kill) Host Action Table(s) ------- --------- ----------- ------- -------- 309 8189088 proc10..scd Opening FROM FeaturedArticle.Schedule S, 309 8189178 proc20..dcn INTO TABLE CapIQ.FinancialPeriod 271 8189310 proc20..dcn init FROM Dist 251 8189440 biz20..dcn closing INTO NotFound (url,ref,hostname) 251 8189441 biz31..mud closing INTO NotFound (url,ref,hostname) 249 8189450 mbl-gtw1.oa. closing REPLACE INTO AutoReport 249 8189451 proc21..dcn closing REPLACE INTO AutoReport 249 8189453 p14.oa.scd Reopen t REPLACE INTO AutoReport 248 8189457 biz22..dcn Reopen t INTO NotFound (url,ref,hostname) 246 8189458 biz13..scd closing FROM FeedInfo 246 8189459 biz26..dcn closing INTO NotFound (url,ref,hostname) Types: 346 Active 2 Binlog Dump 2 Connect 31 Sleep Infos: 1 34 Opening table 72 Opening tables 2 Reopen tables 1 Waiting for tables 191 closing tables 1 init 41 logging slow query 9 undef 3 update 8189453 dbdist_report@p1 Query 249 Reopen table REPLACE INTO AutoReport SET Hostname = 'p14.oa.scd.yahoo.com', Pid = 43736, Frequency = '60', ClientVersion = '2.38', TargetGroup = 'mobile-news', PingTime = 1134750324, CursorId = '171022565', CursorInsertTime = '1134750247', CursorUpdateTime = 1134750253 CREATE TABLE `AutoReport` ( `TargetGroup` varchar(40) NOT NULL default '', `Hostname` varchar(40) NOT NULL default '', `Pid` int(10) unsigned NOT NULL default '0', `PingTime` int(10) unsigned NOT NULL default '0', `Frequency` int(10) unsigned NOT NULL default '0', `CursorId` int(10) unsigned NOT NULL default '0', `CursorInsertTime` int(10) unsigned NOT NULL default '0', `CursorUpdateTime` int(10) unsigned NOT NULL default '0', `ClientVersion` varchar(10) NOT NULL default '', `CursorTime` int(10) unsigned NOT NULL default '0', UNIQUE KEY `TargetGroup` (`TargetGroup`,`Hostname`), KEY `Hostname` (`Hostname`) ) TYPE=MyISAM;
[22 Dec 2005 18:16]
Valeriy Kravchuk
According to the manual (http://dev.mysql.com/doc/refman/5.0/en/show-processlist.html, for example): "Reopen table The thread got a lock for the table, but noticed after getting the lock that the underlying table structure changed. It has freed the lock, closed the table, and is trying to reopen it." So, is it possible that table was altered (ALTER TABLE ...) during that REPLACE statement execution? As for 'end4' state, I have no clue yet. There is no such state in current sources: [openxs@Fedora sql]$ grep -n Reopen *.cc slave.cc:4783: // The master has switched to a new log file; Reopen the old log file sql_base.cc:1386:** Reopen an table because the definition has changed. The date file for the sql_base.cc:1501: Reopen all tables with closed data files sql_base.cc:1660: thd->proc_info="Reopen tables"; [openxs@Fedora sql]$ grep -n end4 *.cc [openxs@Fedora sql]$
[22 Dec 2005 18:59]
Rick James
It is extremely unlikely that either of those two tables was being ALTERed at that time. Perhaps this SHOW TABLE STATUS 'proves' that it was not happening on 12/20 to AutoReport: Name AutoReport Type MyISAM Row_format Dynamic Rows 252 Avg_row_length 73 Data_length 18512 Max_data_length 4294967295 Index_length 21504 Data_free 0 Auto_increment Create_time 2004-11-05 00:45:01 Update_time 2005-12-22 10:26:46 Check_time 2005-12-18 03:25:27 Create_options Comment Or does CHECK count? (I doubt if a CHECK was done at that time of day.) The PROCESSLIST with "Reopen"s was taken about 2005-12-16 8:32 AM (server time is Pacific). Binlog was not rolling over at that time: ... -rw-rw---- 1 mysql users 1073742146 Dec 16 03:16 binary-log.2146 -rw-rw---- 1 mysql users 1073742078 Dec 16 12:42 binary-log.2147 ... TRUNCATE TABLE NotFound is done on Mondays (only). But the max_connections problem happens about twice a day.
[25 Dec 2005 7:24]
Valeriy Kravchuk
Is it possible that mysqldump is performed for the databases during that long-running operations with big tables? Please, provide OS-level disk I/O statistics for your server after that long-running operations? I mean, the results of the proper iostat and/or gstat commands.
[27 Dec 2005 4:49]
Rick James
The system was upgraded to 4.0.26 29 days ago (about Nov 27). Problem continues to occur. mysqldump is rarely used (against the master), and not likely to have been used at when the spikes occurred. Am sending disk/cpu/mem/etc stats around the several spikes separately.
[26 Jan 2006 11:21]
Valeriy Kravchuk
Thank you for the additional information. I can neither repeat the load needed, nor explain the exact reasons for the results you got. So, I'll put this bug report back to "Open" status for somebody else to look at it. I can recommend you to upgrade to MySQL 4.1.18, as soon as it will be released, and try to use it. But I can not point out any particular reason for this problem to disappear.
[4 Sep 2006 23:24]
MySQL Verification Team
Please test with the 4.1 latest release according how Valeriy asked before. http://dev.mysql.com/downloads/mysql/4.1.html Thanks in advance.
[5 Oct 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".