Bug #31163 MYSQL_OPT_READ_TIMEOUT and MYSQL_OPT_WRITE_TIMEOUT delay clarification needed
Submitted: 24 Sep 2007 2:25 Modified: 3 May 2008 23:12
Reporter: Sean Pringle Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.0 OS:Any
Assigned to: Paul DuBois CPU Architecture:Any

[24 Sep 2007 2:25] Sean Pringle
Description:
mysql_options(MYSQL_OPT_READ_TIMEOUT, <delay>) can result in three (3) times the connection failure delay expected based solely on reading the manual:

http://dev.mysql.com/doc/refman/5.0/en/mysql-options.html

libmysql/net.c, my_real_read() appears to perform roughly the following:

1. Attempt an initial socket read (fail).
2. Change socket blocking mode; retry once (fail).
3. Retry again based on net->retry_count (fail).
4. Abort.

net->read_timeout seconds delay is accumulated for each failure.  Similar behavior appears applicable to MYSQL_OPT_WRITE_TIMEOUT in net_real_write() referencing net->write_timeout.

How to repeat:
See attached testcase.

- Connect to remote server.
- Call mysql_options(MYSQL_OPT_READ_TIMEOUT, <delay>)
- Unplug network cable.
- Wait 3*delay seconds.
- See failure.

Suggested fix:
Retries are fine, obviously.

Perhaps just mention specifically in the manual that MYSQL_OPT_READ_TIMEOUT specifies the delay for *each* read attempt and that there will be more than one (1) attempt if the first try fails.

Same for MYSQL_OPT_WRITE_TIMEOUT?

Possibly also expose net->retry_count to mysql_options()  (couldn't see where this was done).
[24 Sep 2007 2:26] Sean Pringle
read_timeout test case

Attachment: testcase.zip (application/zip, text), 1.68 KiB.

[24 Sep 2007 2:26] Sean Pringle
read_timeout debug trace with timestamps

Attachment: trace (application/octet-stream, text), 12.12 KiB.

[24 Sep 2007 3:32] Valeriy Kravchuk
Thank you for a reasonable documentation request.
[30 Nov 2007 14:06] Magnus Blåudd
Trace file from running the attached testprogram showing the interesting part where the client first writes the query to the network(without noticing there is an error since the other side has silently disappeared). Then it reads 3 times as reported.

<mysql_real_connect
[enter] after unplugging the network cable.
>mysql_real_query
| enter: handle: 0x835bfb0
| query: Query = 'select now()'
| >mysql_send_query
| | enter: rpl_parse: 0  rpl_pivot: 1
| <mysql_send_query
| >cli_advanced_command
| | >net_clear
| | <net_clear
| | >net_write_command
| | | enter: length: 12
| | <net_write_command
| | >net_flush
| | | >vio_is_blocking
| | | | exit: 1
| | | <vio_is_blocking
| | | >net_real_write
| | | | >vio_write
| | | | | enter: sd: 3  buf: 0x8360468  size: 17
| | | | | exit: 17
| | | | <vio_write
| | | <net_real_write
| | <net_flush
| | exit: result: 0
| <cli_advanced_command
<mysql_real_query
>cli_read_query_result
| >vio_is_blocking
| | exit: 1
| <vio_is_blocking
| >vio_read_buff
| | enter: sd: 3  buf: 0x8360468  size: 4
| | >vio_read
| | | enter: sd: 3  buf: 0x835c448  size: 16384
| | | vio_error: Got error 11 during read
| | | exit: -1
| | <vio_read
| <vio_read_buff
| info: vio_read returned -1  errno: 11
| >vio_blocking
| | enter: set_blocking_mode: 1  old_mode: 1
| | exit: 0
| <vio_blocking
| >vio_read_buff
| | enter: sd: 3  buf: 0x8360468  size: 4
| | >vio_read
| | | enter: sd: 3  buf: 0x835c448  size: 16384
| | | vio_error: Got error 11 during read
| | | exit: -1
| | <vio_read
| <vio_read_buff
| info: vio_read returned -1  errno: 11
| >vio_read_buff
| | enter: sd: 3  buf: 0x8360468  size: 4
| | >vio_read
| | | enter: sd: 3  buf: 0x835c448  size: 16384
| | | vio_error: Got error 11 during read
| | | exit: -1
| | <vio_read
| <vio_read_buff
| info: vio_read returned -1  errno: 11
(null): read looped with error 11, aborting thread
| error: Couldn't read packet: remain: 4  errno: 11  length: -1
| >vio_blocking
| | enter: set_blocking_mode: 1  old_mode: 1
| | exit: 0
| <vio_blocking
| error: Wrong connection or packet. fd: TCP/IP (3)  len: 4294967295
| >end_server
| | info: Net: TCP/IP (3)
| | >vio_close
| | <vio_close
| | >_myfree
| | | enter: ptr: 0x835c448
| | <_myfree
| | >_myfree
| | | enter: ptr: 0x835c388
| | <_myfree
| | >net_end
| | | >_myfree
| | | | enter: ptr: 0x8360468
| | | <_myfree
| | <net_end
| | >free_old_query
| | | >init_alloc_root
| | | | enter: root: 0x835c248
| | | <init_alloc_root
| | <free_old_query
| <end_server
| >set_mysql_error
| | enter: error :2013 'Lost connection to MySQL server during query'
| <set_mysql_error
<cli_read_query_result
db_do_query failed: Lost connection to MySQL server during query [select now()]
:::: 30018 msec.
[30 Nov 2007 14:15] Magnus Blåudd
If the server is just shutdown(and thus there will be TCP packets signaling this), the error will be detected when writing the query to the server.

<mysql_real_connect
[enter] after unplugging the network cable.
>mysql_real_query
| enter: handle: 0x9a40fb0
| query: Query = 'select now()'
| >mysql_send_query
| | enter: rpl_parse: 0  rpl_pivot: 1
| <mysql_send_query
| >cli_advanced_command
| | >net_clear
| | | >vio_read_buff
| | | | enter: sd: 3  buf: 0x9a45468  size: 8192
| | | | >vio_read
| | | | | enter: sd: 3  buf: 0x9a45468  size: 8192
| | | | | exit: 0
| | | | <vio_read
| | | <vio_read_buff
| | | info: socket ready but only EOF to read - disconnected
| | <net_clear
| | >net_write_command
| | | enter: length: 12
| | <net_write_command
| | >net_flush
| | | >vio_is_blocking
| | | | exit: 1
| | | <vio_is_blocking
| | | >net_real_write
| | | <net_real_write
| | <net_flush
| | error: Can't send command to server. Error: 0
[30 Nov 2007 14:40] Magnus Blåudd
Conclusion.

1. When the server silently disappears because the network cable has been unplugged, the MySQL C API will retry the read three times. Thus causing the timeout to be multiplied by three.

This code can be seen in 'my_real_read', it appears that the comments in there indicates that the first read is always done unblocking, but that is only true when this code is compiled into the MySQL Server(and not always even there). When used in the C API it will do three blocking reads with timeout set to the value of read timeout.

The value of net->retry_count is set to 1 and it is the "for(i= 0; i < 2; i++)" that makes this happen three times. Inside the MySQL server the retry_count variable can be changed with "net_retry_count".

It might be an idea to not only document this but also _fix the problem_ so that the read is only done as many times as "net->retry_count" says. And make that variable accessible via 'mysql_option(..., MYSQL_OPT_RETRY_COUNT, ...)'

2. It looks like the write function does not have this effect. It seems to honour the net_retry_count value and only redo the write that many time.

But it is hard to test and get a nice .trace file. Can't event see how such a setup would look?
[3 May 2008 23:12] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Documented that the effective read timeout is MYSQL_OPT_READ_TIMEOUT * 3, and that the effective write timeout is MYSQL_OPT_WRITE_TIMEOUT * net_retry_count.