Bug #31901 read_rnd_buffer_size doesn't seem to be used as written in Reference Manual
Submitted: 29 Oct 2007 2:51 Modified: 13 Feb 2008 9:48
Reporter: Yasufumi Kinoshita Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.45 OS:Any
Assigned to: CPU Architecture:Any

[29 Oct 2007 2:51] Yasufumi Kinoshita
Description:
read_rnd_buffer_size doesn't seem to be used as written in Reference Manual

<manual>
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html

....
The original filesort algorithm works as follows:

   1.

      Read all rows according to key or by table scanning. Rows that do not match the WHERE clause are skipped.
   2.

      For each row, store a pair of values in a buffer (the sort key and the row pointer). The size of the buffer is the value of the sort_buffer_size system variable.
   3.

      When the buffer gets full, run a qsort (quicksort) on it and store the result in a temporary file. Save a pointer to the sorted block. (If all pairs fit into the sort buffer, no temporary file is created.)
   4.

      Repeat the preceding steps until all rows have been read.
   5.

      Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
   6.

      Repeat the following until there are fewer than MERGEBUFF2 (15) blocks left.
   7.

      On the last multi-merge, only the pointer to the row (the last part of the sort key) is written to a result file.
   8.

      Read the rows in sorted order by using the row pointers in the result file. To optimize this, we read in a big block of row pointers, sort them, and use them to read the rows in sorted order into a row buffer. The size of the buffer is the value of the read_rnd_buffer_size system variable. The code for this step is in the sql/records.cc source file.

......

The Manual says that read_rnd_buffer_size is used at step 8.

<5.0.45 source>

filesort()
 +-> find_all_keys() /* Step 1. 2. 3. 4. */
 |    +-> init_read_record()
 |    |    -> init_rr_cache()
 |    |        info->cache_records= (thd->variables.read_rnd_buffer_size / (...))
 |    +-> end_read_record()
 +-> merge_many_buff() ; merge_index() /* Step 5. 6. 7. 8.*/

But, read_rnd_buffer_size seems to be used in only step 2...

Hmm...
What should we think the read_rnd_buffer_size as?
Is it always really effective to increase the size of the read_rnd_buffer_size variable?

Regards

How to repeat:
N/A
[2 Nov 2007 7:15] Yasufumi Kinoshita
Sorry,
It's my mistake..

The main route is as following.
JOIN:exec()
	create_sort_index()
		filesort() /* Step 1. ~ 8.*/
		tab->read_first_record = join_init_read_record
	do_select()
		sub_select()
			(*join_tab->read_first_record)()
				init_read_record() /* init read_rnd_buffer */

--------------------------

But, I still wonder the sorting behavior.

If we have enough sort_buffer_size to sort data without merging, the read_rnd_buffer is never used.

So, in some cases, the performance without merging is worse than with merging.

Regards
[11 Dec 2007 11:00] Susanne Ebrecht
Please, can you tell us, where do you find this in our source code?
Looking through the code from MySQL Server 5.0.45, the code is nowhere.
[12 Jan 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[31 Jan 2008 12:23] Susanne Ebrecht
Yasufumi Kinoshita,

we still need to know, from which file/version of our source is the pasted code?
[13 Feb 2008 1:25] Yasufumi Kinoshita
Sorry for my late reply.

====(sql/records.cc: init_read_record())================
...
  if (select && my_b_inited(&select->file))
    tempfile= &select->file;
  else
    tempfile= table->sort.io_cache;
  if (tempfile && my_b_inited(tempfile)) // Test if ref-records was used
  {
...
      if (! init_rr_cache(thd, info))
...
========================================================

If [data need sort] < [sort_buffer_size], [tempfile= table->sort.io_cache ==0] at here.
So, it never calls init_rr_cache(), and never uses read_rnd_buffer.

Thus, if we uses read_rnd_buffer for sorting and we sets abundant sort_buffer, read_rnd_buffer comes not to be used and some sorting may become slow.
[13 Feb 2008 9:48] Susanne Ebrecht
Many thanks for writing a bug report.

The referred code starts at line 104 at sql/record.cc.
[13 Feb 2008 10:11] Konstantin Osipov
Changing the category.