Bug #31430 Cacheability of subqueries is unclear
Submitted: 6 Oct 2007 17:27 Modified: 14 May 2012 17:31
Reporter: Baron Schwartz (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: qc

[6 Oct 2007 17:27] Baron Schwartz
Description:
According to http://dev.mysql.com/doc/refman/5.0/en/explain.html, 

"Cacheability of subqueries is subject to the restrictions detailed in Section 6.5.4.1, “How the Query Cache Operates”. For example, referring to user variables makes a subquery uncacheable."

This makes it sound like subqueries use the MySQL query cache, which they don't.  But they do use some kind of cache, which you can see by EXPLAIN EXTENDED.

They are also not subject to all of the same restrictions, as I show in the 'how to repeat' section.

How to repeat:
According the the restrictions on the query cache, the following shouldn't be cacheable, so the second row should say UNCACHEABLE SUBQUERY:

mysql> explain select film_id, (select now() from sakila.film limit 1) from sakila.film limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: film
         type: index
possible_keys: NULL
          key: idx_fk_language_id
      key_len: 1
          ref: NULL
         rows: 951
        Extra: Using index
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: film
         type: index
possible_keys: NULL
          key: idx_fk_language_id
      key_len: 1
          ref: NULL
         rows: 951
        Extra: Using index
2 rows in set (0.00 sec)

There is some kind of cache, but it's not the query cache.  You can see it with EXPLAIN EXTENDED:

mysql> explain extended select film_id from sakila.film where film_id in(select film_id from sakila.film_actor);
... omitted ...

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `sakila`.`film`.`film_id` AS `film_id` from `sakila`.`film` where <in_optimizer>(`sakila`.`film`.`film_id`,<exists>(<index_lookup>(<cache>(`sakila`.`film`.`film_id`) in film_actor on idx_fk_film_id)))
1 row in set (0.00 sec)

Notice the <cache>(...) in the output.  What is this cache, and how does it work?

Suggested fix:
I would like to know what makes a subquery uncacheable, what kind of cache subqueries use, and how it works.
[6 Oct 2007 21:14] Peter Laursen
This example is not a subquery in the SQL meaning of it, I think.
Call it a 'nested' query ...

refer to: http://dev.mysql.com/doc/refman/5.0/en/subqueries.html

Anyway .. all sorts of restrictions with the Query cache and 'nested queries', subqueries and subselects (WHERE .. IN) should be documented - or fixed!
[7 Oct 2007 13:54] Baron Schwartz
Just a quick additional comment, I got a chance to look at the source for 5.0.45 today, and found this:

./libmysqld/item.cc:6165:  str->append(STRING_WITH_LEN("<cache>("));
./sql/item.cc:6165:  str->append(STRING_WITH_LEN("<cache>("));

In both cases, the source says

void Item_cache::print(String *str)
{
  str->append(STRING_WITH_LEN("<cache>("));
  if (example)
    example->print(str);
  else
    Item::print(str);
  str->append(')');
}

Item_cache isn't documented that I can see, but it looks like a local, temporary cache a single query can use to optimize repeatedly evaluating an Item (in the MySQL internal sense) with the same parameters.  There are subclasses for integers, rows, etc.  There's a mention of this on http://forge.mysql.com/wiki/MySQL_Internals_Algorithms#The_Item_Class, but no documentation.

But my understanding may be way off on this.
[8 Oct 2007 22:21] MySQL Verification Team
Thank you for the bug report.

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `film`.`film_id` AS `film_id` from `sakila`.`film` where <in_optimizer>(`film`.`film_id`,<exists>(<index_lookup>(<cache>(`film`.`
film_id`) in film_actor on idx_fk_film_id)))
1 row in set (0.00 sec)

mysql> show variables like "%version%";
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| protocol_version        | 10                  |
| version                 | 5.0.52-nt           |
| version_comment         | Source distribution |
| version_compile_machine | unknown             |
| version_compile_os      | Win64               |
+-------------------------+---------------------+
5 rows in set (0.00 sec)

mysql>
[14 May 2012 17:31] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

It was incorrect for http://dev.mysql.com/doc/refman/5.5/en/explain-output.html to imply some sort of relationship between subquery caching and the query cache. The page now says:

"
Cacheability of subqueries differs from caching of query results in the query cache (which is described in Section 8.9.3.1, “How the Query Cache Operates”). Subquery caching occurs during query execution, whereas the query cache is used to store results only after query execution finishes.
"

<cache> in EXPLAIN EXTENDED output indicates when a value during query execution is cached in memory. See:
http://dev.mysql.com/doc/refman/5.5/en/explain-extended.html