MySQL Bugs: #48937: Memory leak from mysql_insert during partition pruning test

Bug #48937	Memory leak from mysql_insert during partition pruning test
Submitted:	20 Nov 2009 9:44	Modified:	27 Feb 2010 9:53
Reporter:	John Embretsen
Status:	Can't repeat
Category:	MySQL Server: Partitions	Severity:	S3 (Non-critical)
Version:	5.6.0-beta	OS:	Linux (32-bit)
Assigned to:	Mattias Jonsson	CPU Architecture:	Any
Tags:	valgrind

Description:
Valgrind reports a possible memory leak when running the Random Query Generator (RQG) test "rqg_partn_pruning" with valgrind (Linux x86, 32-bit):

==11966== 18,912 bytes in 2 blocks are possibly lost in loss record 25 of 30
==11966==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==11966==    by 0x85A85F8: my_malloc (my_malloc.c:34)
==11966==    by 0x85A88DD: my_realloc (my_realloc.c:44)
==11966==    by 0x854DDD1: mi_alloc_rec_buff (mi_open.c:728)
==11966==    by 0x854D98F: mi_open (mi_open.c:643)
==11966==    by 0x854AFD3: ha_myisam::open(char const*, int, unsigned) (ha_myisam.cc:699)
==11966==    by 0x8378167: handler::ha_open(TABLE*, char const*, int, int) (handler.cc:2133)
==11966==    by 0x8384671: ha_partition::open(char const*, int, unsigned) (ha_partition.cc:2561)
==11966==    by 0x8378167: handler::ha_open(TABLE*, char const*, int, int) (handler.cc:2133)
==11966==    by 0x82B14DD: open_table_from_share(THD*, TABLE_SHARE*, char const*, unsigned, unsigned, unsigned, TABLE*, bool) (table.cc:1886)
==11966==    by 0x82A571C: open_unireg_entry(THD*, TABLE*, TABLE_LIST*, char const*, char*, unsigned, st_mem_root*, unsigned) (sql_base.cc:3921)
==11966==    by 0x82A7E75: open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned) (sql_base.cc:2923)
==11966==    by 0x82A8C97: open_tables(THD*, TABLE_LIST**, unsigned*, unsigned) (sql_base.cc:4588)
==11966==    by 0x82A9461: open_and_lock_tables_derived(THD*, TABLE_LIST*, bool) (sql_base.cc:4994)
==11966==    by 0x82607B7: open_and_lock_tables(THD*, TABLE_LIST*) (mysql_priv.h:1499)
==11966==    by 0x82EB087: mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (sql_insert.cc:629)
==11966==

"Possibly lost" means: 'your program is leaking memory, unless you're doing funny things with pointers.' [1]

Server: 
 Branch:   mysql-next-mr (bzr branch lp:~mysql/mysql-server/mysql-next-mr)
 Revision: alik@ibmvm-20091112031155-96uf6mnqc9js93v5

Test:
 Branch:   randgen (bzr branch lp:randgen)
 Revision: john.embretsen@sun.com-20091119154122-reqa2si83lfc4aqi

[1]: http://valgrind.org/docs/manual/faq.html#faq.deflost

How to repeat:
These steps assume a Linux platform with valgrind installed.

1. Obtain code from mysql-next-mr branch. 
   If needed, obtain the revision specified in this bug's description. 
   Build. 
   Refer to the basedir of the build as environment variable NEXTMR.

2. Obtain the test framework and grammars.
   bzr branch lp:randgen
   If needed, revert to the revision specified in the description.

3. Run the test:

runall.pl \
--mysqld=--loose-innodb-lock-wait-timeout=5 \
--mysqld=--table-lock-wait-timeout=5 \
--mysqld=--skip-safemalloc \
--gendata=conf/partition_pruning.zz \
--grammar=conf/partition_pruning.yy \
--mysqld=--innodb \
--threads=1 \
--queries=100000 \
--duration=300 \
--reporters=Deadlock,ErrorLog,Backtrace,Shutdown \
--basedir=$NEXTMR \
--vardir=$NEXTMR/mysql-test/var-randgen \
--mysqld=--log-output=file \
--valgrind

or:

perl ./pb2gentest.pl $NEXTMR $NEXTMR/mysql-test/var-randgen mysql-next-mr rqg_partn_pruning_valgrind

4. Be patient (this test may take a couple of hours to run, depending on the environment).

5. Observe the test and valgrind output on stdout and in the error log $NEXTMR/mysql-test/var-randgen/log/master.err
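
For step 5, one way to pull only the leak summaries out of the error log is a small grep wrapper. This is a rough sketch, not part of randgen: the function name list_loss_records is made up for illustration, and the pattern assumes that valgrind's summary lines in master.err look like the "bytes in N blocks are possibly lost" line quoted in the description.

```shell
# Hypothetical helper (not part of randgen or mysql-test): print the
# valgrind loss-record summary lines found in a log file.
list_loss_records() {
    # Matches lines such as:
    # ==11966== 18,912 bytes in 2 blocks are possibly lost in loss record 25 of 30
    grep -E '^==[0-9]+== +[0-9,]+ bytes in [0-9,]+ blocks are (definitely|indirectly|possibly) lost' "$1"
}

# Usage, assuming NEXTMR is set as described in step 1:
# list_loss_records "$NEXTMR/mysql-test/var-randgen/log/master.err"
```

The function prints nothing and returns a non-zero status when the log contains no matching summary lines, so it can also serve as a quick check after a manual run.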

Thank you for the report.

Where is runall.pl located now? I cannot find it in the latest update of mysql-test-extra-6.0.

runall.pl is currently in the default randgen repository on Launchpad, which you get by doing:

bzr branch lp:randgen

P.S. I just ran the test again on a 64-bit Linux platform and did not see this issue. I'll try to repeat it.

Cannot repeat on 64-bit Linux either.

I was able to repeat on 32-bit Linux only.

Build info:

CC='gcc'  CFLAGS='-g  -DSAFE_MUTEX -g -DHAVE_purify -Wall   -DUNIV_LINUX'  CXX='g++'  CXXFLAGS='-g  -DSAFE_MUTEX -g -DHAVE_purify -Wall   -fno-implicit-templates -fno-exceptions -fno-rtti'  LDFLAGS='-g -rdynamic '  ASFLAGS='-g'

./configure '--enable-thread-safe-client' '--enable-local-infile' '--with-pic' '--with-client-ldflags=-static' '--with-mysqld-ldflags=-static' '--with-zlib-dir=bundled' '--without-ndb-debug' '--with-big-tables' '--with-ssl' '--with-readline' '--with-embedded-server' '--with-archive-storage-engine' '--with-blackhole-storage-engine' '--with-csv-storage-engine' '--with-example-storage-engine' '--with-federated-storage-engine' '--with-partition' '--with-extra-charsets=all' '--with-innodb' '--with-ndbcluster' '--with-debug' '--with-libevent'

(flags and options taken from Pushbuild logs for the corresponding platform)

I couldn't repeat on Ubuntu 9.10 64-bit.

Thank you for the report.

Verified as described:

==20165== 151,296 bytes in 16 blocks are possibly lost in loss record 34 of 37
==20165==    at 0x40051F9: malloc (vg_replace_malloc.c:149)
==20165==    by 0x867BFAE: my_malloc (my_malloc.c:34)
==20165==    by 0x867C22D: my_realloc (my_realloc.c:44)
==20165==    by 0x858650C: mi_alloc_rec_buff (mi_open.c:728)
==20165==    by 0x858613B: mi_open (mi_open.c:643)
==20165==    by 0x8580248: ha_myisam::open(char const*, int, unsigned) (ha_myisam.cc:699)
==20165==    by 0x839CD11: handler::ha_open(TABLE*, char const*, int, int) (handler.cc:2133)
==20165==    by 0x83A7440: ha_partition::open(char const*, int, unsigned) (ha_partition.cc:2560)
==20165==    by 0x839CD11: handler::ha_open(TABLE*, char const*, int, int) (handler.cc:2133)
==20165==    by 0x82E9315: open_table_from_share(THD*, TABLE_SHARE*, char const*, unsigned, unsigned, unsigned, TABLE*, bool) (table.cc:1886)
==20165==    by 0x82DB45F: open_unireg_entry(THD*, TABLE*, TABLE_LIST*, char const*, char*, unsigned, st_mem_root*, unsigned) (sql_base.cc:3921)
==20165==    by 0x82D969A: open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned) (sql_base.cc:2922)
==20165==    by 0x82DC3D5: open_tables(THD*, TABLE_LIST**, unsigned*, unsigned) (sql_base.cc:4588)
==20165==    by 0x82DCD5C: open_and_lock_tables_derived(THD*, TABLE_LIST*, bool) (sql_base.cc:4994)
==20165==    by 0x829F365: open_and_lock_tables(THD*, TABLE_LIST*) (mysql_priv.h:1499)
==20165==    by 0x82999BB: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5110)
==20165== 

To repeat, compile the MySQL server with the build options provided above.

The partition_pruning test uses LIST COLUMNS, which is not available in 5.1.

I have tested with a later mysql-next-mr-bugteam (Fedora 12 32-bit, li-bing.song@sun.com-20100131133741-6jwl1n50aemwqx51) without succeeding in repeating this.

John or Sveta, could you please help me repeat this with a later mysql-next-mr (or possibly mysql-trunk)?

I had some issues running the test to completion with valgrind (it kept timing out), but once I had that taken care of I was not able to repeat with the mysql-next-mr-bugfixing branch as of yesterday.

Setting status to "Can't repeat".

I will re-open the bug if I see it again (others can do the same). Note, however, that this version of the test (valgrind, 32-bit Linux) is not run automatically in Pushbuild, so manual testing will be required to detect it.

64-bit Linux valgrind runs of this test are set up in Pushbuild for mysql-next-mr (weekly), and this issue has not been seen there as far as I know.