MySQL Bugs: #86705: Memory leak of Innodb

Bug #86705	Memory leak of Innodb
Submitted:	15 Jun 2017 2:25	Modified:	22 Jun 2017 12:00
Reporter:	qinglin zhang (OCA)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S2 (Serious)
Version:	5.5.45, 5.5.56	OS:	Linux (MySQL OOM because of memory leak)
Assigned to:		CPU Architecture:	Any
Tags:	Leak, memory leak

Description:
One of our MySQL instance OOM because of Memroy leak caused by btr_cur_update_in_place, We use valgrind to run mysqld and find out the causer.

1. valgrind --tool=massif --massif-out-file=/data/memory_analyze/massif.out ...

2. ms_print massif.out  | less

98.64% (378,084,075B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->62.58% (239,864,656B) 0x902251: mem_heap_create_block (mem0mem.c:333)
| ->43.35% (166,178,880B) 0x925A63: rec_get_offsets_func (mem0mem.ic:439)
| | ->43.35% (166,178,880B) 0x87D498: btr_cur_update_in_place (btr0cur.c:1873)
| | | ->43.35% (166,178,880B) 0x87E176: btr_cur_optimistic_update (btr0cur.c:2036)
| | |   ->21.68% (83,089,440B) 0x83F143: row_upd_clust_rec (row0upd.c:1949)
| | |   | ->21.68% (83,088,320B) 0x842DBB: row_upd_clust_step (row0upd.c:2225)
| | |   | | ->21.68% (83,088,320B) 0x84362C: row_upd_step (row0upd.c:2306)
| | |   | |   ->21.68% (83,088,320B) 0x82DCF7: row_update_for_mysql (row0mysql.c:1487)
| | |   | |     ->21.68% (83,088,320B) 0x81AEDE: ha_innobase::update_row(unsigned char const*, unsigned char*) (ha_innodb.cc:5552)
| | |   | |       ->21.68% (83,088,320B) 0x6864EC: handler::ha_update_row(unsigned char const*, unsigned char*) (handler.cc:5233)
| | |   | |         ->21.68% (83,088,320B) 0x604604: mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool, unsigned long long*, unsigned long long*) (sql_update.cc:728)
| | |   | |           ->21.68% (83,088,320B) 0x590AB8: mysql_execute_command(THD*) (sql_parse.cc:2824)
| | |   | |             ->21.68% (83,088,320B) 0x595276: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5791)
| | |   | |               ->21.68% (83,088,320B) 0x71E895: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3476)
| | |   | |                 ->21.68% (83,088,320B) 0x53168A: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1164)
| | |   | |                   ->21.68% (83,088,320B) 0x53379A: handle_slave_sql (slave.cc:2579)
| | |   | |                     ->21.68% (83,088,320B) 0x4E32DC3: start_thread (in /usr/lib64/libpthread-2.17.so)
| | |   | |                       ->21.68% (83,088,320B) 0x5D8574B: clone (in /usr/lib64/libc-2.17.so)

3. valgrind --log-file=/data/musazhang/memory_analyze/memory_leak.out --leak-check=full --show-leak-kinds=all --show-reachable=yes 

4. cat memory_leak.out | less

==36261== 193,760 bytes in 173 blocks are possibly lost in loss record 10 of 12
==36261==    at 0x4C27BE3: malloc (vg_replace_malloc.c:299)
==36261==    by 0x902252: mem_heap_create_block (mem0mem.c:333)
==36261==    by 0x925A64: mem_heap_create_func (mem0mem.ic:439)
==36261==    by 0x925A64: rec_get_offsets_func (rem0rec.c:585)
==36261==    by 0x87D499: btr_cur_update_in_place (btr0cur.c:1873)
==36261==    by 0x87E177: btr_cur_optimistic_update (btr0cur.c:2036)
==36261==    by 0x883637: btr_cur_pessimistic_update (btr0cur.c:2289)
==36261==    by 0x83F221: row_upd_clust_rec (row0upd.c:1981)
==36261==    by 0x842DBC: row_upd_clust_step (row0upd.c:2225)
==36261==    by 0x84362D: row_upd (row0upd.c:2306)
==36261==    by 0x84362D: row_upd_step (row0upd.c:2446)
==36261==    by 0x82DCF8: row_update_for_mysql (row0mysql.c:1487)
==36261==    by 0x81AEDF: ha_innobase::update_row(unsigned char const*, unsigned char*) (ha_innodb.cc:5552)
==36261==    by 0x6864ED: handler::ha_update_row(unsigned char const*, unsigned char*) (handler.cc:5233)
==36261== 
==36261== 777,652,960 bytes in 694,333 blocks are definitely lost in loss record 11 of 12
==36261==    at 0x4C27BE3: malloc (vg_replace_malloc.c:299)
==36261==    by 0x902252: mem_heap_create_block (mem0mem.c:333)
==36261==    by 0x925A64: mem_heap_create_func (mem0mem.ic:439)
==36261==    by 0x925A64: rec_get_offsets_func (rem0rec.c:585)
==36261==    by 0x87D499: btr_cur_update_in_place (btr0cur.c:1873)
==36261==    by 0x87E177: btr_cur_optimistic_update (btr0cur.c:2036)
==36261==    by 0x883637: btr_cur_pessimistic_update (btr0cur.c:2289)
==36261==    by 0x83F221: row_upd_clust_rec (row0upd.c:1981)
==36261==    by 0x842DBC: row_upd_clust_step (row0upd.c:2225)
==36261==    by 0x84362D: row_upd (row0upd.c:2306)
==36261==    by 0x84362D: row_upd_step (row0upd.c:2446)
==36261==    by 0x82DCF8: row_update_for_mysql (row0mysql.c:1487)
==36261==    by 0x81AEDF: ha_innobase::update_row(unsigned char const*, unsigned char*) (ha_innodb.cc:5552)
==36261==    by 0x6864ED: handler::ha_update_row(unsigned char const*, unsigned char*) (handler.cc:5233)
==36261== 
==36261== 777,667,520 bytes in 694,346 blocks are definitely lost in loss record 12 of 12
==36261==    at 0x4C27BE3: malloc (vg_replace_malloc.c:299)
==36261==    by 0x902252: mem_heap_create_block (mem0mem.c:333)
==36261==    by 0x925A64: mem_heap_create_func (mem0mem.ic:439)
==36261==    by 0x925A64: rec_get_offsets_func (rem0rec.c:585)
==36261==    by 0x87D499: btr_cur_update_in_place (btr0cur.c:1873)
==36261==    by 0x87E177: btr_cur_optimistic_update (btr0cur.c:2036)
==36261==    by 0x83F144: row_upd_clust_rec (row0upd.c:1949)
==36261==    by 0x842DBC: row_upd_clust_step (row0upd.c:2225)
==36261==    by 0x84362D: row_upd (row0upd.c:2306)
==36261==    by 0x84362D: row_upd_step (row0upd.c:2446)
==36261==    by 0x82DCF8: row_update_for_mysql (row0mysql.c:1487)
==36261==    by 0x81AEDF: ha_innobase::update_row(unsigned char const*, unsigned char*) (ha_innodb.cc:5552)
==36261==    by 0x6864ED: handler::ha_update_row(unsigned char const*, unsigned char*) (handler.cc:5233)
==36261==    by 0x604605: mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool, unsigned long long*, unsigned long long*) (sql_update.cc:728)
==36261== 
==36261== LEAK SUMMARY:
==36261==    definitely lost: 1,555,360,800 bytes in 1,388,715 blocks
==36261==    indirectly lost: 0 bytes in 0 blocks
==36261==      possibly lost: 375,776 bytes in 336 blocks
==36261==    still reachable: 2,614 bytes in 22 blocks
==36261==         suppressed: 0 bytes in 0 blocks

How to repeat:
Code Analyze and run mysqld with valgrind

Analyze:

btr_cur_update_in_place
{

...

1872   trx = thr_get_trx(thr);
1873   offsets = rec_get_offsets(rec, index, offsets, ULINT_UNDEFINED, &heap);
1874 #ifdef UNIV_DEBUG
1875   if (btr_cur_print_record_ops && thr) {
1876     btr_cur_trx_report(trx, index, "update ");
1877     rec_print_new(stderr, rec, offsets);
1878   }
1879 #endif /* UNIV_DEBUG */
1880   
1881   block = btr_cur_get_block(cursor);
1882   page_zip = buf_block_get_page_zip(block);
1883   
1884   /* Check that enough space is available on the compressed page. */
1885   if (page_zip
1886       && !btr_cur_update_alloc_zip(page_zip, block, index,
1887            rec_offs_size(offsets), FALSE, mtr)) {
1888     return(DB_ZIP_OVERFLOW);
1889   }
1890   
1891   /* Do lock checking and undo logging */
1892   err = btr_cur_upd_lock_and_undo(flags, cursor, update, cmpl_info,
1893           thr, mtr, &roll_ptr);
1894   if (UNIV_UNLIKELY(err != DB_SUCCESS)) {
1895     
1896     if (UNIV_LIKELY_NULL(heap)) {
1897       mem_heap_free(heap);
1898     }
1899     return(err);
1900   }

...

}

once btr_cur_update_alloc_zip failed, memory allocated by rec_get_offsets will not be freed until mysqld die away.

Hello Zhang,

Thank you for the report and feedback.
Could you please provide more details to reproduce this issue at my end(starting with conf file, would be nice if you can share a test script to trigger the issue). So far, I tried below things at my end with:

rm -rf 86705
scripts/mysql_install_db --basedir=$PWD --datadir=$PWD/86705
/export/umesh/server/binaries/valgrind-3.13/bin/valgrind --tool=massif --tool=memcheck bin/mysqld --basedir=$PWD --datadir=$PWD/86705 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3306 --log-error=$PWD/86705/log.err  --innodb_file_format=Barracuda --innodb_file_per_table=1

and provisioned some data using sysbench(and, yes altered table to include row_format=compressed):

sysbench/sysbench --test=sysbench/tests/db/oltp.lua --oltp-table-size=1000000  --mysql-table-engine=innodb --mysql-db=test --mysql-user=root --mysql-socket=/tmp/mysql_ushastry.sock  prepare
 sysbench/sysbench --test=sysbench/tests/db/oltp.lua --oltp-test-mode=complex --oltp-read-only=off --num-threads=50 --max-requests=0 --report-interval=2  --mysql-db=test --mysql-user=root --mysql-socket=/tmp/mysql_ushastry.sock run

After running it for a while observed that valgrind didn't complain of anything. 

regards,
Umesh

Matter of fact, we can find the problem by analyzing the function named btr_cur_update_in_place.

1) rec_get_offsets succeed and allocate heap for offsets.
2) We can see the problems that when btr_cur_update_alloc_zip failed, we don't free memory allocated in 1)

1885   if (page_zip
1886       && !btr_cur_update_alloc_zip(page_zip, block, index,
1887            rec_offs_size(offsets), FALSE, mtr)) {
1888     return(DB_ZIP_OVERFLOW);
1889   }

By the way our table is compressed table.

Thank you for the feedback.
Verifying based on my internal discussion.
Only 5.5.x affected.

regards,
umesh

TXSQL at CDB for Tencent Cloud has fixed this problem, the patch lie bellow:

1884   /* Check that enough space is available on the compressed page. */
1885   if (page_zip                                                                                                                                                    
1886       && !btr_cur_update_alloc_zip(page_zip, block, index,
1887            rec_offs_size(offsets), FALSE, mtr)) {
1888         if (UNIV_LIKELY_NULL(heap)) {
1889           mem_heap_free(heap);
1890       }
1888     return(DB_ZIP_OVERFLOW);
1889   }