Bug #97777 separate global variables (from hot variables) using linker script (ELF)
Submitted: 26 Nov 2019 0:44 Modified: 29 Nov 2019 1:55
Reporter: Daniel Black (OCA)
Status: Verified
Category:MySQL Server: Compiling Severity:S5 (Performance)
Version:5.7.28 OS:Linux (or any ELF platform)
Assigned to: CPU Architecture:Any

[26 Nov 2019 0:44] Daniel Black
Description:
While testing MySQL, I came across a performance problem when loading the global variable btr_search_enabled: the load of this memory location was a hot spot in the code.

It turns out that during the build btr_search_enabled was placed on the same cache line as ut_rnd_ulint_counter, and the system was using random numbers extensively. As a result, a critical path of InnoDB was slow: this global variable was frequently evicted from the CPU caches, and only one CPU at a time was granted access to read it.

This is an instance of a more general problem: SQL global variables are mostly read-only, and there is a proliferation of statistics counters and other static variables that could innocently be placed beside one of their C++ variables at the link stage.

To resolve this, a mechanism similar to the Linux kernel's is used: attributes mark the global variables that are read-mostly and put them in the same section. Because they share a section, they are protected from false sharing with other static variables.
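
As a rough illustration of the attribute approach described above, here is a minimal sketch, assuming GCC or Clang on an ELF target. The MY_GLOBAL definition shown is only a guess at what the attached patch does, and all demo_ names are hypothetical.

```cpp
#include <cassert>

// Sketch only: the real MY_GLOBAL definition is in the attached patch and may
// differ. The section name matches the one seen in the readelf output of this
// report.
#if defined(__GNUC__) && defined(__ELF__)
#define MY_GLOBAL __attribute__((section(".data.read_mostly")))
#else
#define MY_GLOBAL /* no-op where the section attribute is unavailable */
#endif

// Read-mostly settings (hypothetical names): written rarely, read everywhere.
bool          demo_search_enabled MY_GLOBAL = true;
unsigned long demo_ahi_parts      MY_GLOBAL = 8;

// A frequently written counter (hypothetical): left in the default .data.
unsigned long demo_hot_counter = 1;

// Hot-path reader that only loads the read-mostly variables.
bool demo_use_ahi() { return demo_search_enabled && demo_ahi_parts > 0; }
```

A caller can check it with `assert(demo_use_ahi());`. Note that GNU ld's default script collects `.data.*` input sections into `.data`, so keeping `.data.read_mostly` as a separate output section presumably still needs a linker script entry, which appears to be the role of the link options in the patch.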

The attached patch demonstrates this concept; however, extending it to every global variable would be prone to bit rot and merge conflicts.

As an extension to this, global variables that are copied to session variables could also be in their own section, so that when the global -> session copy starts, they are all on the same pre-fetched cache line.

How to repeat:
TPCCRunner in READ-COMMITTED

$ cat conf/example/mysql/loader.properties
driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost/tpcc
user=root
password=cutpcctool
threads=32
warehouses=1000

$ cat conf/example/mysql/slaveXX.properties 
name=SLAVE_ID
masterAddress=127.0.0.1
masterPort=27891

driver=com.mysql.jdbc.Driver
url=jdbc:mysql://127.0.0.1/tpcc
user=tpcc
password=cu
poolSize=12

userCount=10
warehouseCount=CNT_WID
startWarehouseID=START_WID

60 slaves and some decent hardware

Suggested fix:
proof of concept patch attached.
[26 Nov 2019 0:45] Daniel Black
define linker section for global variables

Attachment: 0001-define-linker-section-for-global-variables-POC.patch (text/x-patch), 2.94 KiB.

[26 Nov 2019 13:56] Sinisa Milivojevic
Hello Mr. Black,

Thank you for your effort aimed at improving the performance of our server.

I must admit that I like VERY much the ideas that you are proposing for our server. I cannot, however, verify it immediately, since I think that we need some additional information from you.

First of all, your proof of concept is very rudimentary. Can you send us one that contains all three groups of global variables and shows how you would group them? For a start, it would be enough to provide us with the title of each group and then send us a list of the variables that would go into each one. We need that in order to have it implemented faster.

Second, what kind of CPUs are we discussing here? Each CPU has its own way of organising caches. I used to work on some IBM mainframes, so if you had those in mind, besides ARM64, you could cite them. Have you considered the specific organisation for each distinct type of CPU?

Third, I understood from your opening comment that not all global variables would find themselves in one of the groups that are meant for better usage of the CPU cache. Can you help us by indicating how many should go into each group? It would be nice to list them, but if that is too much for you, then giving the number and / or size would help us.

Thanks in advance.
[26 Nov 2019 13:58] Sinisa Milivojevic
Hi,

As a further proof of concept, have you tried to measure the speed benefits of those changes?
[26 Nov 2019 23:48] Daniel Black
Sinisa, thanks for your interest.

I was testing on a POWER9, two sockets, 20 cpus / socket. 4 threads per cpu.

$ numactl --hardware
available: 2 nodes (0,8)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 0 size: 257742 MB
node 0 free: 489 MB
node 8 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
node 8 size: 130797 MB
node 8 free: 84377 MB
node distances:
node   0   8 
  0:  10  40 
  8:  40  10 

perf report -g --no-children, same workload during a REPEATABLE-READ run:

+    7.70%  mysqld   mysqld               [.] btr_cur_search_to_nth_level
+    4.75%  mysqld   mysqld               [.] buf_page_get_gen
+    4.03%  mysqld   mysqld               [.] rec_get_offsets_func
+    3.90%  mysqld   mysqld               [.] MVCC::view_open
+    3.39%  mysqld   mysqld               [.] PolicyMutex<TTASEventMutex<GenericPolicy> >::enter
+    2.56%  mysqld   mysqld               [.] MYSQLparse
+    2.28%  mysqld   mysqld               [.] mtr_t::release_block_at_savepoint
+    2.07%  mysqld   mysqld               [.] pfs_rw_lock_s_lock_func
+    2.03%  mysqld   mysqld               [.] cmp_dtuple_rec_with_match_low
+    1.92%  mysqld   mysqld               [.] row_search_mvcc
+    1.91%  mysqld   mysqld               [.] page_cur_search_with_match
+    1.04%  mysqld   mysqld               [.] pfs_rw_lock_s_unlock_func
+    0.99%  mysqld   mysqld               [.] pfs_rw_lock_s_lock_func
+    0.93%  mysqld   [kernel.kallsyms]    [k] _raw_spin_lock
+    0.90%  mysqld   libc-2.26.so         [.] __memcmp_power8

The generated asm for btr_cur_search_to_nth_level is:

  0.01 │        ld     r8,3464(r31)
       │                      cursor->low_match = low_match;
  0.05 │        std    r10,96(r25)
       │                      cursor->up_bytes = up_bytes;
  0.00 │        ld     r10,3456(r31)
       │                      if (UNIV_LIKELY(btr_search_enabled) && !index->disable_ahi) {
 24.08 │        lbz    r9,0(r9)
       │                      cursor->low_bytes = low_bytes;
  0.01 │        std    r7,104(r25)
       │                      cursor->up_match = up_match;
  0.00 │        std    r8,80(r25)
       │                      cursor->up_bytes = up_bytes;
  0.01 │        std    r10,88(r25)
       │                      if (UNIV_LIKELY(btr_search_enabled) && !index->disable_ahi) {
  0.00 │        cmpwi  cr7,r9,0
  0.00 │        ld     r9,48(r29)
  0.01 │      ↓ beq    cr7,10eb88a8 <btr_cur_search_to_nth_level(dict_index_t*, 2348

There is high contention on the load of btr_search_enabled, which is odd: as the C++ variable behind the SQL global variable innodb_adaptive_hash_index, it isn't changed, and being on such a hot path it should be in the L1 cache of every core.

Note: I probably added UNIV_LIKELY in a previous misguided attempt to solve this.

Looking at the address it was given:

$ readelf -a bin/mysqld | grep btr_search_enabled
  8522: 0000000011aa1b40     1 OBJECT  GLOBAL DEFAULT   24 btr_search_enabled
 17719: 0000000011aa1b40     1 OBJECT  GLOBAL DEFAULT   24 btr_search_enabled

Now looking at other variables within the same 256 byte area:

$ readelf -a bin/mysqld | grep 0000000011aa1b
  1312: 0000000011aa1be0   296 OBJECT  GLOBAL DEFAULT   24 fts_default_stopword
  8522: 0000000011aa1b40     1 OBJECT  GLOBAL DEFAULT   24 btr_search_enabled
  9580: 0000000011aa1b98    16 OBJECT  GLOBAL DEFAULT   24 fil_addr_null
 11434: 0000000011aa1b60     8 OBJECT  GLOBAL DEFAULT   24 zip_failure_threshold_pct
 12665: 0000000011aa1b70    40 OBJECT  GLOBAL DEFAULT   24 dot_ext
 13042: 0000000011aa1b30     8 OBJECT  GLOBAL DEFAULT   24 ut_rnd_ulint_counter
 13810: 0000000011aa1b48     8 OBJECT  GLOBAL DEFAULT   24 srv_checksum_algorithm
 18831: 0000000011aa1bb0    48 OBJECT  GLOBAL DEFAULT   24 fts_common_tables
 27713: 0000000011aa1b38     8 OBJECT  GLOBAL DEFAULT   24 btr_ahi_parts
 33183: 0000000011aa1b50     8 OBJECT  GLOBAL DEFAULT   24 zip_pad_max
  2386: 0000000011aa1b58     7 OBJECT  LOCAL  DEFAULT   24 _ZL9dict_ibfk
  5961: 0000000011aa1b68     8 OBJECT  LOCAL  DEFAULT   24 _ZL8eval_rnd
 10509: 0000000011aa1be0   296 OBJECT  GLOBAL DEFAULT   24 fts_default_stopword
 17719: 0000000011aa1b40     1 OBJECT  GLOBAL DEFAULT   24 btr_search_enabled
 18777: 0000000011aa1b98    16 OBJECT  GLOBAL DEFAULT   24 fil_addr_null
 20631: 0000000011aa1b60     8 OBJECT  GLOBAL DEFAULT   24 zip_failure_threshold_pct
 21862: 0000000011aa1b70    40 OBJECT  GLOBAL DEFAULT   24 dot_ext
 22239: 0000000011aa1b30     8 OBJECT  GLOBAL DEFAULT   24 ut_rnd_ulint_counter
 23007: 0000000011aa1b48     8 OBJECT  GLOBAL DEFAULT   24 srv_checksum_algorithm
 28028: 0000000011aa1bb0    48 OBJECT  GLOBAL DEFAULT   24 fts_common_tables
 36910: 0000000011aa1b38     8 OBJECT  GLOBAL DEFAULT   24 btr_ahi_parts
 42380: 0000000011aa1b50     8 OBJECT  GLOBAL DEFAULT   24 zip_pad_max

ut_rnd_ulint_counter is stored just 16 bytes away. On POWER the cache line size is 128 bytes; however, even on x86 with 64-byte cache lines, variables this close together could contend, which is why I've avoided a CPU-specific option.
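
The cache line arithmetic behind this can be sketched as follows. The two addresses are copied from the readelf output above; the helper names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Base address of the cache line containing addr.
// line_size must be a power of two.
constexpr std::uint64_t cache_line_base(std::uint64_t addr,
                                        std::uint64_t line_size) {
  return addr & ~(line_size - 1);
}

// True when both addresses fall into the same cache line.
constexpr bool same_cache_line(std::uint64_t a, std::uint64_t b,
                               std::uint64_t line_size) {
  return cache_line_base(a, line_size) == cache_line_base(b, line_size);
}

// Addresses taken from the readelf output in this report.
constexpr std::uint64_t kBtrSearchEnabled  = 0x11aa1b40;
constexpr std::uint64_t kUtRndUlintCounter = 0x11aa1b30;

// POWER, 128-byte lines: both live in the line starting at 0x11aa1b00.
static_assert(same_cache_line(kBtrSearchEnabled, kUtRndUlintCounter, 128),
              "both variables share one 128-byte line");
static_assert(cache_line_base(kBtrSearchEnabled, 128) == 0x11aa1b00,
              "line base");
```

With a 64-byte line size these two exact addresses would land in adjacent lines (0x11aa1b00 and 0x11aa1b40), but an x86 build lays the section out differently, so that says little about what happens there; variables 16 bytes apart can just as easily end up sharing a 64-byte line.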

MVCC::view_open and PolicyMutex<TTASEventMutex<GenericPolicy> >::enter, numbers 4 and 5 in the top-functions profile:

MVCC::view_open has code:

       │     _ZN14TTASEventMutexI13GenericPolicyE17spin_and_try_lockEjjPKcj():
       │                     os_rmb;
       │       lwsync
       │                     uint32_t        n_waits = 0;
  0.02 │       std    r9,160(r31)
  0.00 │       addis  r9,r2,-98
  0.00 │       addi   r9,r9,22944
  0.00 │       std    r9,144(r31)
  0.01 │988: ↓ bne    cr4,10d8a560 <MVCC::view_open(ReadView*&, a10
       │     ↓ b      10d8a5e0 <MVCC::view_open(ReadView*&, trx_t*)+0xa90>
       │     ut_rnd_gen_ulint():
       │             ut_rnd_ulint_counter = UT_RND1 * ut_rnd_ulint_counter + UT_RND2;
  0.06 │990:   addis  r7,r2,3
  0.02 │       addi   r7,r7,-14704
 77.86 │       ld     r8,0(r7)
  0.01 │       mulld  r8,r28,r8
  0.02 │       addis  r8,r8,1828

PolicyMutex<TTASEventMutex<GenericPolicy> >::enter has code:

       │     _ZN14TTASEventMutexI13GenericPolicyE11set_waitersEv():
       │                     m_waiters = 1;
  0.00 │       li     r24,1
  0.01 │170: ↓ bne    cr4,10cbc180 <PolicyMutex<TTASEventMutex<GenericPolicy> >::enter(unsigned int, 200
       │     ↓ b      10cbc230 <PolicyMutex<TTASEventMutex<GenericPolicy> >::enter(unsigned int, unsigned int, 2b0
       │       nop
       │       ori    r2,r2,0
       │     ut_rnd_gen_ulint():
       │             ut_rnd_ulint_counter = UT_RND1 * ut_rnd_ulint_counter + UT_RND2;
  0.07 │180:   addis  r8,r2,3
  0.02 │       addi   r8,r8,-14704
 69.35 │       ld     r10,0(r8)
  0.01 │       mulld  r10,r28,r10
  0.01 │       addis  r10,r10,1828
  0.01 │       addi   r10,r10,-14435

Both show high contention on the `ld` (load) of ut_rnd_ulint_counter (which has been nicely fixed in 8.0 by making it a thread local).

This contention on the load arises because a few instructions later there is a store:
       │             ut_rnd_ulint_counter = UT_RND1 * ut_rnd_ulint_counter + UT_RND2;
  0.05 │       std    r10,0(r8)

The store is not high in the CPU profile because the cache line has already been obtained by the load, but the store requires exclusive access to the line holding ut_rnd_ulint_counter (and btr_search_enabled and everything else with the same address & ~0x7F, i.e. in the same 128-byte cache line).
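
The thread-local approach mentioned for 8.0 can be pictured roughly like this. It is a sketch of the idea only, not the actual 8.0 code; the demo_ names, the seed and the constants are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants for a linear congruential step. The real
// UT_RND1 / UT_RND2 definitions are not shown in this report.
constexpr std::uint64_t kDemoMul = 151117737ULL;
constexpr std::uint64_t kDemoAdd = 119785373ULL;

// One counter per thread (hypothetical name and seed). The store in
// demo_rnd_next() touches only the calling thread's own copy, so it no longer
// takes a line shared with read-mostly globals away from the other CPUs.
thread_local std::uint64_t demo_rnd_counter = 1;

// Same load / multiply / add / store shape as the profiled code above,
// with unsigned wrap-around on overflow.
std::uint64_t demo_rnd_next() {
  demo_rnd_counter = kDemoMul * demo_rnd_counter + kDemoAdd;
  return demo_rnd_counter;
}
```

Each thread starts from its own seed of 1, so the first call in any thread returns 151117737 + 119785373 = 270903110.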

Note: the MVCC::view_open contention on trx_sys->mutex is worthy of a separate bug (coming soon, after x86 tests).
[26 Nov 2019 23:49] Daniel Black
After applying this patch (it seems cmake-2.8.12.2 on RHEL7 doesn't know about TARGET_LINK_OPTIONS; I'm fairly sure there's a more portable way to write it):

$ readelf -a sql/mysqld | grep btr_search_enabled
  1631: 000000001167e6e8     1 OBJECT  GLOBAL DEFAULT   16 btr_search_enabled
 18046: 000000001167e6e8     1 OBJECT  GLOBAL DEFAULT   16 btr_search_enabled

$ readelf -a sql/mysqld | grep 000000001167e6
  [16] .data.read_mostly PROGBITS         000000001167e6e0  0167e6e0
  1631: 000000001167e6e8     1 OBJECT  GLOBAL DEFAULT   16 btr_search_enabled
 15962: 000000001167e6e0     8 OBJECT  GLOBAL DEFAULT   16 btr_ahi_parts
    16: 000000001167e6e0     0 SECTION LOCAL  DEFAULT   16 
 18046: 000000001167e6e8     1 OBJECT  GLOBAL DEFAULT   16 btr_search_enabled
 37177: 000000001167e6e0     8 OBJECT  GLOBAL DEFAULT   16 btr_ahi_parts

$ readelf -S sql/mysqld | more
There are 40 section headers, starting at offset 0x1271c370:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         00000000100001c8  000001c8
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             00000000100001e4  000001e4
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000010000204  00000204
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .hash             HASH             0000000010000228  00000228
       0000000000041bac  0000000000000004   A       6     0     8
  [ 5] .gnu.hash         GNU_HASH         0000000010041dd8  00041dd8
       00000000000492a8  0000000000000000   A       6     0     8
  [ 6] .dynsym           DYNSYM           000000001008b080  0008b080
       00000000000ca590  0000000000000018   A       7     1     8
  [ 7] .dynstr           STRTAB           0000000010155610  00155610
       00000000001f46d8  0000000000000000   A       0     0     1
  [ 8] .gnu.version      VERSYM           0000000010349ce8  00349ce8
       0000000000010dcc  0000000000000002   A       6     0     2
  [ 9] .gnu.version_r    VERNEED          000000001035aab8  0035aab8
       0000000000000220  0000000000000000   A       7    12     8
  [10] .rela.dyn         RELA             000000001035acd8  0035acd8
       00000000000141f0  0000000000000018   A       6     0     8
  [11] .rela.plt         RELA             000000001036eec8  0036eec8
       00000000000031e0  0000000000000018  AI       6    25     8
  [12] .init             PROGBITS         00000000103720c0  003720c0
       000000000000005c  0000000000000000  AX       0     0     32
  [13] .text             PROGBITS         0000000010372120  00372120
       0000000000e3f0d0  0000000000000000  AX       0     0     32
  [14] .fini             PROGBITS         00000000111b11f0  011b11f0
       0000000000000024  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         00000000111b1220  011b1220
       00000000004cd4c0  0000000000000000   A       0     0     16
  [16] .data.read_mostly PROGBITS         000000001167e6e0  0167e6e0
       0000000000000020  0000000000000000  WA       0     0     8
  [17] .eh_frame_hdr     PROGBITS         000000001167e700  0167e700
       0000000000033b3c  0000000000000000   A       0     0     4
  [18] .eh_frame         PROGBITS         00000000116b223c  016b223c
       00000000001e6db0  0000000000000000   A       0     0     4
  [19] .gcc_except_table PROGBITS         0000000011898ff0  01898ff0
       00000000000410c5  0000000000000000   A       0     0     8
  [20] .init_array       INIT_ARRAY       00000000118ed438  018ed438
       0000000000000e60  0000000000000008  WA       0     0     8
  [21] .fini_array       FINI_ARRAY       00000000118ee298  018ee298
       0000000000000008  0000000000000008  WA       0     0     8
  [22] .data.rel.ro      PROGBITS         00000000118ee2a0  018ee2a0
       00000000000cf388  0000000000000000  WA       0     0     16
  [23] .dynamic          DYNAMIC          00000000119bd628  019bd628
       00000000000002d0  0000000000000010  WA       7     0     8
  [24] .got              PROGBITS         00000000119bd900  019bd900
       0000000000002610  0000000000000008  WA       0     0     256
  [25] .plt              NOBITS           00000000119c0000  019bff10
       00000000000010b0  0000000000000008  WA       0     0     8
  [26] .data             PROGBITS         00000000119c10b0  019c10b0
       00000000000a1428  0000000000000000  WA       0     0     16
  [27] .bss              NOBITS           0000000011a62500  01a624d8
       00000000000ca1a8  0000000000000000  WA       0     0     64
  [28] .comment          PROGBITS         0000000000000000  01a624d8
       0000000000000047  0000000000000001  MS       0     0     1
  [29] .debug_aranges    PROGBITS         0000000000000000  01a62520
       000000000004b9e0  0000000000000000           0     0     16
  [30] .debug_info       PROGBITS         0000000000000000  01aadf00
       00000000087e42f3  0000000000000000           0     0     1
  [31] .debug_abbrev     PROGBITS         0000000000000000  0a2921f3
       0000000000281115  0000000000000000           0     0     1
  [32] .debug_line       PROGBITS         0000000000000000  0a513308
       0000000001210da4  0000000000000000           0     0     1
  [33] .debug_str        PROGBITS         0000000000000000  0b7240ac
       00000000014cdf48  0000000000000001  MS       0     0     1
  [34] .debug_loc        PROGBITS         0000000000000000  0cbf1ff4
       00000000044c33c5  0000000000000000           0     0     1
  [35] .debug_ranges     PROGBITS         0000000000000000  110b53c0
       00000000012e3cc0  0000000000000000           0     0     16
  [36] .gnu.attributes   LOOS+0xffffff5   0000000000000000  12399080
       0000000000000010  0000000000000000           0     0     1
  [37] .symtab           SYMTAB           0000000000000000  12399090
       00000000001027b0  0000000000000018          38   9581     8
  [38] .strtab           STRTAB           0000000000000000  1249b840
       0000000000280998  0000000000000000           0     0     1
  [39] .shstrtab         STRTAB           0000000000000000  1271c1d8
       0000000000000194  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  p (processor specific)

So we end up with a .data.read_mostly section after .rodata.
[27 Nov 2019 0:18] Daniel Black
For variables that are both session and global, no special attention is needed, as they are all in the `typedef struct system_variables` and therefore in adjacent memory locations. There's a small chance of contention with a hot variable placed before or after global_system_variables, but for the most part it looks safe.

$ readelf -a sql/mysqld | grep global_system_variables
 17128: 0000000011a658c8   816 OBJECT  GLOBAL DEFAULT   27 global_system_variables
 31599: 0000000011a658c8   816 OBJECT  GLOBAL DEFAULT   27 global_system_variables

$ readelf -a sql/mysqld | grep  0000000011a658
 17128: 0000000011a658c8   816 OBJECT  GLOBAL DEFAULT   27 global_system_variables
 31599: 0000000011a658c8   816 OBJECT  GLOBAL DEFAULT   27 global_system_variables
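
To put a number on the exposure at the edges of the struct, here is a small sketch that uses the address and size of global_system_variables from the readelf output above and assumes 128-byte lines. LineSpan and line_span are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

struct LineSpan {
  std::uint64_t first_line;   // base address of the first cache line touched
  std::uint64_t last_line;    // base address of the last cache line touched
  std::uint64_t head_shared;  // bytes in the first line that precede the object
  std::uint64_t tail_shared;  // bytes in the last line that follow the object
};

// line_size must be a power of two and size must be non-zero.
constexpr LineSpan line_span(std::uint64_t addr, std::uint64_t size,
                             std::uint64_t line_size) {
  return LineSpan{
      addr & ~(line_size - 1),
      (addr + size - 1) & ~(line_size - 1),
      addr & (line_size - 1),
      (line_size - ((addr + size) & (line_size - 1))) & (line_size - 1)};
}

// global_system_variables: address 0x11a658c8, size 816 bytes (from readelf).
constexpr LineSpan kGlobalSysVars = line_span(0x11a658c8, 816, 128);

static_assert(kGlobalSysVars.first_line == 0x11a65880, "first line base");
static_assert(kGlobalSysVars.last_line == 0x11a65b80, "last line base");
```

For this build the struct covers seven 128-byte lines; 72 bytes of the first line and 8 bytes of the last belong to whatever the linker placed next to it, and only those bytes are where a hot neighbour could cause false sharing.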

The second attached patch (which extends the first) puts these global variables into the same section:

$ readelf -a sql/mysqld | grep 000000001167e
  [16] .data.read_mostly PROGBITS         000000001167e6e0  0167e6e0
  [17] .eh_frame_hdr     PROGBITS         000000001167ee00  0167ee00
  GNU_EH_FRAME   0x000000000167ee00 0x000000001167ee00 0x000000001167ee00
  1631: 000000001167e6e8     1 OBJECT  GLOBAL DEFAULT   16 btr_search_enabled
 12556: 000000001167e6f0   816 OBJECT  GLOBAL DEFAULT   16 max_system_variables
 15962: 000000001167e6e0     8 OBJECT  GLOBAL DEFAULT   16 btr_ahi_parts
 17128: 000000001167ea20   816 OBJECT  GLOBAL DEFAULT   16 global_system_variables
    16: 000000001167e6e0     0 SECTION LOCAL  DEFAULT   16 
    17: 000000001167ee00     0 SECTION LOCAL  DEFAULT   17 

So, starting from the initial patch, MY_GLOBAL just needs to be propagated to the other global variables and they will land in the .data.read_mostly section. Other variables will end up in the .data section, which is the default linker behaviour.
[27 Nov 2019 0:19] Daniel Black
extension to put global/session vars into the same section

Attachment: global_vars.patch (text/x-patch), 847 bytes.

[27 Nov 2019 6:20] Daniel Black
Extracting global variables:

git grep  MYSQL_SYSVAR_ | fgrep -v  .h:  | cut -f 2 -d , | grep  '[a-z]' | tee  /tmp/l1.txt | grep : |  cut -f 1 -d : | sort -u

edit displayed file list to create l2.txt (attached)

cleanup l1.txt
* strip down to variable names
* remove xpl:: 

sort /tmp/l[12].txt | sed -e 's/ //g' | uniq > /tmp/lall.txt

replace statics with assigned values:

 for a in $(< /tmp/lall.txt) ; do echo looking for $a; file=$(git grep -l -E "^[[:space:]]*(static)?[[:space:]]+[a-z*]+[[:space:]]+\\**$a[^a-zA-Z0-9_][^M][^Y][^_][^G]"  plugin/ storage/ sql |  head -n 1); [ -n "${file}" ] &&  sed -i -e "/^ *static[ \t]*[a-z]*\\**[ \t]*\\**$a[^a-zA-Z0-9_]/s:\($a[ \t]*\)=:\1MY_GLOBAL =:" $file && echo replaced in $file; git diff $file | cat; done

 for a in $(< /tmp/lall.txt) ; do echo looking for $a; file=$(git grep -E "^[[:space:]]*(static)?[[:space:]]*[a-z_*]+[[:space:]]+\\**$a[[:space:]]*=?[^a-zA-Z_][[:space:]]*[^M]?[^Y]?[^_]?[^G]?"  rapid/ plugin/ storage/ sql | egrep -v '\.(h|ic):' | fgrep -v MY_GLOBAL |  cut -f 1 -d : | head -n 1); [ -n "${file}" ] && echo attempting change on $file pattern && git grep $a $file  &&  sed -i -e "/^static[ \t]*[a-z][a-z_]*[ \t]*[^a-z_]$a[,;= \\t]/s:$a:$a MY_GLOBAL:" $file && echo replaced in $file; git diff $file | cat; done

Lots more hacking and manual editing. And check:

 for a in $(< /tmp/lall.txt) ; do  echo;  echo looking for $a; git grep "\\**$a[^a-zA-Z_]"  rapid/ plugin/ storage/ sql | grep MY_GLOBAL  ; done
[27 Nov 2019 6:20] Daniel Black
global variable list

Attachment: lall.txt (text/plain), 6.58 KiB.

[27 Nov 2019 6:21] Daniel Black
patch to mark all global variables with MY_GLOBAL

Attachment: globals.patch (text/x-patch), 65.54 KiB.

[27 Nov 2019 13:15] Sinisa Milivojevic
Hello Mr. Black,

Thank you so much for your very valuable contribution.

You leave me no option but to verify this very important set of ideas for our Development team.

Verified as a very important contribution to enhance the performance of our server.

Thanks again.
[28 Nov 2019 13:26] Sinisa Milivojevic
Thank you Mr. Black.
[29 Nov 2019 1:55] Daniel Black
-DMUTEXTYPE=event achieved only ~1320000 tpm.

5.7.28 compiled with -DMUTEXTYPE=sys

              timestamp          tpm      avg_rt      max_rt   avg_db_rt   max_db_rt
                average  1719652.00       59.36         819       59.35         

With patch and -DMUTEXTYPE=sys (about 5.3% higher tpm than the unpatched build above)

              timestamp          tpm      avg_rt      max_rt   avg_db_rt   max_db_rt
                average  1810618.80       61.51         838       61.50
[29 Nov 2019 13:51] Sinisa Milivojevic
Thank you.

I will copy your comment to our internal database.