MySQL Bugs: #43469: Auto Inc table instert scalability issues when ndb-cluster-connection-pool

Bug #43469	Auto Inc table instert scalability issues when ndb-cluster-connection-pool > 1
Submitted:	6 Mar 2009 23:30	Modified:	22 Apr 2009 15:02
Reporter:	Jonathan Miller	Email Updates:
Status:	Won't fix	Impact on me:	None
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S5 (Performance)
Version:	mysql-5.1-telco-6.2, 6.3,6.4	OS:	Linux
Assigned to:		CPU Architecture:	Any

Description:
ACRT shows an issue scaling with version mysql-5.1-telco-6.2, 6.2 and 6.3 when ndb-cluster-connection-pool setting > 1;

This same issue is not being seen in mysql-6.0-telco-6.* using the same test hosts, configurations and test settings.

Listed below are different results sets showing that as more users are added, the performance stays flat and does not scale up.

Please Note:
1) Processes are without think time (time between transactions) and there for each process counts about 10X. (i.e. 4 threads/process == 40 users)
2) Results measurements are reported in TPS (Transaction Per Second)

mysql-5.1-telco-6.2: (ndb-cluster-connection-pool > 1)

Results Date: 2009-02-25 16:28:38
Revision ID:  frazer@mysql.com-20090225124705-xehlu964zi46awak
Revision NO:  2859
Rev Date:      Wed 2009-02-25 12

Thread(s)           Value(s)
----------------------------
4         210.7799987793
8         208.05333328247
12         322.34333229065
16         347.639503479
20         348.52666854858
24         347.82000160217

mysql-5.1-telco-6.2: (ndb-cluster-connection-pool = 1)

esults Date: 2009-03-03 04:09:11
Revision ID:  jonas@mysql.com-20090302093957-rq6tkosypomfd1nw
Revision NO:  2865
Rev Date:      Mon 2009-03-02 10

Thread(s)           Value(s)
----------------------------
4         632.60200500488
8         1047.2266387939
12         1173.1438369751
16         1207.563331604
20         1208.7987976074
24         1166.6433448792

mysql-5.1-telco-6.3: (ndb-cluster-connection-pool > 1)

Results Date: 2009-02-27 05:10:33
Revision ID:  tomas.ulin@sun.com-20090226093700-b1ip3kyt6a9p52m8
Revision NO:  2888
Rev Date:      Thu 2009-02-26 10

Thread(s)           Value(s)
----------------------------
4         211.34999847412
8         197.34000205994
12         280.44666671753
16         285.90029335022
20         284.10000133514
24         286.8805437088

mysql-5.1-telco-6.3: (ndb-cluster-connection-pool = 1)

Previous Date: 2009-02-26 17:21:21
Revision ID:   tomas.ulin@sun.com-20090226093700-b1ip3kyt6a9p52m8

Thread(s)           Value(s)
----------------------------
4         706.51333618164
8         1154.5299987793
12         1300.5600204468
16         1353.3166732788
20         1346.3800048828
24         1318.5441703796

mysql-5.1-telco-6.4: (ndb-cluster-connection-pool = 4 )

Results Date: 2009-02-28 01:49:47
Revision ID:  jonas@mysql.com-20090227200133-x3xcsi3sgbhsc073
Revision NO:  2899
Rev Date:      Fri 2009-02-27 21

Thread(s)           Value(s)
----------------------------
4         202.85333633423
8         200.13333511353
12         265.94666481018
16         276.55333137512
20         276.59096050262
24         282.05245113373

mysql-5.1-telco-6.4: (ndb-cluster-connection-pool = 2 )

Results Date: 2009-02-28 13:06:11
Revision ID:  jonas@mysql.com-20090227200133-x3xcsi3sgbhsc073
Revision NO:  2899
Rev Date:      Fri 2009-02-27 21

Thread(s)           Value(s)
----------------------------
4         142.156665802
8         366.62666320801
12         520.02666473389
16         503.63999938965
20         499.79478645325
24         499.65634727478

mysql-5.1-telco-6.4: (ndb-cluster-connection-pool = 1 )

Results Date: 2009-02-28 19:11:07
Revision ID:  jonas@mysql.com-20090227200133-x3xcsi3sgbhsc073
Revision NO:  2899
Rev Date:      Fri 2009-02-27 21

Thread(s)           Value(s)
----------------------------
4         702.48999023438
8         1135.4100036621
12         1275.4978637695
16         1291.4766769409
20         1280.6209220886
24         1254.7166213989

mysql-6.0-telco-6.4: (ndb-cluster-connection-pool = 4 ) {2 Data Node}

Results Date: 2009-03-05 03:07:43
Revision ID:  jonathan.miller@sun.com-20090304161428-n3lgvsvjw7e2nr3p
Revision NO:  2848
Rev Date:      Wed 2009-03-04 17

Thread(s)           Value(s)
----------------------------
4         1084.8000183105469
8         1844.42333984375
12         2039.7266845703125
16         1887.479118347168
20         1839.1149978637695
24         1728.4747009277344

mysql-6.0-telco-6.4: (ndb-cluster-connection-pool = 4 ) {4 Data Node}

Results Date: 2009-03-06 08:18:15
Revision ID:  jonathan.miller@sun.com-20090304161428-n3lgvsvjw7e2nr3p
Revision NO:  2848
Rev Date:      Wed 2009-03-04 17

Thread(s)           Value(s)
----------------------------
4         987.4866638183594
8         1422.0279083251953
12         2173.6399841308594
16         2655.9200897216797
20         2829.800033569336
24         2758.749984741211

How to repeat:
ACRT/TPC-B/2-dn-perf.cnf

Suggested fix:
Unknown

s/ ACRT shows an issue scaling with version mysql-5.1-telco-6.2, 6.2 and 6.3 when
ndb-cluster-connection-pool setting > 1; / ACRT shows an issue scaling with version mysql-5.1-telco-6.2, "6.3" and "6.4" when ndb-cluster-connection-pool setting > 1;

Hi,

I have spent the last 2 day figuring out how I might narrow this down.

To try and narrow this issue down, I installed valgrind and callgrind with KCachegrind, rewrote the boot.sh to allow the use of a valgrind 64 bit max build (AMD and PENT) and added ability to do iterations to tpcbSQL .

Notes:

I have found that you don't need multiple connection/clients to see the scalability issues. You can see it with just one.

{one client run done for 2 seconds)

pool = 3
Total Test Time       2 seconds
Total Transaction Count =       6
Total Rollback Count =       0
Total Transaction Per Second (TPS) = 3.000000

pool = 1
Total Test Time       2 seconds
Total Transaction Count =      11
Total Rollback Count =       0
Total Transaction Per Second (TPS) = 5.500000

Function Call difference:

To ensure that I was comparing apples to apples, I instrumented tpcbSQL to allow iterations. I then reran testing using 5 iterations for each collection of profile data.

What sticks out like a sore thumb is the ndb_mgm_configuration_iterator.

pool = 1 (calls)
ndb_mgm_configuration_iterator::find   = 256
ndb_mgm_configuration_iterator::next  = 1531
ndb_mgm_configuration_iterator::enter = 1585

pool = 3 (calls)
ndb_mgm_configuration_iterator::find   = 768       (+300%)
ndb_mgm_configuration_iterator::next  = 4614     ( +301%)
ndb_mgm_configuration_iterator::enter = 4776     ( +301%)

I find it interesting that setting the pool to 3 cause 300% more in these function calls

[Disclaimer: I don't know enough about the code to yet understand why this happens]

So I believe this is where the problems lye, yet I don't know why we are doing all the extra calls.

Ideas?????

feedback, questions are encouraged.

-- 
Best wishes,
/Jeb

Workaround.

There are 2 workarounds for this issue.

1) ndb-connection-pool is set to 1. Setting the option to 1 keeps the scalability.

2) if you are needing additional scalability, you can set ndb-connection-pool higher keeping the following in mind.

The # of NDB API slots = = the # ndb-connection-pool && the number of clients/threads connection to MySQLD is equal to or less then the # ndb-connection pool.

Attached it the bare minimum (thus far) you need to see the issues.

When you run the mysqlslap command below, just by changing the -c you will see a major difference in the way MySQLD and NDBD demons are using the system.

1) Build and install latest 7.0 ( I use the BUILD/compile-ndb-autotest script.)
2) Setup cluster similar to the attached config.ini
3) Start MySQLD similar to attached my.cnf
4) Install the tpcb database, script attached (note: I cut out all but one select and one update out of TPCB.ExTrans)
/usr/bin/perl -w ./tpcb_load_db.pl --port=15001 --host=ndb13 --log=/space/cluster_rep_auto/logs/tpcb.log
--numA=100000

5) Call the test.sql (TPCB.ExTrans()) using mysqlslap and just change -c between 2 and 3 to see difference.

This calls a SP that starts trans, selects balance, add $3 to balance, and updates row, then commits. Table has PK. (No auto_inc)

./mysqlslap -c 2 --debug-check  -h NDB13 -P 15001 --query=./test.sql --socket=/tmp/mysql.sock --create-schema=TPCB -i 20000 Benchmark
./mysqlslap -c 3 --debug-check  -h NDB13 -P 15001 --query=./test.sql --socket=/tmp/mysql.sock --create-schema=TPCB -i 20000 Benchmark

min my.cnf

Attachment: my.cnf (application/octet-stream, text), 569 bytes.

load script with modified SP

Attachment: tpcb_load_db.pl (application/x-perl, text), 32.08 KiB.

ndb config

Attachment: config.ini (application/octet-stream, text), 338 bytes.

text file to call SP

Attachment: test.sql (application/octet-stream, text), 30 bytes.

So it turns out that the scalability problems is with the auto increment.

Issue:
* Table has auto inc
* {NCP} ndb-connection-pool > 1
* Number of clients inserting into the table > NCP

but it is not seen if:

1) NCP = 1

or

2) NCP > 1 &&  = = API SLOTS && >=  # of clients using MySQLD

The results below show the above. The recent run is that of #2 and has all set to a max of 24. The previous run was done with a setting of #1.

As you can see, the extra slots and MySQLD connection does provide better scalability. 

----------------------------------------------------------------
                   Performance Report
----------------------------------------------------------------
----------------------------------------------------------------
   Report for TPC-B run started on 2009-04-22 15:54:04
----------------------------------------------------------------
Clone used was mysql-5.1-telco-7.0 and config used was conf-2-dn-perf-6.4.cnf
----------------------------------------------------------------
Results measurements are reported in TPS
----------------------------------------------------------------
Results Date: 2009-04-22 15:54:04
Revision ID:  build@mysql.com-20090422131237-vo0kbk7kzwz17d95
Revision NO:  2876
Rev Date:      Wed 2009-04-22 15

Thread(s)           Value(s)
----------------------------
4         940.3466796875
8         1425.5633239746
12         1643.7499847412
16         1671.7252731323
20         1674.6633224487
24         1670.5512924194

Previous Date: 2009-04-18 19:3
9:25
Revision ID:   pekka@mysql.com-20090417190212-yifsmutw0fef59qc

Thread(s)           Value(s)
----------------------------
4         695.2407989502
8         1133.3599700928
12         1287.9699935913
16         1299.6866531372
20         1290.1900100708
24         1294.7066650391

Deviation percentage(s) for run.

Threads      %
-------   ------
4         35.2548183460184
8         25.7820429159744
12         27.623313657942
16         28.6252551026102
20         29.7997434003385
24         29.0293266829633

Add new test case files for use in repeating this issue. Please use these instead of the ones attached before.

1) simple cluster, no thrills other then 2 or 3 [API] connections
2) simple mysqld with ndb-connection-pool option = 1 (to start with)
3) copy all attached files to /tmp (or where ever seem best for you)

4) load db

Port Style
/usr/bin/perl -w ./tpcb_load_db.pl --port=15001 --host=ndb13 --log=/tmp/load.log  --numA=100000

Socket Style
/usr/bin/perl -w ./tpcb_load_db.pl --host=ndb13  -sock  --spath=/tmp/mysql.sock --log=/space/cluster_rep_auto/logs/tpcb.log --numA=100000

5) run tests

sh -x  ./loop.sh

Note: you should edit loop.sh with the settings for your host. If you want to use a socket, remove the port and add (--sock --sopath=(your path))

6)   Review /tmp/log.out (you might put some type of separating comment in the file.)

7) change connection pool to 2 and restart the mysqld

8) run test

./loop.sh

9) review results

New loader

Attachment: tpcb_load_db.pl (application/x-perl, text), 31.77 KiB.

New driver

Attachment: loop.sh (application/octet-stream, text), 311 bytes.

New test

Attachment: tpcb.pl (application/x-perl, text), 6.35 KiB.

This bug is not scheduled to be fixed at this time.