Bug #59412 Unkown Sending data state during test
Submitted: 11 Jan 2011 8:09 Modified: 1 Mar 2011 9:43
Reporter: yuan chaohua Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:ndb-7.1.9 OS:Linux
Assigned to: CPU Architecture:Any
Tags: sending data
Triage: Triaged: D3 (Medium) / R6 (Needs Assessment) / E6 (Needs Assessment)

[11 Jan 2011 8:09] yuan chaohua
Description:
Hi,

During our current loading test of cluster on our application ,we got lots of unknown reason sending data state in the show process list command on any sql node. 

These queries are simple join or primary key select statements and normally executed very quick ( < 0.1s ) in any sql node.  But at sometime of our loading test , these queries cost more than 3s sometimes in sending data state.

Could someone tell me why these queries sometimes cost more than 3s in sending data. Or what variables are related to sending data.

The queries:

explain partitions SELECT a.qid, b.id, b.value1, b.Value2, b.Value3, b.Value4, b.Value5, b.goal_diff, b.matchmode, a.accept_date FROM char_quest as a  INNER JOIN quest_objective as b ON a.qid = b.qid  WHERE a.finish = 0 and b.type = 83 and a.cid = 103504 \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: p6
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 10
        Extra: Using where with pushed condition
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
   partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15
         type: ref
possible_keys: idx_qid
          key: idx_qid
      key_len: 4
          ref: sso_pc_dev_g.a.qid
         rows: 1
        Extra: Using where with pushed condition
2 rows in set (0.00 sec)

The config.ini and my.cnf are below:

# Options affecting ndbd processes on all data nodes:
[ndbd default]
NoOfReplicas=2    # Number of replicas
#DataMemory=3072M    # How much memory to allocate for data storage
#DataMemory=20480M
#DataMemory=18432M
DataMemory=4360M

IndexMemory=1024M   # How much memory to allocate for index storage
                  # For DataMemory and IndexMemory, we have used the
                  # default values. Since the "world" database takes up
                  # only about 500KB, this should be more than enough for
                  # this example Cluster setup.
StringMemory=25

ODirect=1

#MaxNoOfLocalScans=512
MaxNoOfLocalScans=80960

MaxNoOfTables=5120
MaxNoOfOrderedIndexes=5120
MaxNoOfUniqueHashIndexes=5120
MaxNoOfAttributes=12000
#MaxNoOfAttributes=24576
MaxNoOfTriggers=14336
MaxNoOfConcurrentOperations=5000000
# 1.1 * MaxNoOfConcurrentOperations
#MaxNoOfLocalOperations=5500000
#MaxAllocate=50M

LockPagesInMainMemory=1

MaxNoOfConcurrentTransactions=2048 # 10*

NoOfFragmentLogFiles=48

#### New Add ##############
DiskCheckpointSpeedInRestart=100M
FragmentLogFileSize=256M
#TimeBetweenLocalCheckpoints=20
TimeBetweenGlobalCheckpoints=4000
TimeBetweenEpochs=100
InitFragmentLogFiles=SPARSE
MemReportFrequency=30
BackupReportFrequency=10

### Watchdog 
#TimeBetweenWatchDogCheck =60000
TimeBetweenWatchdogCheckInitial=60000

### TransactionInactiveTimeout  - should be enabled in Production 
TransactionInactiveTimeout=60000
SharedGlobalMemory=20M          
LongMessageBuffer=128M 
BatchSizePerLocalScan=64
#############################

#InitFragmentLogFiles=FULL
RedoBuffer=32M
#

MaxNoOfConcurrentScans=500

#TransactionBufferMemory=10M
TimeBetweenLocalCheckpoints=25

DiskPageBufferMemory=256M
DiskCheckpointSpeed=100M

LogLevelStartup=15
LogLevelShutdown=15
LogLevelCheckpoint=8
LogLevelNodeRestart=15
LogLevelError=15

BackupWriteSize=1M
BackupDataBufferSize=16M
BackupLogBufferSize=4M
BackupMemory=20M

UndoIndexBuffer=64M
UndoDataBuffer=256M

StopOnError=0            

#NoOfDiskPagesToDiskAfterRestartTUP=40
#NoOfDiskPagesToDiskAfterRestartACC=20
#NoOfDiskPagesToDiskDuringRestartTUP=40
#NoOfDiskPagesToDiskDuringRestartACC=20

## modify at 6.28
MaxNoOfExecutionThreads=8

#TotalSendBufferMemory=20M

# New Add ##
HeartbeatIntervalDbDb=10000
HeartbeatIntervalDbApi=10000
TimeBetweenWatchDogCheck=15000
ArbitrationTimeout=15000              
TransactionDeadLockDetectionTimeOut=100000

#############################################################################################

# Management process options:
[ndb_mgmd]
id=1
hostname=10.192.140.24           # Hostname or IP address of management node
datadir=/usr/local/mysql/data  # Directory for management node log files

ArbitrationRank=1
ArbitrationDelay=0

# Options for data node "A":
[ndbd]
id=2                                # (one [ndbd] section per data node)
hostname=10.192.140.21
datadir=/usr/local/mysql/data   # Directory for this data node's data files
TotalSendBufferMemory=200M

# Options for data node "B":
[ndbd]
id=3
hostname=10.192.140.22
datadir=/usr/local/mysql/data   # Directory for this data node's data files
TotalSendBufferMemory=200M

[ndbd]
id=4                                # (one [ndbd] section per data node)
hostname=10.192.140.25
datadir=/home/mysql/data   # Directory for this data node's data files
TotalSendBufferMemory=200M

[ndbd]
id=5                                # (one [ndbd] section per data node)
hostname=10.192.140.24
datadir=/usr/local/mysql/data   # Directory for this data node's data files
TotalSendBufferMemory=200M

######################################################################################################

# SQL node options:
[MYSQLD DEFAULT]
BatchSize=512
BatchByteSize=1M
MaxScanBatchSize=16M

[mysqld]
#ArbitrationRank=0
id=15
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
#ArbitrationRank=0
id=16
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
#ArbitrationRank=0
id=17
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
#ArbitrationRank=0           
id=18
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
#ArbitrationRank=0
id=19
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=20
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=21
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=22
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=23
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=24
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
NodeId=25
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
NodeId=26
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
NodeId=27
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
NodeId=28
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
NodeId=29
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=38
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=39
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=40
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
id=41
#ArbitrationRank=0
#MaxScanBatchSize=16M
ArbitrationDelay=0

[mysqld]
#MaxScanBatchSize=16M
[mysqld]
#MaxScanBatchSize=16M
[mysqld]
#MaxScanBatchSize=16M
[mysqld]
#MaxScanBatchSize=16M
[mysqld]
#MaxScanBatchSize=16M
[mysqld]
#MaxScanBatchSize=16M

###########################################################################################

# TCP/IP options:
[tcp default]
#portnumber=1186   # This the default; however, you can use any port that is free
                  # for all the hosts in the cluster
                  # Note: It is recommended that you do not specify the port
                  # number at all and allow the default value to be used instead
SendBufferMemory=128M#204800K
ReceiveBufferMemory=128M#204800K

#[tcp]
#NodeId1=2
#NodeId2=3
#HostName1=192.168.0.12
#HostName2=192.168.0.11

Thanks for any idea.

  

How to repeat:
In our production load test.

my.cnf
[mysqld]
port = 3306
socket = /tmp/mysql.sock
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
max_allowed_packet = 160M 
init_connect='set transaction_allow_batching=1;set autocommit=0'

#log = /usr/local/mysql/log/mysqld.sql

skip-locking
skip-innodb
#skip-merge
skip-name-resolve
key_buffer_size = 256M 
query_cache_size = 4M
net_buffer_length = 40960
lower_case_table_names = 1
thread_cache_size = 256
ndb_batch_size = 0
ndb_use_transactions = 1
tmp_table_size = 256M
query_cache_type = 0
query_cache_size = 0
optimizer_search_depth =3

max_tmp_table = 512
#tmp_table_size = 512M
#max_heap_table_size=256M
#memlock
sysdate_is_now

# Add new parameter ##########
table_open_cache=1024
default-storage-engine=NDBCLUSTER
join_buffer_size=2M
read_buffer_size=2M
read_rnd_buffer_size=2M
sort_buffer_size=2M
#transaction_alloc_block_size=81920
#transaction_prealloc_size=40960
#profiling_history_size=100
#profiling=1
max_connections=2000

long_query_time = 1
log-slow-queries = /usr/local/mysql/log/slow.log
#log-queries-not-using-indexes 

log-bin=mysql-bin
binlog_format=mixed

ndb_autoincrement_prefetch_sz = 256
ndb_cluster_connection_pool = 2 

ndbcluster                      # run NDB storage engine
ndb-connectstring=10.192.140.24  # location of management server
ndb-use-exact-count=0
ndb-index-stat-enable=0
ndb-force-send=1
engine-condition-pushdown=1

[mysqldump]
max_allowed_packet = 160M

[mysqlhotcopy]
interactive-timeout
[1 Feb 2011 9:43] Martin Skold
Please send table definitions and how much data is stored. Without a
reproducable test case it is difficult to assess what the problem is.
[2 Mar 2011 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".