sys_test
========

Note: system_2 is intended as a replacement of system_1.
      The improvements are not direct done in system_1, because I want to
      hold system_1 in a stable state.

This suite includes a system test designed to run in an ongoing mode against
a MySQL server.

These tests include the following scenarions:
 - Simple login and logout (--> justlogin.test)
 - Insert several rows into a table
   (--> tb1_eng<n>_ins.test + tb1_eng1_ins<1-6>.test)
   - random / fixed number of rows
   - insert statement is within/not within stored procedure
   - one insert statement inserting n rows/ n insert statements inserting
     one row per statement
   - autocommit off + no COMMIT or ROLLBACK/autocommit on + COMMIT or ROLLBACK
 - Update several rows of a table
   (--> tb1_eng<n>_upd.test + tb1_eng1_upd<1-12>.test)
   - random / fixed number of rows
   - update statement is within/not within stored procedure
   - one update statement updating n rows/ n update statements updating
     one row per statement
   - (autocommit off + no COMMIT or ROLLBACK) or
     (autocommit on + COMMIT or ROLLBACK)
 - Delete several rows of a table
   (--> tb1_eng<n>_del.test + tb1_eng1_del<1-6>.test)
   - random / fixed number of rows
   - delete statement is within/not within stored procedure
   - one delete statement deleting n rows/ n delete statements deleting
     one row per statement
   - autocommit off + no COMMIT or ROLLBACK/autocommit on + COMMIT or ROLLBACK
 - Select and counts a subset of rows in a table
   (--> tb1_slct.test + tb1_eng1_sel1)
   - using a stored procedure and cursors/prepared statements  
   - cursor/10 x 100 random PRIMARY KEY values + a random PRIMARY KEY range
The above IUDS actions insert rows to a log table using triggers and a direct
insert (for the select action tb1_slct.test). The use of triggers can be
switched off (--> variable WITH_TRIGGERS).

The test is started by a one time run of an 'init' scenario that creates the
required schema and loads the initial data.

In a adition there is a 'maintenance' scenario (--> log_tests.txt) that is run often.
It
   - trims the number of rows in the log table
   - dumps the log table into files
   - collects stats about the current information in the tables
   - trims the number of rows within the table tb1_eng1 (--> tb1_eng1_mgr)

The 'init' and 'maintenance' threads run as 'root' the test scenarios are run
using a dedicated test user.

In order to run the test:
-------------------------
1) Clone the main tree and configure it to use the clone/distrib binaries
   (see seperate install instructions).
2) Clone test extra and configure it such that is linked to the main tree
   (either soft link of the 'suite' directory - *nix or physical copy - Win).
3) Set the running parameters. Edit ./run_systest<n> and set values for:
   - Location of the main clone tree
   - Location of the result/log directories
   - Duration of the test
   - Sleep time for the 'maintenance' thread
   - Number of threads (concurent users) to be used during the test
   - Storage engines for the test tables and the log table (the tests performs
     the above scenario against a set of two tables in the same database).
   - The number of row to initialy load to the tables (done by creating and
     loading a data file).
   - Number of random values (1..'n') to be used as segments in the loaded data
     and in the where clauses of the above scenarios
   - Upper and lower limits to the size of the log table for the maintenance
     thread.
   - Connection method
     FIXME: currently set for a local server using 'socket' needs to be
            modified to include a TCP connection for a remote server (this will
            require a change to the server startup method and the definition of
            the test user in the scripts)
4) From the (main clone / distribution) 'mysql-test' directory, 
   run: ./suite/system_1/run_systest1 

While test is running...
---------------------
The 'mysql-stress-test' framework does monitor for system failures (lost
connections, errors, etc.) In addition while test is running, the following
should be monitored manually:
  - The output files generated by the test ('r' directory).
  - Memory and CPU usage on the machine running the server to detect system slow
    down, changes in memory usage and other).

Common problems and some solutions(if available)
------------------------------------------------
1) The CPU load on the box during stress testing is so high that even simple
   parallel work like editing is unacceptable slow.
   Try: nice --20 ./run_systest<whatever>
2) Hang up of OS because the paging space is exhausted.
   (My box: used swap space increases with ~ 5 MB/s)
   Observations showed huge virtual address space usage of:
   1. mysqld
   2. perl mysql-stress-test.pl
   3. rare cases lt-mysqltest

   You may try my "Watchdog" Shellscript SWAP.sh, which will kill all running
   perl, mysqld and lt-mysqltest processes when the amount of free swap space
   falls under a configurable limit.
   Doing the kill of these processes by hand is in most situations no
   alternative, because the extreme paging activity will make your box so slow
   that you are unable to select a window and to type in the kill commands
   before the OS dies.
   Please be careful, because SWAP.sh kills not only processes started
   by the stress test. That means even the processes of parallel running
   tests will be killed.
3) Please be careful.
   - There must be a file with expected results for every toplevel testscript
     even if the content of the result file is not known or not deterministic.
     Even a fake result file produced by
            echo "Dummy" > r/<testname>.result
     helps a lot.
   - Tests producing non deterministic protocols are dangerous.
     There is the risk to suffer from filesystem full.
     Some hints:
     - Use --disable_query_log, --replace_column, --replace_result, --regexp
     - --error <nnnn>,<nnnn> avoids random output like PRIMARY KEY values ...
     - Use the variable $debug within scripts to generate full (non
       deterministic) output when script debugging is needed.


Some details about the testscripts
----------------------------------
     
   tb1_eng<n>_<ins|del|upd>.test
            All these tests written by Omer
            - do not use transactions + autocommit is switched on
            - pain the server with stored procedures
            - are suitable for transactional and non transactional
              storage engines

   tb1_eng1_<ins|del|upd><1-10>.test
            These tests written by Matthias
            - use transactions consisting of some auxiliary SELECT statements
              and either one INSERT/UPDATE/DELETE statement affecting n records
              or n INSERT/UPDATE/DELETE statements affcecting one record per
              statement
            - have a high probability to suffer from deadlocks, timeouts when
              waiting for locks or failing because of duplicate key
            - try to check the effects of the INSERT/UPDATE/DELETE actions
              Attention: My expectations in the result check might be wrong.
            - are suitable for transactional storage engines only
              At least the expectations within the result check do not fit
              to non transactional storage engines.

   tb1_eng1_engine<1|2>.test
            These tests written by Matthias alter the storage engine of the
            table tb1_eng1.

   tb1_eng1_sel1    (bundled into iuds_tests2.txt)
            Selects with random single PRIMARY KEYs and PRIMARY KEY ranges
            using prepared statements.
              10 times
              - 100 SELECT COUNT(*) IN (0,1) FROM systest1.tb1_eng1
                WHERE i1 = <random value>
                random value = RAND() * MAX(i1)
              - SELECT COUNT(*) >= 0 FROM systest1.tb1_eng1
                WHERE i1 BETWEEN ROUND(RAND() * <value1> AND ROUND(RAND() * <value2>
                <value1> = MAX(i1) , <value2> = MIN(i1)


Preferred configuration for transactional tests
-----------------------------------------------
ENG1=<transactional engine>
ENG1_ALT=<other transactional engine>
ENG2=<transactional engine>
ENG_LOG=<transactional engine>
PROGRAM_MONITOR='monitor.txt'
PROGRAM_CLIENT_1='iuds_tests2.txt'

Thinkable improvements of system_2
----------------------------------
- Unite IUDS_UNIT with NUM_VAL
- separate the setting of
  - the paths to tests scripts, binaries etc.
  - test scenario controlling variables (NUM_VAL, CLIENT_PROGRAM_1 etc.)

Latest observations
-------------------
  MySQL 5.1 + InnoDB used as storage engine: Many crashes at various statements
              even if triggers are switched off and IUDS scripts using stored
              procedures are omitted

Some useful commands (need GNU/UNIX tools)
------------------------------------------
  cat  /tmp/systest/res_*/error*.txt | less
  find /tmp/systest/ -type f -name "*.reject


ML 2006-06-02
