Bug #28812 rpl_ssl fails due to assert in extra/yassl/src/socket_wrapper.cpp:117
Submitted: 31 May 2007 17:05 Modified: 14 Sep 2007 21:00
Reporter: Andrei Elkin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Tests Severity:S3 (Non-critical)
Version:<= 5.0.44, 5.1.20 OS:MacOS
Assigned to: Magnus Blåudd CPU Architecture:Any

[31 May 2007 17:05] Andrei Elkin
Description:
The failure reported by pb typically as

rpl_ssl                        [ fail ]

Errors are (from /Users/henry/pb/mysql-5.1-new-ndb/463/mysql-5.1.19-beta-pb463/mysql-test/var-n_mix-5/log/mysqltest-time) :
mysqltest: At line 53: query 'stop slave' failed: 2013: Lost connection to MySQL server during query
(the last lines may be the most important ones)

appears to be cause by the assert.

PB registers this failure also on vm-win2003-64-b

How to repeat:
0. ssh powermacg5
1. to get a built or sources and build
2. modify rpl_ssl to make longer looping in the test:

# with 500 failing can be every other run
let $i= 500;

while ($i)
{
  start slave;
  connection master;
  ...

3a. running (maybe serveral times) the test in usual way
   ./mysql-test-run.pl rpl_ssl

   eventually fails as described above and slave.err will
   have
070531 19:07:06 - mysqld got signal 11;
   which designates the assert

3b. To prove that it's indeed the claimed file and line I only can suggest
to execute 
    mysql-test-run.pl --extern rpl_ssl

provided that master and slave launched, moreover slave in gdb.
One has to do the following steps:
   3b_1. modify mysql-test-run.pl to allow rpl tests with --extern 
         For that search for 
          if ( $opt_extern )
          {
            # Turn off features not supported when running with extern server
            #### >>>> comment-out $opt_skip_rpl= 1;
   3b_2. start master from shell and server with gdb
    (i had to use ps ax -w -w | grep mysqld > file; and edit it).

master.run:

mysqld --no-defaults --console --basedir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test --character-sets-dir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/sql/share/charsets --secure-file-priv=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var --log-bin-trust-function-creators --default-character-set=latin1 --language=/private/tmp/andrei/mysql-5.1.19-beta-pb463/sql/share/english --tmpdir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/tmp --pid-file=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/run/master.pid --port=12000 --socket=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/tmp/master.sock --datadir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/master-data --log-output=table,file --log=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/master.log --log-slow-queries=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/master-slow.log --log-bin=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/master-bin --server-id=1 --loose-innodb_data_file_path=ibdata1:10M:autoextend --local-infile --loose-skip-innodb --loose-skip-ndbcluster --plugin_dir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/storage/example/.libs --key_buffer_size=1M --sort_buffer=256K --max_heap_table_size=1M --ssl-ca=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/std_data/cacert.pem --ssl-cert=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/std_data/server-cert.pem --ssl-key=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/std_data/server-key.pem --core-file --open-files-limit=1024

slave.run

gdb -args /private/tmp/andrei/mysql-5.1.19-beta-pb463/sql/mysqld --no-defaults --console --basedir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test --character-sets-dir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/sql/share/charsets --secure-file-priv=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var --log-bin-trust-function-creators --default-character-set=latin1 --language=/private/tmp/andrei/mysql-5.1.19-beta-pb463/sql/share/english --tmpdir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/tmp --pid-file=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/run/slave.pid --port=12002 --socket=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/tmp/slave.sock --datadir=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/slave-data --log-output=table,file --log=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/slave.log --log-slow-queries=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/slave-slow.log --init-rpl-role=slave --log-bin=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/slave-bin --log-slave-updates --master-retry-count=10 --relay-log=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/var/log/slave-relay-bin --report-host=127.0.0.1 --report-port=12002 --report-user=root --loose-skip-innodb --skip-slave-start --slave-load-tmpdir=../tmp --set-variable=slave_net_timeout=10 --master-user=root --master-connect-retry=1 --master-host=127.0.0.1 --master-password= --master-port=12000 --server-id=2 --rpl-recovery-rank=2 --loose-skip-ndbcluster --key_buffer_size=1M --sort_buffer=256K --max_heap_table_size=1M --ssl-ca=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/std_data/cacert.pem --ssl-cert=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/std_data/server-cert.pem --ssl-key=/private/tmp/andrei/mysql-5.1.19-beta-pb463/mysql-test/std_data/server-key.pem --gdb

   3b_3. exec mysql-test-run.pl --extern rpl_ssl   (might take several times)

   In the end the sacred lines:

070531 18:33:31 [Warning] "SELECT UNIX_TIMESTAMP()" failed on master, do not trust column Seconds_Behind_Master of SHOW SLAVE STATUS. Error:  (1159)
socket_wrapper.cpp:117: failed assertion `socket_ != INVALID_SOCKET'

Program received signal SIGABRT, Aborted.
[31 May 2007 19:28] Sveta Smirnova
Thank you for the report.

Verified as described on Intel Mac.
[31 May 2007 19:31] Sveta Smirnova
But other assert failure on Intel Mac:

cat var/log/slave.err
...
viosocket.c:277: failed assertion `vio->sd >= 0'
070531 22:25:12 - mysqld got signal 6;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=2
max_connections=100
threads_connected=2
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 39420 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Writing a core file
[13 Jun 2007 9:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/28633

ChangeSet@1.2520, 2007-06-13 13:56:33+05:00, svoj@mysql.com +1 -0
  BUG#28812 - rpl_ssl fails due to assert in
              extra/yassl/src/socket_wrapper.cpp:117
  
  In rare case STOP SLAVE may cause server crash/assertion failures.
  
  This happens because connection to master is closed twice. With this
  patch connection to master is closed only once.
  
  This is pure replication problem, which may happen with or without
  yaSSL on any platform. yaSSL + OS X just makes it more probable.
  
  No test case for this fix, because it is covered by our test suite.
[15 Jun 2007 16:47] Sergey Vojtovich
My first patch was rejected because active_vio is intentional here. It is used to break connection on some platforms.

As discussed, I attempted to remove relevant assertions from vio and yaSSL. Now server simply crashes. The problem is that yaSSL doesn't support non-synchronized access to the same ssl object.

For details see: http://www.yassl.com/misc.html#threading

The same issue seem to be true with OpenSSL:
http://www.openssl.org/support/faq.html#PROG1

I need to get in touch with Todd Ouska to discuss if it is doable to allow non-synchronized access to the same ssl object. At least for SSL_shutdown. If it is not, either we will have to find another way to break a connection or serialize ssl vio.
[27 Jun 2007 7:59] Sergey Vojtovich
Back to verified since it is not obvious how to fix it yet.
[31 Jul 2007 7:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31874

ChangeSet@1.2487, 2007-07-31 16:25:26+05:00, svoj@mysql.com +1 -0
  BUG#28812 - rpl_ssl fails due to assert in
              extra/yassl/src/socket_wrapper.cpp:117
  
  yaSSL related part of this problem was fixed by Damien (see BUG29579).
  
  MySQL server part of this problem is:
  Issuing STOP SLAVE may cause assertion failure in debug builds on following
  platforms: HPUX, MacOS X, Windows.
  
  Fixed by disabling assertion on these platforms.
[31 Jul 2007 8:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31876

ChangeSet@1.2487, 2007-07-31 17:04:24+05:00, svoj@mysql.com +1 -0
  BUG#28812 - rpl_ssl fails due to assert in
              extra/yassl/src/socket_wrapper.cpp:117
  
  yaSSL related part of this problem was fixed by Damien (see BUG29579).
  
  MySQL server part of this problem is:
  Issuing STOP SLAVE may cause assertion failure in debug builds on following
  platforms: HPUX, MacOS X, Windows.
  
  Fixed by disabling assertion on these platforms.
[24 Aug 2007 15:02] Magnus Blåudd
The new versions of sslaccept and sslconnect

int sslaccept(struct st_VioSSLFd *ptr, Vio *vio, long timeout)
{
  DBUG_ENTER("sslaccept");
  DBUG_RETURN(sslconnect(ptr, vio, timeout));
}

int sslconnect(struct st_VioSSLFd *ptr, Vio *vio, long timeout)
{
  SSL *ssl;
  my_bool unused;
  my_bool was_blocking;

  DBUG_ENTER("sslconnect");
  DBUG_PRINT("enter", ("ptr: 0x%lx, sd: %d  ctx: 0x%lx",
                       (long) ptr, vio->sd, (long) ptr->ssl_context));

  /* Set socket to blocking if not already set */
  vio_blocking(vio, 1, &was_blocking);

  if (!(ssl= SSL_new(ptr->ssl_context)))
  {
    DBUG_PRINT("error", ("SSL_new failure"));
    report_errors(ssl);
    vio_blocking(vio, was_blocking, &unused);
    DBUG_RETURN(1);
  }
  DBUG_PRINT("info", ("ssl: 0x%lx timeout: %ld", (long) ssl, timeout));
  SSL_clear(ssl);
  SSL_SESSION_set_timeout(SSL_get_session(ssl), timeout);
  SSL_set_fd(ssl, vio->sd);

  /*
    SSL_do_handshake will select between SSL_connect
    or SSL_accept depending on server or client side
  */
  if (SSL_do_handshake(ssl) < 1)
  {
    DBUG_PRINT("error", ("SSL_do_handshake failure"));
    report_errors(ssl);
    SSL_free(ssl);
    vio_blocking(vio, was_blocking, &unused);
    DBUG_RETURN(1);
  }

  /*
    Connection suceeded. Install new function handlers,
    change type, set sd to the fd used when connecting
    and set pointer to the SSL structure
  */
  vio_reset(vio, VIO_TYPE_SSL, SSL_get_fd(ssl), 0, 0);
  vio->ssl_arg= (void*)ssl;

#ifndef DBUG_OFF
  {
    /* Print some info about the peer */
    X509 *cert;
    char buf[512];

    DBUG_PRINT("info",("SSL connection suceeded"));
    DBUG_PRINT("info",("Using cipher: '%s'" , SSL_get_cipher_name(ssl)));

    if ((cert= SSL_get_peer_certificate (ssl)))
    {
      DBUG_PRINT("info",("Peer certificate:"));
      X509_NAME_oneline(X509_get_subject_name(cert), buf, sizeof(buf));
      DBUG_PRINT("info",("\t subject: '%s'", buf));
      X509_NAME_oneline(X509_get_issuer_name(cert), buf, sizeof(buf));
      DBUG_PRINT("info",("\t issuer: '%s'", buf));
      X509_free(cert);
    }
    else
      DBUG_PRINT("info",("Peer does not have certificate."));

    if (SSL_get_shared_ciphers(ssl, buf, sizeof(buf)))
    {
      DBUG_PRINT("info",("shared_ciphers: '%s'", buf));
    }
    else
      DBUG_PRINT("info",("no shared ciphers!"));
  }
#endif

  DBUG_RETURN(0);
}
[24 Aug 2007 18:13] Timothy Smith
Looks good, Magnus.  The structure of sslconnect() is much clearer, too.
[28 Aug 2007 8:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33202

ChangeSet@1.2517, 2007-08-28 10:58:37+02:00, msvensson@pilot.(none) +2 -0
  Bug#28812 rpl_ssl fails due to assert in extra/yassl/src/socket_wrapper.cpp:117
   - Merge sslaccept and sslconnect.
   - Atomically "reset" vio to VIO_TYPE_SSL when the SSL connection has
     succeeded, this avoids having to revert anything and thus protects
     against "close_active_vio" in the middle.
   - Add some variance to the testcase
[28 Aug 2007 9:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/33204

ChangeSet@1.2517, 2007-08-28 11:34:43+02:00, msvensson@pilot.(none) +2 -0
  Bug#28812 rpl_ssl fails due to assert in extra/yassl/src/socket_wrapper.cpp:117
   - Merge sslaccept and sslconnect.
   - Atomically "reset" vio to VIO_TYPE_SSL when the SSL connection has
     succeeded, this avoids having to revert anything and thus protects
     against "close_active_vio" in the middle.
   - Add some variance to the testcase
[4 Sep 2007 17:11] Bugs System
Pushed into 5.1.23-beta
[4 Sep 2007 17:13] Bugs System
Pushed into 5.0.50
[14 Sep 2007 21:00] Paul DuBois
Noted in 5.0.50, 5.1.23 changelogs.

Killing an SSL connection on platforms where MySQL is compiled with
-DSIGNAL_WITH_VIO_CLOSE (Windows, Mac OS X, and some others) could
crash the server.