Description:
When swapping a staging table into production, the rename is not atomic across all nodes. The sql node that the query originates from is atomic, however when the rename is broadcasted to all other sql nodes, they are not atomic and cause subsequent lookups to error with "table doesn't exist" during the duration of the table swap.
How to repeat:
This works and is the expected effect, only works when everything is happening on the same node:
# ndb sql node sql01
#1) create tables on the ndb cluster
sql01-mysql> create table foo (bar varchar(100)) engine=ndb;
Query OK, 0 rows affected (0.79 sec)
sql01-mysql> create table foo_stg (bar varchar(100)) engine=ndb;
Query OK, 0 rows affected (0.74 sec)
#2) run on sql01
sql01:~$ while [ true ]; do mysql -h sql01 -e 'select * from foo'; done
#3) in a separate session with the loop continuing to run: swap tables on sql01
sql01-mysql> rename table foo to foo_trash, foo_stg to foo;
# The output from step 2 does not have any errors. Everything is working fine.
As soon as we move the select statements to occur on a second sql node, we run into the problem:
2 ndb sql nodes (sql01 and sql02) running in multiple sessions
#1) create tables on the ndb cluster
sql01-mysql> create table foo (bar varchar(100)) engine=ndb;
Query OK, 0 rows affected (0.79 sec)
sql01-mysql> create table foo_stg (bar varchar(100)) engine=ndb;
Query OK, 0 rows affected (0.74 sec)
#2) run on sql02
sql02:~$ while [ true ]; do mysql -h sql02 -e 'select * from foo'; done
#3) in a separate session with the loop continuing to run: swap tables on sql01
sql01-mysql> rename table foo to foo_trash, foo_stg to foo;
# The output from step 2 shows error messages stating the table no longer exists, for the duration of the table swap. This is the problem.
ERROR 1146 (42S02) at line 1: Table 'foo' doesn't exist
ERROR 1146 (42S02) at line 1: Table 'foo' doesn't exist
ERROR 1146 (42S02) at line 1: Table 'foo' doesn't exist
ERROR 1146 (42S02) at line 1: Table 'foo' doesn't exist
ERROR 1146 (42S02) at line 1: Table 'foo' doesn't exist