Bug #92823 ndb_import is very slow when importing in a non-empty table
Submitted: 17 Oct 2018 11:34  Modified: 19 Oct 2018 1:12
Reporter: Daniël van Eeden (OCA)
Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 7.6.7
OS: Any
CPU Architecture: Any

[17 Oct 2018 11:34] Daniël van Eeden
Description:

$ wc -l t1.csv
18000 t1.csv

# Initial import
$ timeout 30s /bin/ndb_import test t1.csv --ndb-connectstring=myhost1 
job-1 import test.t1 from t1.csv
job-1 [starting] import test.t1 from t1.csv
job-1 [running] import test.t1 from t1.csv
job-1 [success] import test.t1 from t1.csv
job-1 imported 18000 rows in 0h0m0s at 28213 rows/s
jobs summary: defined: 1 run: 1 with success: 1 with failure: 0

# 2nd import
$ timeout 30s /bin/ndb_import test t1.csv --ndb-connectstring=myhost1 
job-1 import test.t1 from t1.csv
job-1 [starting] import test.t1 from t1.csv
job-1 [running] import test.t1 from t1.csv
$ 

Expected behaviour:
Both imports have a similar runtime. The second run should fail quickly, with 100% of rows rejected because of duplicate primary keys.

Actual behaviour:
The second import runs for a very long time; I canceled it after 30 minutes.

Use case:
Run ndb_import on two files for the same table, e.g. when merging data from multiple sources (such as shards).

How to repeat:
Import the same 18000-row CSV into the same table twice. With fewer than 18000 rows the second run completed quickly.
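The report does not include the table schema, so as a sketch, a CSV with 18000 unique primary keys for a hypothetical two-column table test.t1 could be generated like this:

```python
# Generate an 18000-row CSV with unique primary keys, suitable for a
# hypothetical table test.t1 (id INT PRIMARY KEY, val VARCHAR(32)).
# The real schema from the report is not known; this is an assumption.
import csv

ROWS = 18000

with open("t1.csv", "w", newline="") as f:
    w = csv.writer(f)
    for i in range(ROWS):
        w.writerow([i, f"value-{i}"])
```

Importing this file twice with `ndb_import test t1.csv --ndb-connectstring=myhost1` should make the second run reject every row with error ndb-630 (tuple already existed).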

Suggested fix:
The second run should have a similar runtime to the first.
Also, if there is a good reason for this slowdown, the output should be more informative.
[17 Oct 2018 11:37] Daniël van Eeden
The 30m figure was from a different run than the one shown in the description.
[17 Oct 2018 11:39] Daniël van Eeden
$ wc -l t1.*
  18000 t1.csv
      1 t1.map
   1001 t1.rej
      1 t1.res
  19003 total

The t1.rej file contains exactly 1000 entries plus a header line.
[17 Oct 2018 12:02] Daniël van Eeden
With a loglevel of 1:

...
util 3096 1.006: E error[data-0]: reject limit 0 exceeded (source:2724)
util 3043 1.006: E error[ndb-630]: Tuple already existed when attempting to insert (source:2956)
util 3096 1.006: E error[data-0]: reject limit 0 exceeded (source:2724)
util 3043 1.006: E error[ndb-630]: Tuple already existed when attempting to insert (source:2956)
util 3096 1.006: E error[data-0]: reject limit 0 exceeded (source:2724)
util 3043 1.006: E error[ndb-630]: Tuple already existed when attempting to insert (source:2956)
util 3096 1.006: E error[data-0]: reject limit 0 exceeded (source:2724)
util 3043 1.006: E error[ndb-630]: Tuple already existed when attempting to insert (source:2956)
util 3096 1.006: E error[data-0]: reject limit 0 exceeded (source:2724)
W exec-op-1 [running/wait tx:free=500,open=0] 3182 1.006: rowmap 0
W exec-op-1 [stop/receive tx:free=500,open=0] 2876 1.006: do_end/asynch
W exec-op-1 [stopped/receive tx:free=500,open=0] 1258 1.007: stopped
T-3 exec-op [started] error[data-0] 957 1.007: W exec-op-1 [stopped/receive tx:free=500,open=0]: wait for running
T-3 exec-op [started] error[data-0] 951 1.007: wait_worker: W exec-op-2 [stopped/prepare tx:free=500,open=0] for running
T-3 exec-op [started] error[data-0] 957 1.007: W exec-op-2 [stopped/prepare tx:free=500,open=0]: wait for running
T-3 exec-op [started] error[data-0] 951 1.007: wait_worker: W exec-op-3 [stopped/prepare tx:free=500,open=0] for running
T-3 exec-op [started] error[data-0] 957 1.007: W exec-op-3 [stopped/prepare tx:free=500,open=0]: wait for running
J-1 [starting] 652 1.007: teams started
T-3 exec-op [stop] error[data-0] 1036 1.019: do_stop
T-3 exec-op [stop] error[data-0] 1076 1.019: stop_worker: W exec-op-0 [stopped/prepare tx:free=500,open=0]
T-3 exec-op [stop] error[data-0] 1076 1.019: stop_worker: W exec-op-1 [stopped/receive tx:free=500,open=0]
T-3 exec-op [stop] error[data-0] 1076 1.019: stop_worker: W exec-op-2 [stopped/prepare tx:free=500,open=0]
T-3 exec-op [stop] error[data-0] 1076 1.019: stop_worker: W exec-op-3 [stopped/prepare tx:free=500,open=0]
T-3 exec-op [stop] error[data-0] 940 1.019: wait_workers
T-3 exec-op [stop] error[data-0] 951 1.019: wait_worker: W exec-op-0 [stopped/prepare tx:free=500,open=0] for stopped
T-3 exec-op [stop] error[data-0] 957 1.020: W exec-op-0 [stopped/prepare tx:free=500,open=0]: wait for stopped
T-3 exec-op [stop] error[data-0] 951 1.020: wait_worker: W exec-op-1 [stopped/receive tx:free=500,open=0] for stopped
T-3 exec-op [stop] error[data-0] 957 1.020: W exec-op-1 [stopped/receive tx:free=500,open=0]: wait for stopped
T-3 exec-op [stop] error[data-0] 951 1.020: wait_worker: W exec-op-2 [stopped/prepare tx:free=500,open=0] for stopped
T-3 exec-op [stop] error[data-0] 957 1.020: W exec-op-2 [stopped/prepare tx:free=500,open=0]: wait for stopped
job-1 [running] import test.t1 from test.csv
T-3 exec-op [stop] error[data-0] 951 1.020: wait_worker: W exec-op-3 [stopped/prepare tx:free=500,open=0] for stopped
T-3 exec-op [stop] error[data-0] 957 1.020: W exec-op-3 [stopped/prepare tx:free=500,open=0]: wait for stopped
T-3 exec-op [stop] error[data-0] 2575 1.021: do_end
[18 Oct 2018 5:57] Daniël van Eeden
Using --rejects=4294967295 seems to work around it for now.
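The workaround generalizes: set --rejects to at least the row count of the input file, so the job can finish even when every row is a duplicate. A minimal sketch of building such a command line (the file name, database, and connectstring are placeholders, not from the report):

```python
# Build an ndb_import command whose --rejects limit covers every row of
# the input, so a 100%-duplicate import completes instead of hanging.
import shlex

def import_command(csv_path: str, database: str, connectstring: str) -> str:
    # Count the input rows and allow that many rejects plus one,
    # mirroring the --rejects=20001 used for the 20000-row file below.
    with open(csv_path) as f:
        rows = sum(1 for _ in f)
    args = [
        "ndb_import", database, csv_path,
        f"--ndb-connectstring={connectstring}",
        f"--rejects={rows + 1}",
    ]
    return " ".join(shlex.quote(a) for a in args)
```

For a 3-row demo.csv this yields `ndb_import test demo.csv --ndb-connectstring=myhost1 --rejects=4`.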
[19 Oct 2018 1:12] Bogdan Kecman
Hi,

Verified as described; I used the same data as for bug #92826.

The .rej file grows to 1001 lines and stays there, but ndb_import hangs indefinitely.

Thanks for the report
Bogdan

[root@localhost mysql]# bin/ndb_import test /tmp/t4.csv --fields-optionally-enclosed-by='"' --fields-terminated-by="," --fields-escaped-by='\\'
job-1 import test.t4 from /tmp/t4.csv
job-1 [starting] import test.t4 from /tmp/t4.csv
job-1 [running] import test.t4 from /tmp/t4.csv
job-1 [success] import test.t4 from /tmp/t4.csv
job-1 imported 20000 rows in 0h0m1s at 16090 rows/s
jobs summary: defined: 1 run: 1 with success: 1 with failure: 0
[root@localhost mysql]# time bin/ndb_import test /tmp/t4.csv --fields-optionally-enclosed-by='"' --fields-terminated-by="," --fields-escaped-by='\\'
job-1 import test.t4 from /tmp/t4.csv
job-1 [starting] import test.t4 from /tmp/t4.csv
job-1 [running] import test.t4 from /tmp/t4.csv

[arhimed@localhost tmp]$ wc /usr/local//mysql/t4.rej
  1001  21011 115561 /usr/local//mysql/t4.rej
[arhimed@localhost tmp]$ wc /usr/local//mysql/t4.rej
  1001  21011 115561 /usr/local//mysql/t4.rej
[arhimed@localhost tmp]$ wc /usr/local//mysql/t4.rej
  1001  21011 115561 /usr/local//mysql/t4.rej
[arhimed@localhost tmp]$ wc /usr/local//mysql/t4.rej
  1001  21011 115561 /usr/local//mysql/t4.rej
[arhimed@localhost tmp]$ wc /usr/local//mysql/t4.rej
  1001  21011 115561 /usr/local//mysql/t4.rej
[arhimed@localhost tmp]$
[19 Oct 2018 1:14] Bogdan Kecman
[root@localhost mysql]# time bin/ndb_import test /tmp/t4.csv --fields-optionally-enclosed-by='"' --fields-terminated-by="," --fields-escaped-by='\\' --rejects=20001
job-1 import test.t4 from /tmp/t4.csv
job-1 [starting] import test.t4 from /tmp/t4.csv
job-1 [running] import test.t4 from /tmp/t4.csv
job-1 [running] import test.t4 from /tmp/t4.csv
job-1 rejected 20000 rows (limit 20001), see t4.rej
job-1 [success] import test.t4 from /tmp/t4.csv
job-1 imported 0 rows in 0h0m0s at 0 rows/s
job-1 rejected 20000 rows (limit 20001), see t4.rej
jobs summary: defined: 1 run: 1 with success: 1 with failure: 0

real    0m2.251s
user    0m0.373s
sys     0m0.147s
[root@localhost mysql]#