Bug #99388 Mysql Cluster 8 - “Jumpy” Auto Increment Values
Submitted: 28 Apr 2020 12:15 Modified: 28 Apr 2020 19:43
Reporter: Daniel Hope Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S1 (Critical)
Version:ndb-8.0.19 OS:Any
Assigned to: MySQL Verification Team CPU Architecture:Any

[28 Apr 2020 12:15] Daniel Hope
Description:
Im using mysql cluster 8 - using PHP to insert data I keep running into a problem where the Ids jump and then "recover" I.E

SELECT `MSU_ID`, `MSU_Date_Created` FROM `Material_Spreadsheet_Upload` ORDER BY `MSU_ID` DESC

603     2020-04-28 10:55:13
602     2020-04-28 10:55:07
601     2020-04-28 10:55:05
600     2020-04-28 10:55:03
599     2020-04-28 10:55:00
598     2020-04-28 10:54:57

SELECT `MSU_ID`, `MSU_Date_Created` FROM `Material_Spreadsheet_Upload` ORDER BY `MSU_Date_Created` DESC 

100 2020-04-28 12:05:28
99  2020-04-28 11:54:49
98  2020-04-28 11:54:47
97  2020-04-28 11:54:45
96  2020-04-28 11:54:39

This is obviously a problem, Im not doing anything in-between inserting the data (I.E messing with auto_increment values) - Im also certian im not inserting these ids anywhere

Sinse the upgrade to mysql 8 my tables all have comments NDB_TABLE=READ_BACKUP=1 they enabled this by default according to the docs in Mysql 8 but im wondering if this is whats causing the issue

After the upgrade I noticed but then I had to write and run a bunch of SQL queries to fix the issue initially - but it keeps happening and im not sure how to prevent it

How to repeat:
No idea
[28 Apr 2020 12:58] Daniel Hope
So it would appear AUTO_INCREMENT values are out of sync between MySQL hosts ? 

Host 1:

SHOW TABLE STATUS FROM `DB` WHERE `name` LIKE 'Material_Spreadsheet_Upload';

mysql> SELECT `AUTO_INCREMENT`
    -> FROM  INFORMATION_SCHEMA.TABLES
    -> WHERE TABLE_SCHEMA = 'Trilex_Installation'
    -> AND   TABLE_NAME   = 'Material_Spreadsheet_Upload';
+----------------+
| AUTO_INCREMENT |
+----------------+
|            110 |
+----------------+

Host 2:

mysql> SELECT `AUTO_INCREMENT`
    -> FROM  INFORMATION_SCHEMA.TABLES
    -> WHERE TABLE_SCHEMA = 'Trilex_Installation'
    -> AND   TABLE_NAME   = 'Material_Spreadsheet_Upload';
+----------------+
| AUTO_INCREMENT |
+----------------+
|            640 |
+----------------+

Then tried Node 1: 

ANALYZE TABLE `Material_Spreadsheet_Upload`;

But that didn't fix it ?
[28 Apr 2020 18:32] MySQL Verification Team
Hi,

That is not a bug but a completely expected behavior. Each API node (mysql server is just one api node) will pre-allocate a number of auto-increment values to use in order to speed up inserts. If it would "ask" for a new number each time you insert that would create a too big of a latency. You can control how much they pre-allocate.
https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-options-variables.html#sysvar_ndb_au...

Thanks
Bogdan
[28 Apr 2020 19:30] Daniel Hope
There is a fair amount of documentation on the internet that suggests ordering by auto incrementing primary key for joining the latest record is reliable

I.E 

https://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group-mysql

But this change breaks that behaviour because if the saves are out of order I would need to start ordering by date to ensure that im joining the last record. 

Maybe I dont understand the upgrade notes, but at time of writing, its not in the upgrade notes

https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html

I had to roll my cluster back because I wasn't easily able to identify this setting (im no DBA though) and it has caused some data loss
[28 Apr 2020 19:43] Daniel Hope
Sorry wrong link, but here are some 

https://stackoverflow.com/questions/3619030/mysql-join-the-most-recent-row-only

https://stackoverflow.com/questions/725153/most-recent-record-in-a-left-join
[29 Apr 2020 15:06] MySQL Verification Team
Hi Daniel,

Sorry to say this but StackOverflow is not always a reliable source of info.

By the SQL standard, and by MySQL documentation, there is no promise that auto_increment values will be in order and without holes nor that they will not be reused.

Ordering by the auto_increment field is bad practice. If you need to order by insert time you need to create a date-time field to store insert time. The fact auto_inc fields were behaving a certain way is just a temporary gimmick and not something to take for granted. Just like the order of the unordered query etc etc.

You do not have to revert to an older version, you can set prefetch size to 1 as described in the documentation I linked. Inserts (especially bulk ones) will be a lot slower but will work the way you expect. 

Note that default values for different parameters do change trough time and you should never run default values for stuff you care about.

all best
Bogdan