Bug #86818 Connector/Python with Prepared Statement Leads to Data Inconsistency
Submitted: 23 Jun 2017 19:17 Modified: 11 Apr 2018 7:14
Reporter: Jun Ma Email Updates:
Status: Can't repeat Impact on me:
None 
Category:Connector / Python Severity:S2 (Serious)
Version:5.7.18 OS:Ubuntu (14.04.1-Ubuntu)
Assigned to: CPU Architecture:Any

[23 Jun 2017 19:17] Jun Ma
Description:
We notice that using python connector with the prepared statement (Connector/Python or Oursql) to insert or update a varchar field with python datetime object will lead into data inconsistent between master and slave. On master there's no microseconds value, but on the slave and in binlog the value actually has the microseconds (see blowing for exact commands).

How to repeat:
1. Create table

*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `value` varchar(256) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8

2. Use Python Mysql connector:

>>> import mysql.connector
>>> from datetime import datetime
>>> cnx = mysql.connector.connect(user='', password='', host='', database='test');
>>> curs = cnx.cursor(prepared=True);
>>> s = "INSERT INTO test (value) VALUES (?)"
>>> curs.execute(s, (datetime.now(),))
>>> curs.commit()

3. Check data on master

mysql> select * from test;
+----+---------------------+
| id | value               |
+----+---------------------+
|  1 | 2017-06-23 11:49:43 |
+----+---------------------+

4. Check data on slave

mysql> select * from test;
+----+----------------------------+
| id | value                      |
+----+----------------------------+
|  1 | 2017-06-23 11:49:43.188271 |
+----+----------------------------+

5. Check binlog on master

#170623 18:49:59 server id 176281025  end_log_pos 63462711 CRC32 0xbba47a03 	GTID	last_committed=186753	sequence_number=186768
SET @@SESSION.GTID_NEXT= '5e11d844-5625-11e7-802c-02420a81d5c1:186768'/*!*/;
# at 63462711
#170623 18:49:43 server id 176281025  end_log_pos 63462790 CRC32 0x09deb9ec 	Query	thread_id=1396698	exec_time=0	error_code=0
SET TIMESTAMP=1498243783/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=45/*!*/;
BEGIN
/*!*/;
# at 63462790
# at 63462822
#170623 18:49:43 server id 176281025  end_log_pos 63462822 CRC32 0xa62e52f5 	Intvar
SET INSERT_ID=1/*!*/;
#170623 18:49:43 server id 176281025  end_log_pos 63462958 CRC32 0xae88ecca 	Query	thread_id=1396698	exec_time=0	error_code=0
use `test`/*!*/;
SET TIMESTAMP=1498243783/*!*/;
INSERT INTO test (value) VALUES ('2017-06-23 11:49:43.188271')
/*!*/;
# at 63462958
#170623 18:49:59 server id 176281025  end_log_pos 63462989 CRC32 0xde41c6b5 	Xid = 6534429
COMMIT/*!*/;

We can see that the value on master doesn't have microseconds, but the value on the slave and in binlog both have microseconds.

Suggested fix:
I
[23 Jun 2017 20:30] Jun Ma
mysql-connector-python==2.1.5
[3 Jul 2017 14:30] Nuno Mariz
Hi Jun,
I was not able to reproduce the issue. There's no microseconds value both in master and slave after the script execution.
Furthermore Connector/Python only executes this statement on the master, for the slave is inserted by the replication mechanism. So, in this case they should not differ.
[3 Jul 2017 22:36] Jun Ma
Hi Nuno,

Thanks for looking into it.
First of all, I'm sorry that I have a typo in the python script: the very last command, instead of '>>> curs.commit()', it's actually should be '>>> cnx.commit()'. But this should not affect the result if the transaction is committed eventually.

Second, this bug only affects clusters of 'Server version: 5.7.18-log MySQL Community Server (GPL)'. I did the exact same test on clusters of 'Server version: 5.7.17-log MySQL Community Server (GPL)', and there's no data inconsistency between master and slave. In fact, I looked into the binlog of 5.7.17 server, it has:

#170703 22:04:46 server id 176281027  end_log_pos 152216 CRC32 0xc46ab439       Intvar
SET INSERT_ID=2/*!*/;
#170703 22:04:46 server id 176281027  end_log_pos 152345 CRC32 0xc7e2cdbb       Query   thread_id=2595  exec_time=0     error_code=0
use `test`/*!*/;
SET TIMESTAMP=1499119486/*!*/;
INSERT INTO test (value) VALUES ('2017-07-03 15:04:46')
/*!*/;

Note that in the binlog, the statement explicitly shows the time without microseconds, but if you look at the binlog in 5.7.18(in my original post), it actually has microseconds in the statement.
One fact is that if you look at python datetime object, the return value of datetime.now() has microseconds, and the value column that I'm using is `value` varchar(256) DEFAULT NULL, so I'm actually surprised that on the mysql server side it doesn't shows the microseconds, I think the value with microseconds is actually the correct value. 

I suspect that there's a bug fix in 5.7.18 that the binlog gets the right value, but somehow the server still stores the wrong value.

Third, this bug only happens when using Prepared Statement. 

I can 100% reproduce it, let me know if you need me to provide more information or anything that I can help you reproduce this.

Thanks,
Jun
[14 Jul 2017 23:34] Jun Ma
Is there any update about this ticket?
[9 Apr 2018 14:09] Nuno Mariz
Hi Jun,
I tried again and I was not able to reproduce the issue.
It's probably something related to your replication set up, since it worked with MySQL 5.7.17.
[11 Apr 2018 7:14] Chiranjeevi Battula
Hello  Jun,

Thank you for the report and feedback.
I could not repeat the issue at our end using with Connector / Python 2.1.7, MySQL 5.7.21 versions.

Thanks,
Chiranjeevi.