Description:
When create partition table with charset utf8, the data distribute is not good
How to repeat:
First, use an python script to generate data:
$ cat gen_data.py
file = open("/tmp/a.txt","w")
for i in range(1,10000):
file.write(str(i))
file.write('|')
file.write(str(i))
file.write('\n')
file.close()
phoenix@phoenix-Latitude-5491:~/test$ python gen_data.py
Then, connect to mysql, create 2 tables, and load data:
mysql> create table t1 (c1 varchar(64) primary key, c2 int) engine=innodb charset utf8 partition by key(c1) partitions 4;
Query OK, 0 rows affected, 1 warning (0.17 sec)
mysql> create table t2 (c1 varchar(64) primary key, c2 int) engine=innodb charset utf8mb4 partition by key(c1) partitions 4;
Query OK, 0 rows affected (0.22 sec)
mysql> load data infile '/tmp/a.txt' into table t1 fields terminated by '|';
Query OK, 9999 rows affected (3.50 sec)
Records: 9999 Deleted: 0 Skipped: 0 Warnings: 0
mysql> load data infile '/tmp/a.txt' into table t2 fields terminated by '|';
Query OK, 9999 rows affected (3.85 sec)
Records: 9999 Deleted: 0 Skipped: 0 Warnings: 0
Then, check data distribute, it quite bad for utf8:
mysql> select count(*) from t1 partition(p0);
+----------+
| count(*) |
+----------+
| 3376 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from t1 partition(p1);
+----------+
| count(*) |
+----------+
| 312 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from t1 partition(p2);
+----------+
| count(*) |
+----------+
| 5999 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from t1 partition(p3);
+----------+
| count(*) |
+----------+
| 312 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from t2 partition(p0);
+----------+
| count(*) |
+----------+
| 2500 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from t2 partition(p1);
+----------+
| count(*) |
+----------+
| 2500 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from t2 partition(p2);
+----------+
| count(*) |
+----------+
| 2500 |
+----------+
1 row in set (0.02 sec)
mysql> select count(*) from t2 partition(p3);
+----------+
| count(*) |
+----------+
| 2499 |
+----------+
1 row in set (0.00 sec)