Bug #116501 Compression not work with file system which sector size is not 512
Submitted: 29 Oct 2024 13:27 Modified: 7 Nov 2024 15:13
Reporter: zkong kong Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0 OS:Linux
Assigned to: CPU Architecture:Any

[29 Oct 2024 13:27] zkong kong
Description:
When create compression table on a file system which sector size is not 512. 
Got error message in log file:

2024-10-29T19:16:13.763289+08:00 0 [ERROR] [MY-012639] [InnoDB] Write to file ./sysbench/sbtest1.ibd failed at offset 16384, 16384 bytes should have been written, only 0 were written. Operating system error number 22. Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
2024-10-29T19:16:13.763333+08:00 0 [ERROR] [MY-012640] [InnoDB] Error number 22 means 'Invalid argument'
2024-10-29T19:16:13.763345+08:00 0 [Note] [MY-012641] [InnoDB] Refer to your operating system documentation for operating system error code information.
2024-10-29T19:16:13.763362+08:00 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: os0file.cc:7109:err == DB_SUCCESS || err == DB_IO_NO_PUNCH_HOLE thread 139783871395584

After some investigation, found that pwrite call faild with errno 22.

$3 = 16
(gdb) p buf
$4 = (void *) 0x7fd616085a00
(gdb) p n 
$5 = 12288
(gdb) p offset
$6 = 753664
(gdb) p buf
$7 = (void *) 0x7fd616085a00 // not aligned with 4k, just 512

mysql> show variables like '%innodb_flush_method%';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| innodb_flush_method | O_DIRECT |
+---------------------+----------+
1 row in set (0.01 sec)

[root ~]# fdisk -l /dev/md0p1

Disk /dev/md0p1: 53.7 GB, 53687091200 bytes, 13107200 sectors
Units = sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 2097152 bytes

The sector size of the file system is 4K, so the direct io failed with the buf size is not aligned with 4K.

the code:
compressed_page =
    static_cast<byte *>(ut_align(block->m_ptr, UNIV_SECTOR_SIZE)); // 512

How to repeat:
Sysbench write compression table on a file system with sector size 4K, the file system must support sparse file and punching hole.

/data1/ragnar/tools/sysbench/bin/sysbench /data1/ragnar/tools/sysbench/share/sysbench/oltp_insert.lua --mysql-host=9.30.5.83 --mysql-port=4306 --mysql-db=sysbench --tables=1 --table-size=10000 --mysql-user='test1234' --mysql-password='test1234' --threads=128 --report-interval=1 --time=360000 --create-table-options="compression='zlib'" prepare

/data1/ragnar/tools/sysbench/bin/sysbench /data1/ragnar/tools/sysbench/share/sysbench/oltp_insert.lua --mysql-host=9.30.5.83 --mysql-port=4306 --mysql-db=sysbench --tables=1 --table-size=10000 --mysql-user='test1234' --mysql-password='test1234' --threads=128 --report-interval=1 --time=360000 --create-table-options="compression='zlib'" run

Suggested fix:
Adjust the sector size to actual size of the devices.
[29 Oct 2024 14:00] MySQL Verification Team
HI Mr. kong,

Thank you for your bug report.

However, we can not repeat your problem, since we do not have a test case.

Hence, we need the entire configuration of MySQL 9.1.0. Next we need exactly how did you create the filesytem  , which type and version,  with exact parameters, OS and OS version.

Last, we need the full CREATE TABLE statement. 

Beside that, we need to know whether error occurs on CREATE table or on INSERT or any other command.

We can not start procesing this report, until we have a fully repeatable test case.

Can't repeat.
[30 Oct 2024 12:26] zkong kong
Hi:
  The config file

[mysqld]

core-file
innodb_buffer_pool_in_core_file=off
bind_address    = 9.30.5.83
port      = 4306
report-host=9.30.5.83
admin_address=9.30.5.83
admin_port = 14306

local_infile=0
skip-symbolic-links

socket      = /data1/4306/prod/mysql.sock
pid-file    = /data1/4306/prod/mysql.pid

server-id=   1094071519

plugin-dir=/data4/mysql-server-8.0.24/lib/plugin

user=tdsql

datadir     = /data1/4306/dbdata_raw/data

log-bin  = /data3/log/4306/dblogs/bin/binlog.log
relay-log = /data3/log/4306/dblogs/relay/relay.log
log-error = /data3/log/4306/dblogs/mysqld.err
log_error_verbosity=3

binlog_format=row
binlog_expire_logs_seconds=0

default-storage-engine = innodb
lc-messages-dir     = /data4/mysql-server-8.0.24/share
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci

innodb_log_group_home_dir = /data3/log/4306/logs/innodb
tmpdir = /data1/4306/dbdata_raw/tmpdir
innodb_data_home_dir = /data1/4306/dbdata_raw/dbdata

slow_query_log=ON
table_open_cache = 20480
table_open_cache_instances=32
tablespace_definition_cache=10240
table_definition_cache=10240
innodb_open_files   =   10240
open_files_limit    =   100000
max_prepared_stmt_count=200000

performance_schema=on
performance_schema_max_digest_length = 16384
performance_schema_max_sql_text_length = 16384

enforce_gtid_consistency = ON
gtid_mode = ON
skip-slave-start
slave_skip_errors=1397
slave_preserve_commit_order=ON

net_read_timeout=150

transaction-isolation=READ-COMMITTED

log_timestamps=SYSTEM
binlog_checksum=NONE
default_tmp_storage_engine=innodb
innodb_data_file_path=ibdata1:2G:autoextend
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_page_cleaners=16
innodb_flush_log_at_trx_commit=1
innodb_buffer_pool_size = 10G
innodb_buffer_pool_instances = 2
innodb_log_buffer_size = 268435456
join_buffer_size = 2097152
read_rnd_buffer_size = 2097152
sort_buffer_size = 2097152
read_buffer_size = 2097152
key_buffer_size = 134217728
max_connections = 10000
innodb_io_capacity  =   10000
innodb_io_capacity_max  =   20000
innodb_log_files_in_group   =   4
innodb_log_file_size    =   1073741824
innodb_max_dirty_pages_pct  =   70
innodb_max_purge_lag    =   500000
innodb_purge_batch_size =   1000
innodb_thread_concurrency   =   0
innodb_doublewrite=1
innodb_buffer_pool_dump_at_shutdown=OFF
innodb_buffer_pool_load_at_startup=OFF
innodb_temp_data_file_path=ibtmp1:50M:autoextend:max:30G
skip_name_resolve=on

log_bin_trust_function_creators=ON
log_statements_unsafe_for_binlog = off

lock_wait_timeout   =   5
long_query_time =   1
lower_case_table_names  =   1
max_allowed_packet  =   1073741824
max_binlog_size =   104857600
max_connect_errors  =   2000
max_heap_table_size =   33554432
max_relay_log_size  =   104857600
myisam_sort_buffer_size =   4194304
query_alloc_block_size  =   16384
#query_cache_size    =   0
query_prealloc_size =   24576
slave_net_timeout   =   3600

log_slave_updates=on
slave_parallel_type=logical_clock
slave_parallel_workers=64

innodb_sync_array_size=64
sync_master_info    =   0
sync_relay_log  =   0
sync_relay_log_info =   0
thread_cache_size   =   8

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

range_optimizer_max_mem_size=83886080
slave_pending_jobs_size_max=1677721600

#mysqlx: disable it by default
mysqlx=0

innodb_use_native_aio=0
innodb_flush_method=O_DIRECT

character_set_server=utf8
collation_server=utf8_general_ci
lower_case_table_names=1
[30 Oct 2024 12:29] zkong kong
The table is create by sysbench prepare add compression option

CREATE TABLE `sbtest1` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=982156 DEFAULT CHARSET=utf8mb3 COMPRESSION='zlib';

The cammond to create the file system:

mkfs.xfs -f /dev/md0p1
[30 Oct 2024 12:30] zkong kong
Hi:
  The error occurs when run the sysbench insert commands.

/data1/ragnar/tools/sysbench/bin/sysbench /data1/ragnar/tools/sysbench/share/sysbench/oltp_insert.lua --mysql-host=9.30.5.83 --mysql-port=4306 --mysql-db=sysbench --tables=1 --table-size=10000 --mysql-user='test1234' --mysql-password='test1234' --threads=128 --report-interval=1 --time=360000 --create-table-options="compression='zlib'" run
[30 Oct 2024 12:32] zkong kong
Please don't simply change it to Can't repeat, It can be reproduced very easilly.
Please confirm it, trx very mouch!
[30 Oct 2024 13:44] MySQL Verification Team
Hi Mr. kong,

There are some data that you have not provided us with, like:

* We need exactly how did you create the filesytem  , 

* Which Linux versions and release did you use 

* Which type and version of filesystem,  with exact parameters

* We need to know whether error occurs on CREATE table or on INSERT or any other command. 

If we understood you correctly, the error occurs during INSERT ....... let us know whether we are correct in our presumption.
[31 Oct 2024 2:19] zkong kong
1.
* We need exactly how did you create the filesytem  , 

fdisk /dev/md0

Separate 50G data partition see the my.cnf and logs  store in separated partition

[root@TENCENT64 ~]# lsblk
NAME      MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
nvme0n1   259:2    0  1.7T  0 disk  
└─md0       9:0    0  6.6T  0 raid0 
  ├─md0p4 259:11   0  6.4T  0 md    /data4
  ├─md0p2 259:9    0   50G  0 md    /data2
  ├─md0p3 259:10   0  100G  0 md    /data3
  └─md0p1 259:8    0   50G  0 md    /data1
nvme3n1   259:3    0  1.7T  0 disk  
└─md0       9:0    0  6.6T  0 raid0 
  ├─md0p4 259:11   0  6.4T  0 md    /data4
  ├─md0p2 259:9    0   50G  0 md    /data2
  ├─md0p3 259:10   0  100G  0 md    /data3
  └─md0p1 259:8    0   50G  0 md    /data1
nvme2n1   259:1    0  1.7T  0 disk  
└─md0       9:0    0  6.6T  0 raid0 
  ├─md0p4 259:11   0  6.4T  0 md    /data4
  ├─md0p2 259:9    0   50G  0 md    /data2
  ├─md0p3 259:10   0  100G  0 md    /data3
  └─md0p1 259:8    0   50G  0 md    /data1
nvme1n1   259:0    0  1.7T  0 disk  
└─md0       9:0    0  6.6T  0 raid0 
  ├─md0p4 259:11   0  6.4T  0 md    /data4
  ├─md0p2 259:9    0   50G  0 md    /data2
  ├─md0p3 259:10   0  100G  0 md    /data3
  └─md0p1 259:8    0   50G  0 md    /data1
sda         8:0    0  4.9T  0 disk  
├─sda4      8:4    0  4.9T  0 part  /data
├─sda2      8:2    0  512M  0 part  /boot/efi
├─sda3      8:3    0   20G  0 part  /usr/local
└─sda1      8:1    0   20G  0 part  /

Use mkfs.xfs -f /dev/md0p1 to create xfs filesystem

[root@TENCENT64 ~]# fdisk -l /dev/md0p1

Disk /dev/md0p1: 53.7 GB, 53687091200 bytes, 13107200 sectors
Units = sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 2097152 bytes

/data1 /data2/ /data3 is the same create commands.

[root@TENCENT64 ~]# df -T /data1
Filesystem     Type 1K-blocks     Used Available Use% Mounted on
/dev/md0p1     xfs   52395012 50321828   2073184  97% /data1

* Which Linux versions and release did you use 

Os is our company version and kernel is version 3

* Which type and version of filesystem,  with exact parameters

Filesystem is XFS, no special parameters: 

mkfs.xfs -f /dev/md0p1

* We need to know whether error occurs on CREATE table or on INSERT or any other command. 

The error is occurs when run sysbench benchmark. It's happened when the page cleaner flush the compress page to disk. The compress page buffer is not aligned by 4K but 512 which it's not support when use direct io.

If we understood you correctly, the error occurs during INSERT ....... let us know whether we are correct in our presumption.
[31 Oct 2024 11:09] MySQL Verification Team
Hi Mr. Kong,

We are waiting on the Linux box to be prepared for this bug report and for the one that follows this   one.
[7 Nov 2024 15:02] MySQL Verification Team
Hi Mr. kong,

We have managed to reproduce your bug.

This is the info from our filesystem:

Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 1048576 bytes
[7 Nov 2024 15:13] MySQL Verification Team
Hi Mr. kong,

We have also got errno 22 with sysbench.

This is now a verified bug report for versions 8.0, 8.4, 9.0 and 9.1.

Thank you ......