Phoronix pts fio io_uring test regression report on upstream v6.1 and v5.15

From: Saeed Mirzamohammadi
Date: Thu Jan 19 2023 - 16:57:57 EST


Hello,

I'm reporting a performance regression introduced by the commit below, observed with the phoronix pts/fio test and the fio config included at the end of this email:

Link: https://lore.kernel.org/all/20210913131123.597544850@xxxxxxxxxxxxxxxxxxx/

commit 7b3188e7ed54102a5dcc73d07727f41fb528f7c8
Author: Jens Axboe <axboe@xxxxxxxxx>
Date: Mon Aug 30 19:37:41 2021 -0600

io_uring: IORING_OP_WRITE needs hash_reg_file set

We observed the regression on the latest v6.1.y and v5.15.y stable kernels (we haven't tested other stable branches). Performance improved by ~45% after reverting the commit above.

All of the benchmarks below show a regression of around 45%.
phoronix-pts-fio-1.15.0-RandomWrite-EngineIO_uring-BufferedNo-DirectYes-BlockSize4KB-MB-s_xfs
phoronix-pts-fio-1.15.0-SequentialWrite-EngineIO_uring-BufferedNo-DirectYes-BlockSize4KB-MB-s_xfs
phoronix-pts-fio-1.15.0-SequentialWrite-EngineIO_uring-BufferedYes-DirectNo-BlockSize4KB-MB-s_xfs

We see this regression primarily on the 4KB block-size tests.

We tried changing force_async, but that had no effect on the result. We also backported a modified version of the patch mentioned here (https://lkml.org/lkml/2022/7/20/854), but that didn't affect performance either.

Do you have any suggestions for a fix, or for what else we can try to narrow down the issue?

Thanks a bunch,
Saeed
--------

Here is more info on the benchmark and system:

Here is the config for fio:
[global]
rw=randwrite
ioengine=io_uring
iodepth=64
size=1g
direct=1
buffered=0
startdelay=5
force_async=4
ramp_time=5
runtime=20
time_based
disk_util=0
clat_percentiles=0
disable_lat=1
disable_clat=1
disable_slat=1
filename=/data/fiofile
[test]
name=test
bs=4k
stonewall
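For anyone wanting to reproduce, the config above can be dropped into a job file and passed to fio directly. A minimal sketch (the temp-dir path is my own choice; note the job's filename=/data/fiofile assumes the /data xfs mount from the df output below, so adjust filename= for a different setup):

```shell
#!/bin/sh
# Sketch: write the fio job file from this report to a temp dir and print
# the command to run it. We deliberately do not invoke fio here, since the
# job targets /data/fiofile, which only exists on the reporting system.
set -e
JOBDIR=$(mktemp -d)
cat > "$JOBDIR/regress.fio" <<'EOF'
[global]
rw=randwrite
ioengine=io_uring
iodepth=64
size=1g
direct=1
buffered=0
startdelay=5
force_async=4
ramp_time=5
runtime=20
time_based
disk_util=0
clat_percentiles=0
disable_lat=1
disable_clat=1
disable_slat=1
filename=/data/fiofile
[test]
name=test
bs=4k
stonewall
EOF
# Run this on kernels with and without the suspect commit and compare MB/s:
echo "To reproduce: fio $JOBDIR/regress.fio"
```

The comparison is then just running the same job on a kernel with and without commit 7b3188e7ed54 and diffing the reported write bandwidth.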

df -Th output (file is on /data/):
Filesystem                 Type      Size  Used Avail Use% Mounted on
devtmpfs                   devtmpfs  252G     0  252G   0% /dev
tmpfs                      tmpfs     252G     0  252G   0% /dev/shm
tmpfs                      tmpfs     252G   18M  252G   1% /run
tmpfs                      tmpfs     252G     0  252G   0% /sys/fs/cgroup
/dev/mapper/ocivolume-root xfs        89G   17G   73G  19% /
/dev/mapper/ocivolume-oled xfs        10G  143M  9.9G   2% /var/oled
/dev/sda2                  xfs      1014M  643M  372M  64% /boot
/dev/sda1                  vfat      100M  5.0M   95M   6% /boot/efi
tmpfs                      tmpfs      51G     0   51G   0% /run/user/0
tmpfs                      tmpfs      51G     0   51G   0% /run/user/987
/dev/mapper/tank-lvm       xfs       100G  1.8G   99G   2% /data