Re: [PATCH v2 0/8] fsdax,xfs: fix warning messages

From: Shiyang Ruan
Date: Thu Dec 29 2022 - 03:23:33 EST




On 2022/12/3 9:21, Dan Williams wrote:
Shiyang Ruan wrote:
Changes since v1:
1. Added a snippet of the warning message and some of the failed cases
2. Split the patch up for easier review
3. Added page->share and its helper functions
4. Included the patch[1] that removes the restrictions of fsdax and reflink
[1] https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.fnst@xxxxxxxxxxx/

...

This also affects dax+noreflink mode if we run the test after a
dax+reflink test. So, the most urgent thing is solving the warning
messages.

With these fixes, most warning messages in dax_associate_entry() are
gone. But honestly, generic/388 still fails randomly with the warning.
The case shuts down the xfs filesystem while fsstress is running, and
does so many times. I think the reason is that dax pages in use cannot
be invalidated in time when the fs is shut down. The next time such a
dax page is associated, it still holds the mapping value set last time.
I'll keep working on it.
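
To illustrate the suspected sequence, here is a user-space toy model. It
is only a sketch under stated assumptions, not the kernel code: the
names toy_page, toy_mapping, toy_associate(), toy_disassociate() and
toy_reuse_cycle() are hypothetical, and the warning condition is a
simplification of whatever dax_associate_entry() actually checks.

```c
/* User-space toy model of the suspected stale-mapping sequence.
 * All names are hypothetical; this is NOT the kernel implementation. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct toy_mapping {
	const char *name;
};

struct toy_page {
	struct toy_mapping *mapping;	/* owner recorded at association time */
};

static int toy_warnings;		/* number of would-be warnings */

/* Record the new owner; count a warning if a stale owner is still set. */
static void toy_associate(struct toy_page *page, struct toy_mapping *mapping)
{
	assert(page != NULL && mapping != NULL);
	if (page->mapping && page->mapping != mapping) {
		toy_warnings++;
		fprintf(stderr, "warning: page still associated with %s\n",
			page->mapping->name);
	}
	page->mapping = mapping;
}

/* Clear the owner, as a timely invalidation would. */
static void toy_disassociate(struct toy_page *page)
{
	assert(page != NULL);
	page->mapping = NULL;
}

/* Use one page for two owners in turn; return the warnings raised.
 * invalidated_in_time == 0 models the shutdown case where the page
 * was not invalidated before being associated again. */
static int toy_reuse_cycle(int invalidated_in_time)
{
	struct toy_mapping first = { "first owner" };
	struct toy_mapping second = { "second owner" };
	struct toy_page page = { NULL };

	toy_warnings = 0;
	toy_associate(&page, &first);
	if (invalidated_in_time)
		toy_disassociate(&page);
	toy_associate(&page, &second);
	return toy_warnings;
}
```

In this model, toy_reuse_cycle(1) raises no warning, while
toy_reuse_cycle(0) raises one, because the second association finds the
value left behind by the first.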

This one also sounds like it is going to be relevant for CXL PMEM, and
the improvements to the reference counting. CXL has a facility where the
driver asserts that no more writes are in-flight to the device so that
the device can assert a clean shutdown. Part of that will be making sure
that page access ends at fs shutdown.

I was trying to locate the root cause of the failure on generic/388. But since it's an fsstress test, I can't replay the operation sequence to help me pinpoint the offending operations. So I tried replacing fsstress with fsx, which can replay after the case fails, but that doesn't reproduce the failure. I think another important factor is that fsstress runs with multiple threads. So, for now, it's hard for me to locate the cause by running the test.

Then I updated the kernel to the latest v6.2-rc1 and ran generic/388 many times. The warning no longer shows up in dmesg.

How did your testing of this case go? Does it still fail on the latest kernel? If so, I think I have to keep looking for the cause, and I will need your advice.


--
Thanks,
Ruan.


The warning message in dax_writeback_one() can also be fixed because of
the dax unshare.


Shiyang Ruan (8):
fsdax: introduce page->share for fsdax in reflink mode
fsdax: invalidate pages when CoW
fsdax: zero the edges if source is HOLE or UNWRITTEN
fsdax,xfs: set the shared flag when file extent is shared
fsdax: dedupe: iter two files at the same time
xfs: use dax ops for zero and truncate in fsdax mode
fsdax,xfs: port unshare to fsdax
xfs: remove restrictions for fsdax and reflink

 fs/dax.c                   | 220 +++++++++++++++++++++++++------------
 fs/xfs/xfs_ioctl.c         |   4 -
 fs/xfs/xfs_iomap.c         |   6 +-
 fs/xfs/xfs_iops.c          |   4 -
 fs/xfs/xfs_reflink.c       |   8 +-
 include/linux/dax.h        |   2 +
 include/linux/mm_types.h   |   5 +-
 include/linux/page-flags.h |   2 +-
 8 files changed, 166 insertions(+), 85 deletions(-)

--
2.38.1