Re: [PATCH v11 7/8] xfs: Implement ->notify_failure() for XFS

From: Dan Williams
Date: Fri Apr 08 2022 - 02:25:41 EST


On Tue, Mar 29, 2022 at 11:01 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> > @@ -1892,6 +1893,8 @@ xfs_free_buftarg(
> > list_lru_destroy(&btp->bt_lru);
> >
> > blkdev_issue_flush(btp->bt_bdev);
> > + if (btp->bt_daxdev)
> > + dax_unregister_holder(btp->bt_daxdev, btp->bt_mount);
> > fs_put_dax(btp->bt_daxdev);
> >
> > kmem_free(btp);
> > @@ -1939,6 +1942,7 @@ xfs_alloc_buftarg(
> > struct block_device *bdev)
> > {
> > xfs_buftarg_t *btp;
> > + int error;
> >
> > btp = kmem_zalloc(sizeof(*btp), KM_NOFS);
> >
> > @@ -1946,6 +1950,14 @@ xfs_alloc_buftarg(
> > btp->bt_dev = bdev->bd_dev;
> > btp->bt_bdev = bdev;
> > btp->bt_daxdev = fs_dax_get_by_bdev(bdev, &btp->bt_dax_part_off);
> > + if (btp->bt_daxdev) {
> > + error = dax_register_holder(btp->bt_daxdev, mp,
> > + &xfs_dax_holder_operations);
> > + if (error) {
> > + xfs_err(mp, "DAX device already in use?!");
> > + goto error_free;
> > + }
> > + }
>
> It seems to me that just passing the holder and holder ops to
> fs_dax_get_by_bdev and the holder to dax_unregister_holder would
> significantly simplify the interface here.
>
> Dan, what do you think?

Yes, makes sense, just like the optional holder arguments to blkdev_get_by_*().
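
To make the shape of that suggestion concrete, here is a rough userspace
mock, not kernel code. The struct layouts, the "already claimed" check,
and the choice to return NULL on a conflicting holder are all assumptions
made for illustration. Only the idea of passing the holder and holder ops
to fs_dax_get_by_bdev(), and the holder to the release side, comes from
the discussion above.

```c
/*
 * Userspace mock for illustration only. All types and function bodies
 * are stand-ins, not the kernel's definitions.
 */
#include <stddef.h>
#include <stdint.h>

struct dax_holder_operations {
	int (*notify_failure)(void *holder, uint64_t offset, uint64_t len);
};

struct dax_device {
	void *holder;					/* NULL when unclaimed */
	const struct dax_holder_operations *ops;
};

struct block_device {
	struct dax_device *dax;				/* mock: NULL when no DAX */
	uint64_t part_off;				/* mock: partition offset */
};

/* Look up the dax device and, if a holder is passed, claim it in one step. */
static struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev,
		uint64_t *start_off, void *holder,
		const struct dax_holder_operations *ops)
{
	struct dax_device *dax_dev = bdev->dax;

	if (!dax_dev)
		return NULL;
	if (holder) {
		if (dax_dev->holder)
			return NULL;			/* assumption: refuse a second holder */
		dax_dev->holder = holder;
		dax_dev->ops = ops;
	}
	*start_off = bdev->part_off;
	return dax_dev;
}

/* Drop the claim only if the caller is the current holder. */
static void fs_put_dax(struct dax_device *dax_dev, void *holder)
{
	if (dax_dev && holder && dax_dev->holder == holder) {
		dax_dev->holder = NULL;
		dax_dev->ops = NULL;
	}
}
```

With something along these lines, xfs_alloc_buftarg() would need no
separate register call or error path, and xfs_free_buftarg() would only
pass its holder to the put side, mirroring how the holder is handed to
blkdev_get_by_*().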

>
> > +#if IS_ENABLED(CONFIG_MEMORY_FAILURE) && IS_ENABLED(CONFIG_FS_DAX)
>
> No real need for the IS_ENABLED. Also any reason to even build this
> file if the options are not set? It seems like
> xfs_dax_holder_operations should just be defined to NULL and the
> whole file not supported if we can't support the functionality.
>
> Dan: not for this series, but is there any reason not to require
> MEMORY_FAILURE for DAX to start with?

Given that DAX ties some storage semantics to memory, and storage
supports EIO, I can see an argument for requiring memory_failure() for
DAX. For DAX on CXL, where hotplug is supported, it will be necessary.
Linux currently has no facility to consult PCI drivers about removal
actions, so the only recourse for a force-removed CXL device is mass
memory_failure().

>
> > +
> > + ddev_start = mp->m_ddev_targp->bt_dax_part_off;
> > + ddev_end = ddev_start +
> > + (mp->m_ddev_targp->bt_bdev->bd_nr_sectors << SECTOR_SHIFT) - 1;
>
> This should use bdev_nr_bytes.
>
> But didn't we say we don't want to support notifications on partitioned
> devices and thus don't actually need all this?

Right.
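
For reference, if the range calculation were kept, the bdev_nr_bytes
form would look roughly like the userspace sketch below. The
block_device mock, the byte_range struct and the ddev_range() helper are
hypothetical names for illustration. bdev_nr_bytes() here is a stand-in
that mimics the kernel helper by shifting the sector count by
SECTOR_SHIFT.

```c
/* Userspace mock for illustration only, not kernel code. */
#include <stdint.h>

#define SECTOR_SHIFT 9

struct block_device {
	uint64_t bd_nr_sectors;		/* mock: device size in 512-byte sectors */
};

/* Stand-in for the kernel helper: device size in bytes. */
static inline int64_t bdev_nr_bytes(const struct block_device *bdev)
{
	return (int64_t)bdev->bd_nr_sectors << SECTOR_SHIFT;
}

struct byte_range {
	uint64_t start;			/* first byte covered */
	uint64_t end;			/* last byte covered, inclusive */
};

/* Hypothetical helper: inclusive byte range of the data device. */
static struct byte_range ddev_range(const struct block_device *bdev,
				    uint64_t part_off)
{
	struct byte_range r = {
		.start = part_off,
		.end = part_off + (uint64_t)bdev_nr_bytes(bdev) - 1,
	};
	return r;
}
```

If notifications on partitioned devices are not supported, part_off is
always zero and the range reduces to [0, bdev_nr_bytes() - 1].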