Re: [PATCH v2 4/7] dm: prevent DAX mounts if not supported

From: Mike Snitzer
Date: Fri Jun 01 2018 - 17:55:22 EST


On Tue, May 29 2018 at 3:51pm -0400,
Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> wrote:

> Currently the code in dm_dax_direct_access() only checks whether the target
> type has a direct_access() operation defined, not whether the underlying
> block devices all support DAX. This latter property can be seen by looking
> at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
> DM device.

Wait... I thought DAX support was all or nothing?

> This is problematic if we have, for example, a dm-linear device made up of
> a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
> QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
> we have a working direct_access() entry point and the first member of the
> dm-linear set *does* support DAX.

If you don't have a uniformly capable device then it is very dangerous
to advertise that the entire device has a certain capability. That
completely bit me in the past with discard (because for every IO I
wasn't then checking if the destination device supported discards).
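
The "uniformly capable" rule above can be illustrated with a minimal user-space sketch. It is not kernel code: `struct member_dev`, its `supports_dax` field and `stacked_dev_supports_dax()` are hypothetical names invented for this illustration, and the real DM table logic may differ. The only point shown is that a stacked device advertises a capability only when every member has it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one underlying block device (not a kernel type). */
struct member_dev {
	const char *name;
	bool supports_dax;
};

/*
 * Return true only if every member supports DAX.
 * An empty set advertises nothing.
 */
static bool stacked_dev_supports_dax(const struct member_dev *devs, size_t n)
{
	size_t i;

	if (n == 0)
		return false;

	for (i = 0; i < n; i++) {
		if (!devs[i].supports_dax)
			return false;
	}
	return true;
}
```

With a PMEM namespace followed by a BRD ramdisk, the helper returns false, which corresponds to the stacked device not advertising the capability at all.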

It is all well and good that you're adding that check here. But what I
don't like is how you're saying QUEUE_FLAG_DAX implies a direct_access()
operation exists... yet for raw PMEM namespaces we just discussed how
that is a lie.

So this type of change showcases how QUEUE_FLAG_DAX doesn't _really_
imply direct_access() exists.

> This allows the user to create a filesystem on the dm-linear device, and
> then mount it with DAX. The filesystem's bdev_dax_supported() test will
> pass because it'll operate on the first member of the dm-linear device,
> which happens to be a fsdax PMEM namespace.
>
> All DAX I/O will then fail to that dm-linear device because the lack of
> QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working. This means that
> the struct dax_device isn't ever set in the filesystem, so
> dax_direct_access() will always return -EOPNOTSUPP.

Now you've lost me... these past 2 paragraphs. Why can a user mount it
in DAX mode? Because bdev_dax_supported() only accesses the first
portion (which happens to have DAX capabilities)?

Isn't this exactly why you should be checking for QUEUE_FLAG_DAX in the
caller (bdev_dax_supported)? Why not use bdev_get_queue() and verify
QUEUE_FLAG_DAX is set in there?
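
A minimal user-space model of that suggestion follows. Everything in it is a hypothetical stand-in: `struct model_queue`, `struct model_bdev`, `MODEL_QUEUE_FLAG_DAX` and the `model_*` helpers are invented for illustration and are not the kernel's definitions of bdev_get_queue() or bdev_dax_supported(). It only shows the ordering being proposed: check the queue flag first, and probe the device only if the flag is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; the kernel's QUEUE_FLAG_DAX is defined differently. */
#define MODEL_QUEUE_FLAG_DAX (1UL << 0)

/* Hypothetical stand-ins for a request queue and a block device. */
struct model_queue {
	unsigned long flags;
};

struct model_bdev {
	struct model_queue *queue;
	bool first_member_dax;	/* what a probe of the first member would see */
};

static bool model_queue_dax(const struct model_queue *q)
{
	return q != NULL && (q->flags & MODEL_QUEUE_FLAG_DAX) != 0;
}

/* Gate on the queue flag before trusting any probe of the device itself. */
static bool model_bdev_dax_supported(const struct model_bdev *bdev)
{
	if (!model_queue_dax(bdev->queue))
		return false;

	return bdev->first_member_dax;
}
```

In this model, a dm-linear device built from a PMEM namespace plus a ramdisk has the flag clear, so the check fails at mount time even though a probe of the first member alone would succeed.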

> By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
> the filesystem know we don't support DAX at mount time. The filesystem
> will then silently fall back and remove the dax mount option, causing it to
> work properly.

This shouldn't be needed. Again, QUEUE_FLAG_DAX wasn't set... so don't
allow code to falsely try operations that should've been gated by the
fact it wasn't set.

So Nack on this patch... until/unless I'm corrected ;)

Thanks,
Mike


> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
> ---
> drivers/md/dm.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 0a7b0107ca78..9728433362d1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
>
>  	if (!ti)
>  		goto out;
> -	if (!ti->type->direct_access)
> +	if (!blk_queue_dax(md->queue))
>  		goto out;
>  	len = max_io_len(sector, ti) / PAGE_SECTORS;
>  	if (len < 1)
>  		goto out;
>  	nr_pages = min(len, nr_pages);
> -	if (ti->type->direct_access)
> -		ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
> +	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
>
> out:
>  	dm_put_live_table(md, srcu_idx);
> --
> 2.14.3
>