Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

From: Christoph Hellwig
Date: Sat Jan 21 2017 - 12:52:40 EST


On Sat, Jan 21, 2017 at 04:28:52PM +0000, Matthew Wilcox wrote:
> Of course, there may not be a backing device either!

s/backing device/block device/ ? If so, fully agreed. I like the dax_ops
scheme, but we should go all the way and detangle it from the block
device. I already brought up this issue in the "fallback to direct I/O
on I/O error" series.

> I see two possible routes here:
>
> 1. Add a new address_space_operation:
>
> const struct dax_operations *(*get_dax_ops)(struct address_space *);
>
> 2. Add two of the dax_operations to address_space_operations:
>
> size_t (*copy_from_iter)(struct address_space *, void *, size_t, struct iov_iter *);
> void (*flush)(struct address_space *, void *, size_t);
> (we won't need ->direct_access as an address_space op because that'll be handled a different way in the brave new world that supports non-bdev-based filesystems)

And both of them are wrong. The write_begin/write_end mistake
notwithstanding, address_space ops are operations the VM can call without
knowing things like fs locking contexts. The above, on the other hand,
are device operations provided by the low-level driver, similar to
block_device operations. So what we need is to have a way to mount
a dax device as a file system, similar to how we support that for block
or MTD devices and can then call methods on it. For now this will
be a bit complicated because all current DAX-aware file systems also
still need a block device for the metadata path, so we can't just say
you mount either a DAX or block device. But I think we should aim
for mounting a DAX device as the primary use case, and then deal
with block device emulation as a generic DAX layer thing, similar to
how we implement (badly in the rw case) block devices on top of MTD.
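
To make the layering concrete, here is a rough userspace sketch of what
"mount a DAX device and call methods on it" could look like. It is only
a model under stated assumptions, not kernel code: every name in it
(dax_dev, dax_dev_ops, copy_in, toy_sb, toy_mount, toy_write) is
hypothetical, and copy_in takes a plain buffer where the real operation
would take a struct iov_iter. The point is that the ops table hangs off
the device and is filled in by the low-level driver, while the file
system only keeps a reference to the device it was mounted on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct dax_dev;

/* Device operations, provided by the low-level driver (e.g. pmem). */
struct dax_dev_ops {
	size_t (*copy_in)(struct dax_dev *dev, size_t off,
			  const void *src, size_t len);
	void (*flush)(struct dax_dev *dev, size_t off, size_t len);
};

struct dax_dev {
	const struct dax_dev_ops *ops;
	unsigned char mem[64];	/* stands in for persistent memory */
	size_t flushed_bytes;	/* bookkeeping for the demo only */
};

/* Driver side: copy into the device, clamped to the device size. */
static size_t pmem_copy_in(struct dax_dev *dev, size_t off,
			   const void *src, size_t len)
{
	if (off >= sizeof(dev->mem))
		return 0;
	if (len > sizeof(dev->mem) - off)
		len = sizeof(dev->mem) - off;
	memcpy(dev->mem + off, src, len);
	return len;
}

/* Driver side: a real driver would do CPU cache maintenance here. */
static void pmem_flush(struct dax_dev *dev, size_t off, size_t len)
{
	(void)off;
	dev->flushed_bytes += len;
}

static const struct dax_dev_ops pmem_ops = {
	.copy_in = pmem_copy_in,
	.flush	 = pmem_flush,
};

/* File system side: the superblock remembers the mounted device. */
struct toy_sb {
	struct dax_dev *dax;
};

static void toy_mount(struct toy_sb *sb, struct dax_dev *dev)
{
	sb->dax = dev;
}

/* The fs calls device methods directly; no address_space op involved. */
static size_t toy_write(struct toy_sb *sb, size_t off,
			const void *buf, size_t len)
{
	struct dax_dev *dev = sb->dax;
	size_t done = dev->ops->copy_in(dev, off, buf, len);

	dev->ops->flush(dev, off, done);
	return done;
}

/* Mount the toy fs on a device and write across the end of it. */
int demo_mount_and_write(void)
{
	struct dax_dev dev = { .ops = &pmem_ops };
	struct toy_sb sb;
	size_t n;

	toy_mount(&sb, &dev);
	n = toy_write(&sb, 60, "abcdefgh", 8);	/* only 4 bytes fit */
	assert(n == 4);
	assert(memcmp(dev.mem + 60, "abcd", 4) == 0);
	assert(dev.flushed_bytes == 4);
	return 0;
}
```

In this model a block device emulation layer would be just another
consumer of dax_dev_ops sitting next to the file system, rather than
something the file system has to go through to reach the device.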