Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

From: Ingo Molnar
Date: Thu May 07 2015 - 13:36:56 EST



* Dan Williams <dan.j.williams@xxxxxxxxx> wrote:

> > Anyway, I did want to say that while I may not be convinced about
> > the approach, I think the patches themselves don't look horrible.
> > I actually like your "__pfn_t". So while I (very obviously) have
> > some doubts about this approach, it may be that the most
> > convincing argument is just in the code.
>
> Ok, I'll keep thinking about this and come back when we have a
> better story about passing mmap'd persistent memory around in
> userspace.

So is there anything fundamentally wrong about creating struct page
backing at mmap() time (and making sure aliased mmaps share struct
page arrays)?

Because if that is done, then the DMA agent won't even know about the
memory being persistent RAM. It's just a regular struct page, that
happens to point to persistent RAM. Same goes for all the high level
VM APIs, futexes, etc. Everything will Just Work.
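
To make "aliased mmaps share struct page arrays" concrete, here is a
minimal user-space sketch, not kernel code. Every name in it
(page_stub, page_array, page_array_get(), page_array_put()) is
hypothetical, and it assumes aliases cover exactly the same pfn range
with nr_pages > 0. The first mapping of a range allocates the array,
later aliases only take a reference:

```c
#include <stdlib.h>

/* Stand-in for a page descriptor; NOT the kernel's struct page. */
struct page_stub {
	unsigned long pfn;
};

/* One refcounted descriptor array per mapped pfn range (hypothetical). */
struct page_array {
	unsigned long start_pfn, nr_pages;
	unsigned int refs;
	struct page_stub *pages;
	struct page_array *next;
};

static struct page_array *arrays;	/* all live arrays */

/* "mmap() time": reuse the array of an aliased range or allocate one. */
static struct page_array *page_array_get(unsigned long start_pfn,
					 unsigned long nr_pages)
{
	struct page_array *pa;
	unsigned long i;

	for (pa = arrays; pa; pa = pa->next) {
		if (pa->start_pfn == start_pfn && pa->nr_pages == nr_pages) {
			pa->refs++;	/* alias: share, do not allocate */
			return pa;
		}
	}

	pa = malloc(sizeof(*pa));
	if (!pa)
		return NULL;
	pa->pages = calloc(nr_pages, sizeof(*pa->pages));
	if (!pa->pages) {
		free(pa);
		return NULL;
	}
	for (i = 0; i < nr_pages; i++)
		pa->pages[i].pfn = start_pfn + i;

	pa->start_pfn = start_pfn;
	pa->nr_pages = nr_pages;
	pa->refs = 1;
	pa->next = arrays;
	arrays = pa;
	return pa;
}

/* "munmap() time": drop a reference, free the array on the last put. */
static void page_array_put(struct page_array *pa)
{
	struct page_array **pp;

	if (--pa->refs)
		return;
	for (pp = &arrays; *pp; pp = &(*pp)->next) {
		if (*pp == pa) {
			*pp = pa->next;
			break;
		}
	}
	free(pa->pages);
	free(pa);
}
```

Calling page_array_get() twice with the same range returns the same
array with two references. Partial overlaps and locking are left out
of the sketch on purpose.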

It will also be relatively fast: mmap() is a slowpath, comparatively
speaking.

As far as RAID is concerned: that's a relatively easy situation, as
there's only a single user of the devices, the RAID context that
manages all component devices exclusively. Device to device DMA can
use the block layer directly, i.e. most of the patches you've got here
in this series, except:

    [PATCH v2 09/10] dax: convert to __pfn_t

I think DAX mmap()s need struct page backing.

I think there's a simple rule: if a page is visible to user-space via
the MMU then it needs struct page backing. If it's "hidden", like
behind a RAID abstraction, it probably doesn't.
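
Written down as a predicate, the rule could look roughly like the
sketch below. This is user-space C for illustration only; the
pmem_mapping structure and both helpers are hypothetical names, not
anything from the patch series:

```c
#include <stdbool.h>

/* Hypothetical description of how a range of persistent memory is used. */
struct pmem_mapping {
	bool user_mmu_visible;	/* mapped into user space, e.g. a DAX mmap() */
	bool has_struct_page;	/* struct page backing exists for the range */
};

/* The rule: memory visible to user space via the MMU needs struct page. */
static bool needs_struct_page(const struct pmem_mapping *m)
{
	return m->user_mmu_visible;
}

/* A mapping obeys the rule if it has struct page whenever it needs it. */
static bool mapping_obeys_rule(const struct pmem_mapping *m)
{
	return !needs_struct_page(m) || m->has_struct_page;
}
```

A range hidden behind a RAID driver passes the check without struct
page backing, while a user-visible mapping without struct page does
not.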

With the remaining patches a high level RAID driver ought to be able
to send pfn-to-sector and sector-to-pfn requests to other block
drivers, without any unnecessary struct page allocation overhead,
right?

As long as the pfn concept remains a clever way to reuse our
ram<->sector interfaces to implement sector<->sector IO, in the cases
where the IO has no serialization or MMU concerns, not using struct
page and using __pfn_t looks natural.

The moment it starts reaching user space APIs, like in the DAX case,
and especially if it becomes user-MMU visible, it's a mistake to not
have struct page backing, I think.

(In that sense the current DAX mmap() code is already a partial
mistake.)

Thanks,

Ingo