RE: [PATCHv3 0/2] mm: map few pages around fault address if they are in page cache

From: Wilcox, Matthew R
Date: Thu Feb 27 2014 - 22:53:28 EST


I think the psbfb case is just horribly broken; they probably want to populate the entire VMA at mmap time rather than fault time. It'll be less code for them.
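
For a simple framebuffer with physically contiguous memory, mmap-time population can be as little as this (the my_fb_* names are invented, and the real psbfb case with its GTT-backed stolen memory has more moving parts); note that ->fault goes away entirely:

#include <linux/fs.h>
#include <linux/mm.h>

/* Invented per-device state; the real driver's is much richer. */
struct my_fb_info {
        unsigned long fb_start;         /* physical base of the framebuffer */
        unsigned long fb_size;          /* length in bytes */
};

static int my_fb_mmap(struct file *file, struct vm_area_struct *vma)
{
        struct my_fb_info *info = file->private_data;
        unsigned long size = vma->vm_end - vma->vm_start;

        /* Refuse mappings that run past the end of the framebuffer. */
        if (vma->vm_pgoff + (size >> PAGE_SHIFT) > (info->fb_size >> PAGE_SHIFT))
                return -EINVAL;

        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

        /*
         * Map the whole range up front instead of installing PTEs one at
         * a time from ->fault; no fault handler is needed at all.
         */
        return remap_pfn_range(vma, vma->vm_start,
                               (info->fb_start >> PAGE_SHIFT) + vma->vm_pgoff,
                               size, vma->vm_page_prot);
}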

ttm is more nuanced, and there are one or two other graphics drivers with similar "fault around" requirements. But all of the ones that try it share a nasty flaw: they can fault in pages other than the one that was actually faulted on, then fail before faulting in the requested one, and return to userspace anyway.

Granted, this is a pretty rare case. You'd have to be incredibly low on memory to fail to allocate a page table page. But it can happen, and shouldn't.
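
The pattern looks roughly like this, paraphrased rather than lifted from any one driver (my_obj_pfn() is invented, and I'm assuming the VMA was set up as VM_MIXEDMAP in ->mmap; ttm limits itself to a small prefault window rather than the whole VMA, but the failure mode is the same):

#include <linux/mm.h>

/* Invented: returns the backing pfn for page 'idx' of the object. */
static unsigned long my_obj_pfn(struct vm_area_struct *vma, pgoff_t idx);

static int my_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        unsigned long nr_pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
        unsigned long addr = vma->vm_start;
        unsigned long i;
        int ret;

        /* Try to map the whole VMA, not just the page that faulted. */
        for (i = 0; i < nr_pages; i++, addr += PAGE_SIZE) {
                ret = vm_insert_mixed(vma, addr, my_obj_pfn(vma, i));
                if (ret == -EBUSY)
                        continue;       /* already mapped, fine */

                /*
                 * The nasty part: if this fails (say a page table
                 * allocation fails under memory pressure), we report an
                 * error for the whole fault, even though the address that
                 * actually faulted may come later in the VMA and never got
                 * mapped at all.
                 */
                if (ret == -ENOMEM)
                        return VM_FAULT_OOM;
                if (ret)
                        return VM_FAULT_SIGBUS;
        }
        return VM_FAULT_NOPAGE;
}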

So I was thinking about a helper that these drivers could use to "fault around" in ->fault; then Kirill pointed me at ->map_pages, and I think that approach could work too.
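
Assuming the interface stays as proposed in this series (->map_pages() is called with the page table locked, gets a [vmf->pgoff, vmf->max_pgoff] window, and may only map pages that are already resident, without blocking or failing), a driver-side handler would look something like this. my_obj_lookup_page() is invented and assumed to return a page with a reference held for the new PTE, or NULL:

/* Invented: returns a resident page with a reference held for the new
 * PTE, or NULL if bringing it in would block (left to ->fault). */
static struct page *my_obj_lookup_page(void *obj, pgoff_t pgoff);

static void my_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        unsigned long addr = (unsigned long)vmf->virtual_address;
        pgoff_t pgoff = vmf->pgoff;
        pte_t *pte = vmf->pte;
        struct page *page;

        for (; pgoff <= vmf->max_pgoff; pgoff++, addr += PAGE_SIZE, pte++) {
                if (!pte_none(*pte))
                        continue;       /* someone else mapped it already */

                page = my_obj_lookup_page(vma->vm_private_data, pgoff);
                if (!page)
                        continue;       /* not resident: leave it to ->fault */

                do_set_pte(vma, addr, page, pte, false, false);
        }
}

static const struct vm_operations_struct my_vm_ops = {
        .fault          = my_fault,     /* existing single-page handler */
        .map_pages      = my_map_pages,
};

The nice property is that ->map_pages() is purely opportunistic: the page that actually faulted is still handled by ->fault afterwards, so failing to map the neighbours can never leak out to userspace.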

________________________________________
From: Kirill A. Shutemov [kirill@xxxxxxxxxxxxx]
Sent: February 27, 2014 4:10 PM
To: Linus Torvalds
Cc: Kirill A. Shutemov; Andrew Morton; Mel Gorman; Rik van Riel; Andi Kleen; Wilcox, Matthew R; Dave Hansen; Alexander Viro; Dave Chinner; Ning Qu; linux-mm; linux-fsdevel; Linux Kernel Mailing List
Subject: Re: [PATCHv3 0/2] mm: map few pages around fault address if they are in page cache

On Thu, Feb 27, 2014 at 01:28:22PM -0800, Linus Torvalds wrote:
> On Thu, Feb 27, 2014 at 11:53 AM, Kirill A. Shutemov
> <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> > Here's new version of faultaround patchset. It took a while to tune it and
> > collect performance data.
>
> Andrew, mind taking this into -mm with my acks? It's based on top of
> Kirill's cleanup patches that I think are also in your tree.
>
> Kirill - no complaints from me. I do have two minor issues that you
> might satisfy, but I think the patch is fine as-is.
>
> The issues/questions are:
>
> (a) could you test this on a couple of different architectures? Even
> if you just have access to intel machines, testing it across a couple
> of generations of microarchitectures would be good. The reason I say
> that is that from my profiles, it *looks* like the page fault costs
> are relatively higher on Ivybridge/Haswell than on some earlier
> uarchs.

These numbers were from Ivy Bridge.
I'll get numbers for Westmere and Haswell as well.

> (b) I suspect we should try to strongly discourage filesystems from
> actually using map_pages unless they use the standard
> filemap_map_pages function as-is. Even with the fairly clean
> interface, and forcing people to use "do_set_pte()", I think the docs
> might want to try to more explicitly discourage people from using this
> to do their own hacks..

We would need ->map_pages() at least for shmem/tmpfs. It should be
beneficial there.
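
For a regular page-cache filesystem the hookup is just pointing ->map_pages at the standard helper; the series already does this for the generic file path, and a filesystem with its own vm_ops would do the same along these lines (myfs_* names are invented):

static const struct vm_operations_struct myfs_file_vm_ops = {
        .fault          = filemap_fault,        /* existing single-page path */
        .map_pages      = filemap_map_pages,    /* new: opportunistic fault-around */
        .page_mkwrite   = filemap_page_mkwrite,
};

static int myfs_file_mmap(struct file *file, struct vm_area_struct *vma)
{
        file_accessed(file);
        vma->vm_ops = &myfs_file_vm_ops;
        return 0;
}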

Also, Matthew noticed that some drivers do ugly hacks like faulting in the
whole VMA on the first page fault. IIUC, it's done for performance reasons.
See psbfb_vm_fault() or ttm_bo_vm_fault().

I thought it could be reasonable to have ->map_pages() there, and to do the
VMA population with get_user_pages() at mmap() time instead.

What do you think?

--
Kirill A. Shutemov