Re: [patch 3/3] mm: fault handler to replace nopage and populate

From: Thomas Hellstrom
Date: Mon Oct 09 2006 - 09:39:02 EST


Nick Piggin wrote:
> On Mon, Oct 09, 2006 at 10:07:50PM +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2006-10-09 at 13:58 +0200, Nick Piggin wrote:
>>>
>>> The VM won't see that you have struct pages backing the ptes, and won't
>>> do the right refcounting or rmap stuff... But for file backed mappings,
>>> all the critical rmap stuff should be set up at mmap time, so you might
>>> have another option to simply always do the nopfn thing, as far as the
>>> VM is concerned (ie. even when you do have a struct page)
>>
>> Any reason why it wouldn't work to flip that bit on the first no_page()
>> after a migration ? A migration always involves destroying all PTEs and
>> is done with a per-object mutex held that no_page() takes too, so we can
>> be pretty sure that the first nopage can set that bit before any PTE is
>> actually inserted in the mapping after all the previous ones have been
>> invalidated... That would avoid having to walk the vma's.
>
> Ok I guess that would work. I was kind of thinking that one needs to
> hold the mmap_sem for writing when changing the flags, but so long
> as everyone *else* does, then I guess you can get exclusion from just
> the read lock. And your per-object mutex would prevent concurrent
> nopages from modifying it.

Wouldn't that confuse concurrent readers of the vma flags?

Could it be an option to make it safe for the fault handler to
temporarily drop the mmap_sem read lock, given that some conditions
(TBD) are met? In that case it could retake mmap_sem for writing, do
the VMA flags modifications, downgrade to the read lock and do the pte
modifications using a helper, or even use remap_pfn_range() while the
write lock is still held.

/Thomas





