Re: [PATCH 08 of 11] anon-vma-rwsem

From: Nick Piggin
Date: Tue May 13 2008 - 08:13:46 EST


On Thursday 08 May 2008 10:38, Robin Holt wrote:
> On Wed, May 07, 2008 at 02:36:57PM -0700, Linus Torvalds wrote:
> > On Wed, 7 May 2008, Andrea Arcangeli wrote:
> > > I think the spinlock->rwsem conversion is ok under config option, as
> > > you can see I complained myself to various of those patches and I'll
> > > take care they're in a mergeable state the moment I submit them. What
> > > XPMEM requires are different semantics for the methods, and we never
> > > had to do any blocking I/O during vmtruncate before, now we have to.
> >
> > I really suspect we don't really have to, and that it would be better to
> > just fix the code that does that.
>
> That fix is going to be fairly difficult. I will argue impossible.
>
> First, a little background. SGI allows one large numa-link connected
> machine to be broken into separate single-system images which we call
> partitions.
>
> XPMEM allows, at its most extreme, one process on one partition to
> grant access to a portion of its virtual address range to processes on
> another partition. Those processes can then fault pages and directly
> share the memory.
>
> In order to invalidate the remote page table entries, we need to message
> (uses XPC) to the remote side. The remote side needs to acquire the
> importing process's mmap_sem and call zap_page_range(). Between the
> messaging and acquiring a sleeping lock, I would argue this will
> require sleeping locks in the path prior to the mmu_notifier invalidate_*
> callouts().

Why do you need to take mmap_sem in order to shoot down pagetables of
the process? It would be nice if this could just be done without
sleeping.