Re: page fault scalability patch V16 [3/4]: Drop page_table_lock inhandle_mm_fault

From: Christoph Lameter
Date: Tue Feb 01 2005 - 14:08:45 EST


On Tue, 1 Feb 2005, Nick Piggin wrote:

> > pte_unmap(page_table);
> > + page_table_atomic_stop(mm);
> >
> > /*
> > * Ok, we need to copy. Oh, well..
> > */
> > if (!PageReserved(old_page))
> > page_cache_get(old_page);
> > - spin_unlock(&mm->page_table_lock);
> >
>
> I don't think you can do this unless you have done something funky that I
> missed. And that kind of shoots down your lockless COW too, although it
> looks like you can safely have the second part of do_wp_page without the
> lock. Basically - your lockless COW patch itself seems like it should be
> OK, but this hunk does not.

See my comment at the end of this message.

> I would be very interested if you are seeing performance gains with your
> lockless COW patches, BTW.

So far I have not had time to focus on benchmarking that.

> Basically, getting a reference on a struct page was the only thing I found
> I wasn't able to do lockless with pte cmpxchg. Because it can race with
> unmapping in rmap.c and reclaim and reuse, which probably isn't too good.
> That means: the only operations you are able to do lockless is when there
> is no backing page (ie. the anonymous unpopulated->populated case).
>
> A per-pte lock is sufficient for this case, of course, which is why the
> pte-locked system is completely free of the page table lock.

Introducing pte locking would allow us to go further with parallelizing
this but its another invasive procedure. I think parallelizing COW is only
possible to do reliable with some pte locking scheme. But then the
question is if the pte locking is really faster than obtaining a spinlock.
I suspect this may not be the case.

> Although I may have some fact fundamentally wrong?

The unmapping in rmap.c would change the pte. This would be discovered
after acquiring the spinlock later in do_wp_page. Which would then lead to
the operation being abandoned.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/