Re: [RFC] Page table sharing

From: Daniel Phillips (phillips@bonn-fries.net)
Date: Mon Feb 18 2002 - 21:55:48 EST


On February 19, 2002 03:35 am, Linus Torvalds wrote:
> On Tue, 19 Feb 2002, Daniel Phillips wrote:
> > >
> > > Which implies that the swapper needs to look up all mm's some way anyway,
> >
> > Ick. With rmap this is straightforward, but without, what?
>
> It is not at ALL straightforward with rmap either.
>
> Remember: one of the big original _points_ of the pmd sharing was to avoid
> having to do the rmap overhead for shared page tables. The fact that it
> works without rmap too was just a nice bonus, and makes apples-to-apples
> comparisons possible.
>
> So if you do the rmap overhead even when sharing, you're toast. No more
> shared pmd's.
>
> > Maybe page tables should be unshared on swapin/out after all, only on arches
> > that need special tlb treatment, or until we have rmap.
>
> There is no "or until we have rmap". It doesn't help. All the same issues
> hold - if you have to invalidate multiple mm's, you have to find them all.
> That's the same whether you have rmap or not, and is a fundamental issue
> with sharing pmd's.
>
> Dang, I should have noticed before this.
>
> Note that "swapin" is certainly not the problem - we don't need to swap
> the thing into all mm's at the same time, so if a unshare happens just
> before/after the swapin and the unshared process doesn't get the thing,
> we're still perfectly fine.
>
> In fact, swapin is not even a spacial case. It's just the same as any
> other page fault - we can continue to share page tables over "read-only"
> page faults, and even that is _purely_ an optimization (yeah, it needs
> some trivial "cmpxchg()" magic on the pmd to work, but it has no TLB
> invalidation issues or anything really complex like that).
>
> The only problem is swapout. And "swapout()" is always a problem, in fact.
> It's always been special, because it is quite fundamentally the only VM
> operation that ever is "nonlocal". We've had tons of races with swapout
> over time, it's always been the nastiest VM operation by _far_ when it
> comes to page table coherency.
>
> We can, of course, introduce a "pmd-rmap" thing, with a pointer to a
> circular list of all mm's using that pmd inside the "struct page *" of the
> pmd.

Yes, exactly my thought.

> Right now the rmap patches just make the pointer point directly to
> the one exclusive mm that holds the pmd, right?

Correct.

> (This could be a good "gradual introduction to some of the rmap data
> structures" thing too).

Yup.

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Feb 23 2002 - 21:00:17 EST