Re: struct page field arrangement

From: Hugh Dickins
Date: Wed Feb 28 2007 - 16:08:33 EST


On Wed, 28 Feb 2007, Jan Beulich wrote:

> A change early last year reordered struct page so that ptl overlaps not only
> private, but also mapping. Since spinlock_t can be much larger, I'm wondering
> whether there's a reason to not also overlay the space index and lru take -
> are these used for anything on page table pages?

Overlaying lru is a problem for those architectures which use
kmem_cache_alloc for their pagetables: arm26, powerpc, sparc64 and
perhaps others (I just grepped quickly through include/asm*, and didn't
follow up those which have extern functions), since slab reuses the
lru fields for its own purposes. Could perhaps be stacked somehow.

Overlaying index is fairly straightforward: the index field is fair
game. In my original patches I did overlay index, but Andrew was
strongly averse to the way I was doing it, and scaled things back,
to private alone if I remember rightly, then relaxed a little to
include mapping too. Way back then I made up a patch to overlay
index too (when I saw Fedora going out with CONFIG_DEBUG_SPINLOCK),
but I could never get it into a form where I felt it would satisfy
Andrew; and grew increasingly dissatisfied with that approach myself.

I don't think further overlaying is the right answer really.
But I do think it's a scandal that the size of struct page (in a
DEBUG_SPINLOCK system) is governed by such a minority use of the
struct page. Lacking a satisfying answer, I've just let it drift
on until someone notices and complains.
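
To make the size problem concrete, here is a rough userspace sketch, not
the kernel's actual definitions. The struct names, the field list and both
lock layouts are simplified stand-ins invented for illustration; the only
point is that once the lock in the union outgrows the fields it overlays
(private and mapping), every page descriptor grows with it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two spinlock_t shapes (assumption:
 * the debug variant carries extra bookkeeping words and a pointer). */
struct plain_lock { unsigned int slock; };
struct debug_lock {
    unsigned int slock;
    unsigned int magic, owner_cpu;
    void *owner;
};

struct list_pair { void *next, *prev; };

/* Simplified page descriptor: ptl overlays private and mapping only. */
struct page_plain {
    unsigned long flags;
    int count, mapcount;
    union {
        struct { unsigned long private; void *mapping; } s;
        struct plain_lock ptl;      /* fits inside the overlaid fields */
    } u;
    unsigned long index;
    struct list_pair lru;
};

/* Same layout, but built with the larger debugging lock. */
struct page_debug {
    unsigned long flags;
    int count, mapcount;
    union {
        struct { unsigned long private; void *mapping; } s;
        struct debug_lock ptl;      /* larger than private + mapping */
    } u;
    unsigned long index;
    struct list_pair lru;
};

/* Extra bytes per page descriptor caused by the debug lock. */
size_t page_growth(void)
{
    assert(sizeof(struct page_debug) >= sizeof(struct page_plain));
    return sizeof(struct page_debug) - sizeof(struct page_plain);
}
```

In this model page_growth() is paid by every page in the system, although
only pagetable pages ever use ptl.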

kmalloc a separate spinlock structure when it's too big to fit in?
Not such a good idea, since then there will tend to be false sharing
of cachelines between them: simpler just to disable SPLIT_PTLOCK in
that case.
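
The false sharing concern can be illustrated with a little arithmetic. This
is only a sketch: the 64-byte line size and the helper names are assumptions
for illustration, and it says nothing about how kmalloc actually places
objects. It just shows that locks smaller than a cacheline, if packed back
to back, end up sharing a line.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size; real hardware varies. */
#define ASSUMED_LINE_SIZE 64u

/* Nonzero if the two addresses fall in the same cacheline. */
int same_cacheline(uintptr_t a, uintptr_t b)
{
    return a / ASSUMED_LINE_SIZE == b / ASSUMED_LINE_SIZE;
}

/* How many whole locks of the given size fit in one cacheline. */
size_t locks_per_line(size_t lock_size)
{
    return lock_size ? ASSUMED_LINE_SIZE / lock_size : 0;
}
```

With a hypothetical 24-byte lock, two locks allocated at offsets 0 and 24
share a line, so CPUs taking unrelated pagetable locks would still bounce
the same cacheline between them.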

I'm not happy with the status quo, but I don't know the right answer:
perhaps allow pagetable pages to use an undebugged spinlock variant?

Hugh