Re: [PATCH 00/28] Swap over NFS -v16

From: Peter Zijlstra
Date: Fri Feb 29 2008 - 06:52:23 EST



On Wed, 2008-02-27 at 10:05 +0200, Pekka Enberg wrote:
> Hi Peter,
>
> On Wed, Feb 27, 2008 at 9:58 AM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> > > 1/ I note there is no way to tell if memory returned by kmalloc is
> > > from the emergency reserve - which contrasts with alloc_page
> > > which does make that information available through page->reserve.
> > > This seems a slightly unfortunate aspect of the interface.
> >
> > Yes, but alas there is no room to store such information in kmalloc().
> > That is, in a sane way. I think it was Daniel Phillips who suggested
> > encoding it in the return pointer by flipping the low bit - but that is
> > just too ugly and breaks all current kmalloc sites to boot.
>
> Why can't you add a kmem_is_emergency() to SLUB that looks up the
> cache/slab/page (whatever is the smallest unit of the emergency pool
> here) for the object and use that?

I made page->reserve into PG_emergency and made that bit stick for the
lifetime of that page allocation. I then made kmem_is_emergency() look
up the head page backing that allocation's slab and return
PageEmergency().

This gives a consistent kmem_is_emergency() - that is if during the
lifetime of the kmem allocation it returns true once, it must return
true always.

You can then, using this properly, push the accounting into
kmalloc_reserve() and kfree_reserve() (and
kmem_cache_{alloc,free}_reserve).

Which yields very pretty code all round. (can make public if you like to
see..)

However...

This is a stricter model than I had before, and has one ramification I'm
not entirely sure I like.

It means the page remains a reserve page throughout its lifetime, which
means the slab remains a reserve slab throughout its lifetime. Therefore
it may never be used for !reserve allocations. Which in turn generates
complexities for the partial list.

In my previous model I had the reserve accounting external to the
allocation, which relaxed the strict need for consistency here, and I
dropped the reserve status once we were above the page limits again.

I managed to complicate the SLUB patch with this extra constraint, by
checking reserve against PageEmergency() when scanning the partial list,
but gave up on SLAB.

Does this sound like something I should pursuit? I feel it might
complicate the slab allocators too much..

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/