Re: [PATCH 03/29] mm: slub: add knowledge of reserve pages

From: Daniel Phillips
Date: Sat Dec 15 2007 - 05:11:19 EST


On Friday 14 December 2007 14:51, I wrote:
> On Friday 14 December 2007 07:39, Peter Zijlstra wrote:
> Note that false sharing of slab pages is still possible between two
> unrelated writeout processes, both of which obey rules for their own
> writeout path, but the pinned combination does not. This still
> leaves a hole through which a deadlock may slip.

Actually, no it doesn't. It in fact does not matter how many unrelated
writeout processes, block devices, whatevers share a slab cache.
Sufficient reserve pages must have been made available (in a perfect
world, by adding extra pages to the memalloc reserve on driver
initialization, in the real world just by having a big memalloc
reserve) to populate the slab up to the sum of the required objects for
all memalloc users sharing the cache.
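The sizing rule above can be sketched as a small user-space calculation. This is only an illustration, not kernel code: struct memalloc_user, reserve_pages_needed() and objs_per_page are hypothetical names, and the sketch assumes the allocator fills partially used reserve slabs before pulling in a new page.

```c
#include <stddef.h>

/* Hypothetical description of one memalloc user of a shared slab cache. */
struct memalloc_user {
	const char *name;	/* illustrative label only */
	size_t max_objects;	/* worst-case objects this user holds at once */
};

/*
 * Worst-case number of slab pages needed when every user sharing the
 * cache reaches its bound at the same time.
 */
static size_t reserve_pages_needed(const struct memalloc_user *users,
				   size_t nr_users, size_t objs_per_page)
{
	size_t total = 0;

	for (size_t i = 0; i < nr_users; i++)
		total += users[i].max_objects;

	/* Round up: a partially used slab page still costs a whole page. */
	return (total + objs_per_page - 1) / objs_per_page;
}
```

With two users bounded at 10 and 7 objects and 8 objects per slab page, the sum is 17 objects, so three pages cover the shared cache no matter how the objects interleave.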

So I think this slab technique of yours is fundamentally sound, that is
to say, adding a new per-slab flag to keep unbounded numbers of slab
objects with unbounded lifetimes from mixing with the bounded number of
slab objects with bounded lifetimes.

Ponder. OK, here is another issue. Suppose a driver expands the
memalloc reserve by the X number of pages it needs on initialization,
and shrinks it by the same amount on removal, as is right and proper.
The problem is that fewer slab pages may be freed (or made freeable)
back to the global reserve than were pulled into slab on behalf of the
removed driver, due to page sharing with an unrelated user.
In theory, the global reserve could be completely depleted by this slab
fragmentation.
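To make the shortfall concrete, here is a toy user-space model. The struct and function names are hypothetical and nothing here is kernel API. It counts how many slab pages the removed driver occupied and how many of those actually become empty once its objects are freed.

```c
#include <stddef.h>

/* Hypothetical model: live object counts on one slab page. */
struct slab_page_model {
	unsigned int objs_removed_user;	/* objects of the driver being removed */
	unsigned int objs_other_user;	/* objects of an unrelated memalloc user */
};

/* Pages on which the removed driver held at least one object. */
static size_t pages_touched_by_removed(const struct slab_page_model *pages,
					size_t nr_pages)
{
	size_t count = 0;

	for (size_t i = 0; i < nr_pages; i++)
		if (pages[i].objs_removed_user > 0)
			count++;
	return count;
}

/* Of those pages, the ones that are empty after the driver's objects go. */
static size_t pages_freeable_after_removal(const struct slab_page_model *pages,
					    size_t nr_pages)
{
	size_t count = 0;

	for (size_t i = 0; i < nr_pages; i++)
		if (pages[i].objs_removed_user > 0 &&
		    pages[i].objs_other_user == 0)
			count++;
	return count;
}
```

If the driver shrinks the reserve by the first number while only the second number of pages comes back, the difference is what leaks out of the global reserve on each removal.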

OK, that is like the case that I mistakenly raised in the previous mail,
though far less likely to occur, because driver removals are relatively
rare, and so would be a fragmentation case severe enough to cause global
reserve depletion.

Even so, if this possibility bothers anybody, it is fairly easy to plug
the hole: associate each slab with a given memalloc user instead of
just having one bit to classify users. So unrelated memalloc users
would never share a slab, no false sharing, everybody happy. The cost:
a new pointer field per slab and a few additional lines of code.
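A rough user-space sketch of the owner-pointer idea follows. All names (struct memalloc_owner, struct slab_model, find_slab()) are made up for illustration and do not correspond to existing slab allocator structures. A NULL owner stands for an ordinary, non-reserve slab.

```c
#include <stddef.h>

/* Hypothetical identity of one memalloc user (e.g. one driver). */
struct memalloc_owner {
	const char *name;
};

/* Hypothetical per-slab bookkeeping with the new owner pointer. */
struct slab_model {
	const struct memalloc_owner *owner;	/* NULL: not a reserve slab */
	unsigned int inuse;			/* objects currently allocated */
	unsigned int capacity;			/* objects the slab can hold */
};

/* A slab may serve a request only if the owner matches and it has room. */
static int slab_usable_by(const struct slab_model *slab,
			  const struct memalloc_owner *who)
{
	return slab->owner == who && slab->inuse < slab->capacity;
}

/* First usable slab for this owner, or NULL if a new page is needed. */
static struct slab_model *find_slab(struct slab_model *slabs, size_t nr_slabs,
				    const struct memalloc_owner *who)
{
	for (size_t i = 0; i < nr_slabs; i++)
		if (slab_usable_by(&slabs[i], who))
			return &slabs[i];
	return NULL;
}
```

Because the comparison is on the owner pointer rather than a single bit, one user's objects never land on another user's slab, so removing a user leaves all of its slabs empty and freeable.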

Regards,

Daniel