Light-weight dynamically extended stacks

From: Matt Mackall
Date: Sun Dec 18 2005 - 19:19:06 EST


Perhaps the time for this has come and gone, but it occurred to me
that it should be relatively straightforward to make a form of
dynamically extended stacks that are appropriate to the kernel.

While we have a good handle on most of the worst stack offenders, we
can still run into trouble with pathological cases (say, symlink
recursion for XFS on a RAID built from loopback mounts over NFS
tunneled over IPSEC through GRE). So there's probably no
one-size-fits-all when it comes to stack size.

Rather than relying on guard pages and VM faults like userspace, we
can use a cooperative scheme where we "label" call paths that might be
extra deep (recursion through the block layer, network tunnels,
symlinks, etc.) with something like the following:

ret = grow_stack(function, arg, GFP_ATOMIC);

This is much like cond_resched() except for stack usage rather than
CPU usage. grow_stack() checks if we're in the danger zone for stack
usage (say 1k remaining), and if so, allocates a new stack and
swizzles the stack pointer over to it.

Then, whether we allocated a new stack page or not, we call
function(arg) to continue with our operation. When function() returns,
we deallocate the new stack (if we built one), switch back to the old
one, and propagate function's return value.

We only get into trouble with this scheme when we can't allocate a new
stack, which will only happen when we're completely out of memory[1]
and we can't sleep waiting for more. In which case, we print a warning
of impending doom and proceed to run with our current stack. This is
the same as our current behavior but with a warning message. For
safety, we can keep a small mempool of extra stacks on hand to avoid
hitting this wall when dealing with OOM in an atomic context.

We can also easily instrument the scheme to print warnings when a
process has allocated more than a couple stacks, with a hard limit to
catch any unbounded recursion.

[1] Assuming we're using 4k stacks, where fragmentation is not an
issue. But there's no reason not to use single-page stacks with this
scheme.
--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/