Re: Linux 2.6.25-rc2

From: Linus Torvalds
Date: Tue Feb 19 2008 - 11:21:25 EST




On Tue, 19 Feb 2008, Pekka Enberg wrote:
>
> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> there to make sure we've retrieved c->freelist before c->page but then
> it uses a _compiler barrier_ which doesn't affect the CPU and the
> reads may still be re-ordered... Not sure if that matters here though.

No, no. The comment says that it's purely there to serialize an
*interrupt*, and as such, a compiler-only barrier is sufficient (or the
comment is wrong).

Interrupts are "totally ordered" within a cpu (of course, in theory a CPU
might have speculative work etc reordering, but the CPU also guarantees
that interrupt acts _as_if_ it was exact), so a compiler barrier is
sufficient.

Of course, if we're talking about interrupts on another CPU, that's a
different issue, but the fact is, in that case it's not about interrupts
any more (might as well be other code just running normally on another
CPU), and a barrier doesn't help, it needs real locking.

So that barrier is fine per se. Of course, the whole code (and/or just the
comment!) may be buggered, but any CPU SMP-aware barriers shouldn't be
relevant.

What's much more likely to be an issue is simply the fact that since the
fastpath now accesses the per-cpu freelist without any locking, if there
is *any* sequence what-so-ever that does it from another CPU and assumes
the old locking behaviour, the list will be corrupted. And from a quick
look-through, I certainly cannot guarantee that isn't the case.

There's still a lot of cases that do direct assignments to "c->freelist"
without using a guaranteed atomic sequence. They *should* be safe if it's
guaranteed that
(a) they always run with interrupts disabled
AND
(b) 'c' is _always_ the "current CPU" list

but I can't quickly see that guarantee for either.

I'd happily just revert this thing, but it would be really good to have
confirmation that it seems to matter. But Torsten's partial bisection
seems to say that the quicklist thing went into -mm before the crash even
started.

So:
- it might be something else entirely
- it might still be the local cmpxchg, just Torsten didn't happen to
notice it until later.
- it might still be the local cmpxchg, but something else changed its
patterns to actually make it start triggering.

and in general I don't think we should revert it unless we have stronger
indications that it really is the problem (eg somebody finds the actual
bug, or a reporter can confirm that it goes away when the local cmpxchg
optimization is disabled).

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/