Re: [PATCH] mm: list_lru: fix almost infinite loop causing effective livelock

From: Linus Torvalds
Date: Wed Oct 30 2013 - 15:49:59 EST


On Wed, Oct 30, 2013 at 7:16 AM, Russell King - ARM Linux
<linux@xxxxxxxxxxxxxxxx> wrote:
>
> So, if *nr_to_walk was zero when this function was entered, that means
> we're wanting to operate on (~0UL)+1 objects - which might as well be
> infinite.
>
> Clearly this is not correct behaviour. If we think about the behaviour
> of this function when *nr_to_walk is 1, then clearly it's wrong - we
> decrement first and then test for zero - which results in us doing
> nothing at all. A post-decrement would give the desired behaviour -
> we'd try to walk one object and one object only if *nr_to_walk were
> one.
>
> It also gives the correct behaviour for zero - we exit at this point.

Good analysis.

HOWEVER.

I actually think even your version is very dangerous, because we pass
in the *address* to that count, and the only real reason to do that is
because we might call it in a loop, and we want the function to update
that count.

And even your version still underflows from 0 to really-large-count.
It *returns* when underflow happens, but you end up with the counter
updated to a large value, and then anybody who uses it later would be
screwed.

See, for example, the inline list_lru_walk() function in <linux/list_lru.h>.
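
To make the underflow concrete, here is a rough userspace sketch, not the actual mm/list_lru.c code. The function names walk_predec() and walk_postdec() and the idea of modelling the list as a plain item count are made up for illustration. It only shows how the two loop tests leave the counter behind:

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-ins for the loop test. The "list" is just a count
 * of items; none of this is the actual mm/list_lru.c code.
 */

/* Decrement first, then test: a budget of 0 wraps and keeps going. */
unsigned long walk_predec(unsigned long *nr_to_walk, unsigned long nr_items)
{
	unsigned long walked = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		if (--(*nr_to_walk) == 0)
			break;
		walked++;
	}
	return walked;
}

/* Test first, then decrement: stops at 0, but still wraps the counter. */
unsigned long walk_postdec(unsigned long *nr_to_walk, unsigned long nr_items)
{
	unsigned long walked = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		if ((*nr_to_walk)-- == 0)
			break;
		walked++;
	}
	return walked;
}

/* Self-check of the behaviour described above. */
void check_underflow(void)
{
	unsigned long nr;

	nr = 0;
	assert(walk_predec(&nr, 5) == 5);	/* walks everything it is given */
	assert(nr == ULONG_MAX - 4);

	nr = 1;
	assert(walk_postdec(&nr, 5) == 1);	/* walks exactly one item... */
	assert(nr == ULONG_MAX);		/* ...and hands back a huge budget */
}
```

In this model the post-decrement form does return at the right time, but a caller that looks at the counter afterwards sees ULONG_MAX rather than 0.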

So I think we should either change that "unsigned long" to just
"long", and then check for "<= 0" (like list_lru_walk() already does),
or we should do

        if (!*nr_to_walk)
                break;
        --*nr_to_walk;

to make sure that we never do that underflow.
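
As a sketch of why that matters for a caller that shares one count across several calls: the names walk_node() and walk_all() below are hypothetical, and each "node" is modelled as a bare item count rather than a real list, so treat this as an illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: each "node" is an item count, not a real LRU list. */
unsigned long walk_node(unsigned long nr_items, unsigned long *nr_to_walk)
{
	unsigned long walked = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		/* Check before decrementing, so the budget never wraps. */
		if (!*nr_to_walk)
			break;
		--*nr_to_walk;
		walked++;
	}
	return walked;
}

/* Caller that shares one budget across several nodes via the pointer. */
unsigned long walk_all(const unsigned long *nodes, size_t nr_nodes,
		       unsigned long nr_to_walk)
{
	unsigned long total = 0;

	for (size_t n = 0; n < nr_nodes; n++) {
		total += walk_node(nodes[n], &nr_to_walk);
		if (nr_to_walk == 0)
			break;
	}
	return total;
}

/* Self-check: the budget ends at exactly 0 and is never exceeded. */
void check_guarded(void)
{
	unsigned long nr = 1;

	assert(walk_node(5, &nr) == 1);
	assert(nr == 0);
}
```

With three nodes of two items each and a budget of 3, walk_all() visits three items in total, because the counter stops at 0 and the caller's test sees it.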

I will modify your patch to do the latter, since it's the smaller
change, but I suspect we should think about making that thing signed.
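
For comparison, a signed counter might look roughly like the sketch below. walk_signed() is a hypothetical name and the list is again just an item count. The point is only that overshooting to -1 is harmless when every test is "<= 0":

```c
#include <assert.h>

/* Hypothetical signed variant; "nr_items" stands in for the list length. */
unsigned long walk_signed(unsigned long nr_items, long *nr_to_walk)
{
	unsigned long walked = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		/* A budget that is 0 or already negative stops the walk. */
		if ((*nr_to_walk)-- <= 0)
			break;
		walked++;
	}
	return walked;
}

/* Self-check: going one below zero does not turn into a huge budget. */
void check_signed(void)
{
	long nr = 1;

	assert(walk_signed(5, &nr) == 1);
	assert(nr == -1);			/* overshoots by one, still "<= 0" */
	assert(walk_signed(5, &nr) == 0);	/* a later call does nothing */
	assert(nr == -2);
}
```

Here a sloppy decrement leaves a small negative number instead of a really-large-count, so a later "<= 0" check in the caller still does the right thing.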

Hmm?

Linus