Re: [patch] my latest oom stuff

Linus Torvalds (torvalds@transmeta.com)
Sat, 24 Oct 1998 19:15:50 -0700 (PDT)


Btw, Andrea, if you find the CPU looping busily in "kswapd", could you try
to instrument it a bit more?

The thing is, kswapd shouldn't even _allow_ that kind of endless looping.
It forces itself to sleep at regular intervals by doing

	current->state = TASK_INTERRUPTIBLE;
	schedule();

inside its loop. So even if we're really low on memory, it should always
allow other processes to run for at least a fractional jiffy (after which
the timer tick will wake it up again). Certainly long enough for another
process to notice that it ran out of memory and kill itself.
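
As a rough illustration of that loop shape, here is a user-space model,
not the kernel source. The names model_daemon, model_sleep and
daemon_stats are made up for the sketch, and model_sleep() only counts
where the real loop would set TASK_INTERRUPTIBLE and call schedule().

```c
/* Hypothetical user-space model of a daemon loop that yields regularly. */
struct daemon_stats {
    unsigned long work_units; /* reclaim attempts made */
    unsigned long sleeps;     /* times the loop gave up the CPU */
};

/* Stand-in for "current->state = TASK_INTERRUPTIBLE; schedule();". */
static void model_sleep(struct daemon_stats *st)
{
    st->sleeps++;
}

/* Do at most `tries` units of work per pass, then sleep unconditionally. */
static void model_daemon(struct daemon_stats *st,
                         unsigned long passes, unsigned long tries)
{
    unsigned long pass, i;

    for (pass = 0; pass < passes; pass++) {
        for (i = 0; i < tries; i++)
            st->work_units++;   /* one reclaim attempt (assumed unit) */
        model_sleep(st);        /* other processes get to run here */
    }
}
```

The point of the shape is that the sleep is not conditional on making
progress: every pass ends with one, however the tries went.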

However, it can easily be that the "tries" in between are too large, and
that it ends up using 99.99% of all CPU time due to not sleeping often
enough. The "tries" calculations were done based on an earlier pattern of
invocations, and I suspect "tries" is overlarge.
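
To put a number on "not sleeping often enough", a back-of-the-envelope
sketch follows. It assumes each try costs a fixed time and each sleep
lasts a fixed time; both parameters and the function name busy_fraction
are assumptions for illustration, not measurements of kswapd.

```c
/*
 * Fraction of wall-clock time spent working when the loop does
 * `tries` attempts of `try_us` microseconds each, then sleeps
 * for `sleep_us` microseconds. All inputs are assumed values.
 */
static double busy_fraction(unsigned long tries, double try_us,
                            double sleep_us)
{
    double busy = (double)tries * try_us;
    double total = busy + sleep_us;

    return total > 0.0 ? busy / total : 0.0;
}
```

Under this model the busy time has to be about 9999 times the sleep time
before the loop reaches 99.99% CPU, for instance 9999 tries that each
take as long as one sleep.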

But I suspect that the REAL bug is that there may be code-paths that busy
loop forever if they get NULL from __get_free_pages(). That's bad. We
found and fixed one in the TCP code earlier, and the way to figure them
out is to add a printk() (or a stack trace, in fact) to the NULL return
case in __get_free_pages() and see if you see an endless stream of them
when the machine locks up.
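
The instrumentation idea can be sketched in user space as below. This is
not the kernel change itself: traced_alloc and alloc_with_retry_limit
are hypothetical names, malloc() stands in for __get_free_pages(),
fprintf() stands in for printk(), and the simulate_failure flag exists
only so the NULL path can be exercised.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count of NULL returns seen so far (illustrative global). */
static unsigned long null_returns;

/* Hypothetical wrapper: log every NULL return so a retry storm is visible. */
static void *traced_alloc(size_t size, int simulate_failure)
{
    void *p = simulate_failure ? NULL : malloc(size);

    if (p == NULL) {
        null_returns++;
        fprintf(stderr, "traced_alloc: NULL return #%lu (size %zu)\n",
                null_returns, size);
    }
    return p;
}

/* A caller that gives up after a bounded number of retries
 * instead of looping forever on NULL. */
static void *alloc_with_retry_limit(size_t size, int simulate_failure,
                                    unsigned int max_retries)
{
    unsigned int attempt;

    for (attempt = 0; attempt <= max_retries; attempt++) {
        void *p = traced_alloc(size, simulate_failure);
        if (p != NULL)
            return p;
    }
    return NULL; /* let the caller handle the failure */
}
```

A caller without the retry limit would show up in the log as an endless
stream of NULL-return messages, which is the signature to look for.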

The thing is, I suspect that your patches avoid the problems not by
being strictly correct, but by hiding the above kind of endless loops by
letting processes die before the bad behaviour gets to instantiate itself.
And that also means that the endless loop can still happen, it's just
harder to see.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/