mm 2.2.17pre6

From: Andrea Arcangeli (andrea@suse.de)
Date: Sun Jun 25 2000 - 19:10:38 EST


Alan, I'm pretty worried about the free_before_allocate and
sync_page_buffers changes we now have in the latest 2.2.17 pre-patches.

I proposed both (sync_page_buffers comes from 2.4.x, btw), but while
they're stable and free_before_allocate is a fix, I believe they're
actually hurting performance. Unfortunately I don't have precise data at
the moment, but I got reports of bad interactivity, after I suggested
the patch I didn't get any more feedback, and I have also seen some bad
benchmark results.

I dropped both of them in 2.2.17pre6aa2. I think Rik and Marcelo agree on
the blocking try_to_free_buffers, but I'm wondering about the side effects
(setiathome regularly doing the work that belongs to `cp` instead).
Also, ac22-class apparently got harmed by the stuff that I forward-ported
(in fact ac22-class++ had that stuff backed out too).

About free_before_allocate: I'd really prefer to reinsert the allocator
race and drop the free_before_allocate logic until we have a smart fix (I
suggested a possible smart fix in another email).

ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.17pre6aa2/00_shrink_mmap-noblock-2
ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.17pre6/reinsert-allocator-bug-1

Then I saw the patch included in pre5 that returns success as soon as
swap_out succeeds at least once. I guessed it could make it harder to
recover from oom, so I tried to run my machine into oom and it apparently
deadlocked. I believe it could have resolved the oom given more time, but
I used SYSRQ+E after waiting around a minute (because without the patch it
takes only a few seconds to kill the malicious tasks). Then I backed out
that patch and the oom apparently is handled gracefully again. Could
somebody confirm this? Rik, Marcelo?
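
A minimal user-space sketch of the two return policies described above,
to make the difference concrete. It is not the pre5 code: `fake_pool`,
`fake_swap_out`, `reclaim_strict` and `reclaim_any_success` are
hypothetical names, and the real loop in the kernel also involves
priorities and other reclaim sources that are left out here.

```c
#include <assert.h>

/* Hypothetical stand-in for the memory that swap_out() could free. */
struct fake_pool {
	int freeable;	/* pages that can still be swapped out */
};

/* Hypothetical stand-in for swap_out(): frees one page per call
 * and returns 1, or returns 0 once nothing is left to free. */
static int fake_swap_out(struct fake_pool *pool)
{
	if (pool->freeable <= 0)
		return 0;
	pool->freeable--;
	return 1;
}

/* Strict policy: report success only if all `count` pages were freed. */
static int reclaim_strict(struct fake_pool *pool, int count)
{
	assert(count > 0);
	while (count > 0 && fake_swap_out(pool))
		count--;
	return count == 0;
}

/* Lenient policy: report success if at least one page was freed,
 * even when the full `count` could not be reached. */
static int reclaim_any_success(struct fake_pool *pool, int count)
{
	int ret = 0;

	assert(count > 0);
	while (count > 0 && fake_swap_out(pool)) {
		ret = 1;
		count--;
	}
	return ret;
}
```

With a single freeable page left and a target of 32, the strict variant
reports failure while the lenient one reports success. In this model a
caller of the lenient variant keeps retrying instead of reaching its
out-of-memory handling, which would be consistent with the slower oom
recovery reported above, though that link is only a guess.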

About oom, I'm also glad to report that the check I added in
mm-fix-3 at the end of alloc_pages:

-------------------------------------------------------------
                /*
                 * Re-check we're still low on memory after we blocked
                 * for some time. Somebody may have released lots of
                 * memory from under us while we was trying to free
                 * the pages. We check against pages_high to be sure
                 * to succeed only if lots of memory is been released.
                 */
                if (nr_free_pages > freepages.high)
                        goto ok_to_allocate;
-------------------------------------------------------------

apparently really prevents syslogd and other very innocent tasks from
getting killed after an oom. I had the idea while developing classzone,
and in fact the above is a plain backport of one bit of the classzone
patch ;-). The fact that syslogd and friends were killed was more an
allocator bug than the lack of a proper oom killer, and it seems better
now.
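
The effect of that re-check can be modelled in user space as below. This
is only a sketch under assumptions: `struct watermarks`,
`decide_after_reclaim` and the `freed_ok` parameter are hypothetical, and
placing the check on the path where the freeing attempt failed is a guess
about the surrounding code, which is not shown in this mail.

```c
#include <assert.h>

/* Hypothetical mirror of the freepages watermarks, in pages. */
struct watermarks {
	int min, low, high;
};

enum alloc_result { ALLOC_OK, ALLOC_FAIL };

/*
 * Decide the outcome once the (possibly blocking) attempt to free
 * pages has returned. `freed_ok` is the result of that attempt and
 * `nr_free_pages` is the free-page count sampled after it.
 */
static enum alloc_result decide_after_reclaim(int freed_ok,
					      int nr_free_pages,
					      const struct watermarks *wm)
{
	assert(wm->min <= wm->low && wm->low <= wm->high);

	if (freed_ok)
		return ALLOC_OK;

	/*
	 * The freeing attempt failed, but somebody else may have
	 * released memory while we were blocked. Succeed only if the
	 * free count is strictly above the high watermark.
	 */
	if (nr_free_pages > wm->high)
		return ALLOC_OK;

	return ALLOC_FAIL;
}
```

In this model a task that failed to free anything itself is not treated
as out of memory when, say, a large process exited in the meantime and
pushed the free count above the high watermark.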

So the patch below reverts the `ret` thing because of the apparent oom
problem that I had. Again, I don't care if mmap002 fails on an 8mbyte box.
If X were killed or similar then I would care much more. mmap002 generates
a flood of dirty mapped cache and 2.2.x is very badly designed for
handling such a case; it's so bad that it doesn't even sync the old cache
to disk (kupdate doesn't work with the dirty mapped cache in 2.2.x).

ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.17pre6aa2/00_fix-oom-1

The one below instead drops the warning that makes people think there's a
new bug:

ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.17pre6aa2/61_wait_sigkill-instead-1

And then there is this last patch:

ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.17pre6aa2/00_lowmem-marging-1

that raises freepages.min for low-memory boxes. The bank logic that we
have in swap_out works if we have some bank in the freelist (and we can't
remove the bank logic or we can race, as happens in 2.2.16). That could
be the reason why the 8mbyte box fails mmap002. I haven't verified this,
though; however, for the bank algorithm, considering we can have 4k*32
swap pages in flight, the min limit should not be lower than 100/200k or
similar. That last patch is the least interesting one and it's a noop on a
box with more than 26mbyte of RAM.
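
The arithmetic behind the 100/200k figure can be checked with a small
sketch. The helper names `in_flight_bytes` and `bytes_to_pages` are
hypothetical, and the sketch only covers the 4k*32 calculation; how the
patch actually derives freepages.min is not shown here.

```c
#include <assert.h>

/* Bytes of swap I/O that can be in flight at once. */
static unsigned long in_flight_bytes(unsigned long page_size,
				     unsigned long pages_in_flight)
{
	assert(page_size > 0);
	return page_size * pages_in_flight;
}

/* Convert a byte budget to whole pages, rounding up. */
static unsigned long bytes_to_pages(unsigned long bytes,
				    unsigned long page_size)
{
	assert(page_size > 0);
	return (bytes + page_size - 1) / page_size;
}
```

With 4096-byte pages and 32 pages in flight, `in_flight_bytes` gives
131072 bytes (128k), i.e. 32 pages, which sits inside the 100k to 200k
range (25 to 50 pages) suggested above as the lower bound for the min
limit.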

Comments are very much appreciated. I'd suggest waiting for some comments
from the mm folks before merging anything. Thanks!

Andrea
