Re: Kernel falls apart under light memory pressure (i.e. linking vmlinux)

From: Andrew Lutomirski
Date: Thu May 19 2011 - 10:17:15 EST


I just booted 2.6.38.6 with exactly two patches applied. The config is
the same one I emailed yesterday, and userspace is F15. The first patch
is "aesni-intel: Merge with fpu.ko", which I need because dracut fails
to boot my system without it. The second is this (sorry for whitespace
damage):

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0665520..3f44b81 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -307,7 +307,7 @@ static void set_reclaim_mode(int priority, struct scan_control *sc,
 	 */
 	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
 		sc->reclaim_mode |= syncmode;
-	else if (sc->order && priority < DEF_PRIORITY - 2)
+	else if ((sc->order && priority < DEF_PRIORITY - 2) || priority <= DEF_PRIORITY / 3)
 		sc->reclaim_mode |= syncmode;
 	else
 		sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
@@ -1342,10 +1342,6 @@ static inline bool should_reclaim_stall(unsigned long nr_taken,
 	if (current_is_kswapd())
 		return false;
 
-	/* Only stall on lumpy reclaim */
-	if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
-		return false;
-
 	/* If we have relaimed everything on the isolated list, no stall */
 	if (nr_freed == nr_taken)
 		return false;
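
For anyone skimming, here is a rough userspace sketch of what the first
hunk changes. It only models which branch set_reclaim_mode() takes; it is
not kernel code. The helper name takes_syncmode_branch() and the "patched"
flag are made up for illustration. DEF_PRIORITY = 12 and
PAGE_ALLOC_COSTLY_ORDER = 3 are assumed to match the mainline headers for
this kernel.

```c
#include <stdbool.h>

/* Assumed to match mainline headers for 2.6.38; not taken from this mail. */
#define DEF_PRIORITY 12
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Hypothetical helper: returns true when set_reclaim_mode() would take a
 * "sc->reclaim_mode |= syncmode" branch, and false when it would fall
 * through to RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC.
 * patched == false models the original condition, patched == true models
 * the condition with the first hunk applied.
 */
bool takes_syncmode_branch(int order, int priority, bool patched)
{
    /* Costly allocations always take the branch. */
    if (order > PAGE_ALLOC_COSTLY_ORDER)
        return true;

    /* Original rule: any order > 0 once priority has dropped below 10. */
    if (order && priority < DEF_PRIORITY - 2)
        return true;

    /* Added by the patch: any order (including 0) once priority <= 4. */
    if (patched && priority <= DEF_PRIORITY / 3)
        return true;

    return false;
}
```

Under these assumptions, the only behavioural difference is for requests
of order 0 through 3 at priority <= 4: for example,
takes_syncmode_branch(0, 4, false) is false while
takes_syncmode_branch(0, 4, true) is true. The second hunk then lets
should_reclaim_stall() consider stalling even when the mode is
RECLAIM_MODE_SINGLE.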

I started GNOME and Firefox, enabled swap, and ran
"test_mempressure.sh 1500 1400 1". The system oopsed quickly; a photo
of the oops is attached.

The oops was the ud2 here:

0xffffffff810d251b <+215>: mov -0x28(%rbx),%rax
0xffffffff810d251f <+219>: test $0x40,%al
0xffffffff810d2521 <+221>: je 0xffffffff810d2525 <shrink_page_list+225>
0xffffffff810d2523 <+223>: ud2
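
One possible reading of those four instructions, as a sketch only: the
code loads a 64-bit word from -0x28(%rbx), tests mask 0x40 (bit 6), skips
the trap if the bit is clear, and otherwise executes ud2, which is how
BUG() is implemented on x86. The guess that the word is page->flags and
that bit 6 is PG_active is an assumption based on the usual struct page
and enum pageflags layout, not something confirmed here. If it holds,
this would be a VM_BUG_ON(PageActive(page)) check in shrink_page_list().
The function name below is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 6 of page->flags is PG_active in this kernel/config. */
#define ASSUMED_PG_ACTIVE_BIT 6

/*
 * Hypothetical model of "test $0x40,%al; je <skip>; ud2":
 * returns true when the ud2 would be reached, i.e. when bit 6 of the
 * loaded word is set.
 */
bool would_reach_ud2(uint64_t loaded_word)
{
    uint64_t mask = UINT64_C(1) << ASSUMED_PG_ACTIVE_BIT; /* 0x40 */

    return (loaded_word & mask) != 0;
}
```

So would_reach_ud2(0x40) is true and would_reach_ud2(0x20) is false;
only bit 6 matters to this check.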

Please let me know what the next test to run is.

--Andy

Attachment: IMG_20110519_094454.jpg
Description: JPEG image