Re: Deadlock possibly caused by too_many_isolated.

From: Minchan Kim
Date: Mon Oct 18 2010 - 21:15:16 EST


On Tue, Oct 19, 2010 at 9:57 AM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> > I think there are two bugs here.
>> > The raid1 bug that Torsten mentions is certainly real (and has been around
>> > for an embarrassingly long time).
>> > The bug that I identified in too_many_isolated is also a real bug and can be
>> > triggered without md/raid1 in the mix.
>> > So this is not a 'full fix' for every bug in the kernel :-), but it could
>> > well be a full fix for this particular bug.
>> >
>>
>> Can we just delete the too_many_isolated() logic?  (Crappy comment
>> describes what the code does but not why it does it).
>
> If I remember correctly, we got a bug report about 1-2 years ago that LTP
> could trigger mysterious OOM killer invocations. If too many processes are
> in the reclaim path, all reclaimable pages can be isolated, so the last
> reclaimer finds that the system has no reclaimable pages and invokes the
> OOM killer. We have strong motivation to avoid false-positive OOMs, so some
> discussion led to this patch.
>
> If I remember incorrectly, I hope Wu or Rik will correct me.

AFAIR, it's right.

How about this?

It throttles more aggressively than the old code (i.e., it works at
zone granularity rather than per-LRU-type granularity).
But I think it can prevent unnecessary OOM kills and solve the deadlock problem.
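
To make the granularity difference concrete, here is a minimal userspace sketch, not kernel code. It assumes a hypothetical `struct zone_counts` standing in for the per-zone vmstat counters that `zone_page_state()` reads, and the helper names `too_many_isolated_lru()` and `should_consider_oom()` are made up for illustration. It only models the two comparisons and the patched OOM condition from the diff below.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-zone counters (assumption, not a kernel type). */
struct zone_counts {
	unsigned long inactive_file;
	unsigned long inactive_anon;
	unsigned long isolated_file;
	unsigned long isolated_anon;
};

/* Old check: compares isolated vs. inactive for one LRU type only. */
static inline bool too_many_isolated_lru(const struct zone_counts *z, bool file)
{
	if (file)
		return z->isolated_file > z->inactive_file;
	return z->isolated_anon > z->inactive_anon;
}

/* Proposed check: sums file and anon, so it works at zone granularity. */
static inline bool too_many_isolated_zone(const struct zone_counts *z)
{
	unsigned long inactive = z->inactive_file + z->inactive_anon;
	unsigned long isolated = z->isolated_file + z->isolated_anon;

	return isolated > inactive;
}

/*
 * Models the patched slow-path condition: consider OOM only when reclaim
 * made no progress AND the zone is not dominated by isolated pages.
 */
static inline bool should_consider_oom(unsigned long did_some_progress,
				       const struct zone_counts *z)
{
	return !did_some_progress && !too_many_isolated_zone(z);
}
```

With counts such as inactive_file=5, inactive_anon=5, isolated_file=30, isolated_anon=0, the zone-level check is true, so a reclaimer that made no progress would retry instead of heading toward the OOM killer.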


diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f12ad18..acd6a65 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1961,6 +1961,21 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
return alloc_flags;
}

+/*
+ * Are there way too many processes reclaiming this zone?
+ */
+static int too_many_isolated_zone(struct zone *zone)
+{
+ unsigned long inactive, isolated;
+
+ inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
+ zone_page_state(zone, NR_INACTIVE_ANON);
+ isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
+ zone_page_state(zone, NR_ISOLATED_ANON);
+
+ return isolated > inactive;
+}
+
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
@@ -2054,10 +2069,11 @@ rebalance:
goto got_pg;

/*
- * If we failed to make any progress reclaiming, then we are
- * running out of options and have to consider going OOM
+ * If we failed to make any progress reclaiming and there aren't
+ * many parallel reclaimers, then we are running out of options and
+ * have to consider going OOM
*/
- if (!did_some_progress) {
+ if (!did_some_progress && !too_many_isolated_zone(preferred_zone)) {
if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
if (oom_killer_disabled)
goto nopage;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c5dfabf..f2109af 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1129,31 +1129,6 @@ int isolate_lru_page(struct page *page)
}

/*
- * Are there way too many processes in the direct reclaim path already?
- */
-static int too_many_isolated(struct zone *zone, int file,
- struct scan_control *sc)
-{
- unsigned long inactive, isolated;
-
- if (current_is_kswapd())
- return 0;
-
- if (!scanning_global_lru(sc))
- return 0;
-
- if (file) {
- inactive = zone_page_state(zone, NR_INACTIVE_FILE);
- isolated = zone_page_state(zone, NR_ISOLATED_FILE);
- } else {
- inactive = zone_page_state(zone, NR_INACTIVE_ANON);
- isolated = zone_page_state(zone, NR_ISOLATED_ANON);
- }
-
- return isolated > inactive;
-}
-
-/*
* TODO: Try merging with migrations version of putback_lru_pages
*/
static noinline_for_stack void
@@ -1290,15 +1265,6 @@ shrink_inactive_list(unsigned long nr_to_scan,
struct zone *zone,
unsigned long nr_anon;
unsigned long nr_file;

- while (unlikely(too_many_isolated(zone, file, sc))) {
- congestion_wait(BLK_RW_ASYNC, HZ/10);
-
- /* We are about to die and free our memory. Return now. */
- if (fatal_signal_pending(current))
- return SWAP_CLUSTER_MAX;
- }
-
-
lru_add_drain();
spin_lock_irq(&zone->lru_lock);




--
Kind regards,
Minchan Kim