Re: Unending loop in __alloc_pages_slowpath following OOM-kill; rfc:patch.

From: KOSAKI Motohiro
Date: Tue May 24 2011 - 04:42:00 EST


>> I'm sorry I missed this thread for so long.
>
> No problem. A late review is better than no review.

thx.


>> In this case, I think we should call drain_all_pages(), so the following
>> patch is better.
>
> Strictly speaking, this problem isn't related to drain_all_pages().
> It is caused by an empty LRU, but I admit your patch could work well
> here.
> So yours could help, too.
>
>> However, I also think your patch is valuable, because while the task is
>> sleeping in wait_iff_congested(), another task may free some pages.
>> Thus, the rebalance path should try to get free pages. IOW, your patch
>> makes sense.
>
> Yes.
> Off-topic.
> I would like to move cond_resched() below get_page_from_freelist() in
> __alloc_pages_direct_reclaim(). Otherwise, the pages we reclaimed are
> likely to be stolen by other processes.
> One more benefit is that on an apparent OOM path (i.e.,
> did_some_progress = 0), we can reduce OOM kill latency by removing
> the unnecessary cond_resched().

I agree. Would you mind sending a patch?
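
For illustration only, the ordering argument quoted above can be modeled
in userspace. This is a toy sketch, not kernel code: every toy_* name is
hypothetical, toy_yield() stands in for cond_resched() and assumes a
competing task grabs pages whenever we yield, and toy_get_page() stands in
for get_page_from_freelist().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT kernel code: every name here is hypothetical. */
struct toy_zone {
	int free_pages;   /* pages currently on the free list */
	int thief_wants;  /* pages a competing task grabs when we yield */
};

/* Stand-in for cond_resched(): a competing task runs and takes pages. */
static void toy_yield(struct toy_zone *z)
{
	int taken;

	assert(z->thief_wants >= 0);
	taken = z->thief_wants < z->free_pages ? z->thief_wants : z->free_pages;
	z->free_pages -= taken;
}

/* Stand-in for get_page_from_freelist(): take one page if available. */
static bool toy_get_page(struct toy_zone *z)
{
	if (z->free_pages <= 0)
		return false;
	z->free_pages--;
	return true;
}

/* Ordering A: reclaim, yield, then try to allocate. */
static bool toy_reclaim_yield_first(struct toy_zone *z, int reclaimed)
{
	z->free_pages += reclaimed;
	toy_yield(z);              /* reclaimed pages may be stolen here */
	if (reclaimed == 0)
		return false;      /* we yielded even though no progress was made */
	return toy_get_page(z);
}

/* Ordering B (the proposal): reclaim, allocate, then yield. */
static bool toy_reclaim_alloc_first(struct toy_zone *z, int reclaimed)
{
	bool got;

	z->free_pages += reclaimed;
	if (reclaimed == 0)
		return false;      /* apparent OOM path: return without yielding */
	got = toy_get_page(z);     /* allocate before anyone else can run */
	toy_yield(z);
	return got;
}
```

With four pages reclaimed and a competitor that wants four, ordering A
comes back empty-handed while ordering B keeps one page for the caller.
The model ignores retries, zones, and watermarks.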


>> So, I'd like to propose merging both your patch and mine.
>
> Recently, there was a discussion on drain_all_pages() with Wu.
> He saw significant overhead on an 8-core system, AFAIR.
> I've Cc'ed Wu.
>
> How about checking the local per-cpu pages before calling
> drain_all_pages(), rather than calling it unconditionally?
> if (per_cpu_ptr(zone->pageset, smp_processor_id()))
> 	drain_all_pages();
>
> Of course, it can miss free pages held by other CPUs. But the above
> routine assumes that direct reclaim on the local CPU succeeded and the
> allocation failed only because the pages sit in the per-cpu lists. So I
> think it works.

Can you please tell me the URL or mail subject of the previous discussion?
I mean, if it is costly and carries a risk of performance regression, we
don't have to take my idea.
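
To make the trade-off in the quoted suggestion concrete, here is a small
userspace sketch. It is a toy model under stated assumptions, not kernel
code: all toy_* names are hypothetical, and it assumes the intended check
is "does the local CPU have any pages cached in its per-cpu list" (the
quoted pseudocode does not spell that out).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Toy model, NOT kernel code: every name here is hypothetical. */
struct toy_pcp_zone {
	int pcp_count[TOY_NR_CPUS]; /* pages cached on each CPU's private list */
	int free_pages;             /* pages on the shared free list */
	int drains;                 /* number of cross-CPU drains issued */
};

/* Stand-in for drain_all_pages(): flush every CPU's cache (the costly part). */
static void toy_drain_all(struct toy_pcp_zone *z)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		z->free_pages += z->pcp_count[cpu];
		z->pcp_count[cpu] = 0;
	}
	z->drains++;
}

/*
 * Drain only when the local CPU has cached pages.
 * Returns true if a drain was issued.
 */
static bool toy_drain_if_local_cached(struct toy_pcp_zone *z, int this_cpu)
{
	assert(this_cpu >= 0 && this_cpu < TOY_NR_CPUS);
	if (z->pcp_count[this_cpu] == 0)
		return false;   /* skip the drain; pages on other CPUs are missed */
	toy_drain_all(z);
	return true;
}
```

If only CPU 1 holds cached pages, a caller on CPU 0 skips the drain and
misses them, which is the caveat raised in the quoted text; a caller on
CPU 1 pays for one drain and recovers them.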

Thanks.


>
> Thanks for the good suggestion and the Reviewed-by, KOSAKI.

