Re: [RFC] mm/vmscan.c: avoid possible long latency caused by too_many_isolated()

From: Yu Zhao
Date: Thu Apr 22 2021 - 16:57:39 EST


On Thu, Apr 22, 2021 at 2:38 PM Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
>
> On 4/22/21 1:30 PM, Yu Zhao wrote:
> >
> > HZ/10 is purely arbitrary but that's ok because we assume normally
> > nobody hits it. If you do often, we need to figure out why and how not
> > to hit it so often.
> >
>
> Perhaps Zhengjun can test the proposed fix in his test case to see if the timeout value
> makes any difference.

Shakeel has another test that stresses page reclaim to the point where
the kernel can livelock for two hours because a large number of
concurrent reclaimers step on each other. He might be able to share
that test with you if you are interested.

Also, it was Hugh who first noticed that migration can isolate many
pages and in turn block page reclaim. He might be able to help too, if
you are interested in the interaction between migration and page
reclaim.

Thanks.