Re: [PATCH 1/2] sched/wait: Break up long wake list walk

From: Linus Torvalds
Date: Fri Aug 18 2017 - 13:48:30 EST


On Fri, Aug 18, 2017 at 9:53 AM, Liang, Kan <kan.liang@xxxxxxxxx> wrote:
>
>> On Fri, Aug 18, 2017 Mel Gorman wrote:
>>
>> That indicates that it may be a hot page and it's possible that the page is
>> locked for a short time but waiters accumulate. What happens if you leave
>> NUMA balancing enabled but disable THP?
>
> No, disabling THP doesn't help the case.

Interesting. That particular code sequence should only be active for
THP. What does the profile look like with THP disabled but with NUMA
balancing still enabled?

Just asking because maybe that different call chain could give us some
other ideas of what the commonality here is that triggers our
behavioral problem.

I was really hoping that we'd root-cause this and have a solution (and
then apply Tim's patch as a "belt and suspenders" kind of thing), but
it's starting to smell like we may have to apply Tim's patch as a
band-aid, and try to figure out what the trigger is longer-term.

Linus