Re: [PATCH 1/2] sched/wait: Break up long wake list walk

From: Andi Kleen
Date: Tue Aug 22 2017 - 17:24:14 EST


On Tue, Aug 22, 2017 at 04:08:52PM -0500, Christopher Lameter wrote:
> On Tue, 22 Aug 2017, Andi Kleen wrote:
>
> > We only see it on 4S+ today. But systems are always getting larger,
> > so what's a large system today will be a normal medium-scale system
> > tomorrow.
> >
> > BTW we also collected PT traces for the long hang cases, but it was
> > hard to find a consistent pattern in them.
>
> Hmmm... Maybe it would be wise to limit the pages autonuma can migrate?
>
> If a page has more than 50 refcounts or so then don't migrate it. I think
> a high number of refcounts and a high frequency of calls are reached in
> particular for pages of the C library. Attempting to migrate those does
> not make much sense anyway because the load may shift and another
> function may become popular. We may end up shifting very difficult to
> migrate pages back and forth.

I believe in this case the page is used by threads, so a reference count
limit wouldn't help.

If migrating code were a problem I would probably rather just disable
migration of read-only pages.
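
To make the difference between the two policies concrete, here is a
minimal user-space sketch. It is only a toy model: struct toy_page,
TOY_REFCOUNT_LIMIT and both helper functions are hypothetical names made
up for illustration, not kernel APIs, and the example reference counts
are assumptions rather than measured values.

```c
#include <stdbool.h>

/* Hypothetical toy model of a page; NOT the kernel's struct page. */
struct toy_page {
    int refcount;   /* references currently held on the page (assumed) */
    bool writable;  /* false for read-only pages such as library text */
};

/* The "50 or so" threshold from the suggestion quoted above. */
#define TOY_REFCOUNT_LIMIT 50

/* Policy A: refuse to migrate pages with more than the limit of references. */
static bool may_migrate_refcount(const struct toy_page *p)
{
    return p->refcount <= TOY_REFCOUNT_LIMIT;
}

/* Policy B: refuse to migrate read-only pages, whatever their refcount. */
static bool may_migrate_writable_only(const struct toy_page *p)
{
    return p->writable;
}
```

In this model a read-only page that is hot only because many threads use
it (say refcount 2, writable false) still passes policy A, which is why
a reference count limit would not filter it out, while policy B rejects
it.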

-Andi