Re: [PATCH 7/8] sched: Split accounting of NUMA hinting faults that pass two-stage filter

From: Peter Zijlstra
Date: Fri Jun 28 2013 - 06:33:16 EST


On Fri, Jun 28, 2013 at 03:42:45PM +0530, Srikar Dronamraju wrote:
> > >
> > > > Ideally it would be possible to distinguish between NUMA hinting faults
> > > > that are private to a task and those that are shared. This would require
> > > > that the last task that accessed a page for a hinting fault be recorded,
> > > > which would increase the size of struct page. Instead this patch
> > > > approximates private pages by assuming that faults that pass the two-stage
> > > > filter are private pages and all others are shared. The preferred NUMA
> > > > node is then selected based on where the maximum number of approximately
> > > > private faults were measured.
> > >
> > > Should we consider only private faults for preferred node?
> >
> > I don't think so; it's optimal for the task to be nearest most of its pages,
> > irrespective of whether they are private or shared.
>
> Then the preferred node should have been chosen based on both the
> private and shared faults and not just private faults.

Oh duh, indeed. I completely missed that it does that. The changelog also
doesn't give a rationale for this. Mel?
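
As an aside, here is a minimal user-space sketch of the accounting being
argued about; the names (numa_fault(), struct task_stats, last_accessor[])
are invented for illustration and are not the kernel's actual data
structures. It approximates private faults the way the patch does -- a fault
counts as private only when the same task was also the last one to fault on
that page -- and shows how the preferred node can be picked from the private
counts alone or from private plus shared:

/*
 * Minimal user-space sketch, not kernel code: numa_fault(), struct
 * task_stats and last_accessor[] are hypothetical names used only to
 * illustrate the accounting being discussed.
 */
#include <stdio.h>

#define NR_NODES 4
#define NR_PAGES 16

/* Per-page record of the last task that took a hinting fault on it. */
static int last_accessor[NR_PAGES];

struct task_stats {
	int pid;
	/* faults[node][0] = shared, faults[node][1] = private */
	unsigned long faults[NR_NODES][2];
};

/*
 * Two-stage filter approximation: a fault only counts as private when
 * the same task was also the last one to fault on the page.
 */
static void numa_fault(struct task_stats *ts, int page, int node)
{
	int priv = (last_accessor[page] == ts->pid);

	ts->faults[node][priv]++;
	last_accessor[page] = ts->pid;
}

/*
 * Preferred node selection: the patch as posted uses only the private
 * counts; passing include_shared selects on private + shared instead.
 */
static int preferred_node(struct task_stats *ts, int include_shared)
{
	unsigned long best_faults = 0;
	int nid, best = 0;

	for (nid = 0; nid < NR_NODES; nid++) {
		unsigned long f = ts->faults[nid][1];

		if (include_shared)
			f += ts->faults[nid][0];
		if (f > best_faults) {
			best_faults = f;
			best = nid;
		}
	}
	return best;
}

int main(void)
{
	struct task_stats t = { .pid = 42 };
	int i;

	/* Repeated faults on pages 0 and 1 (node 1): the second and later
	 * ones pass the filter and count as private. */
	for (i = 0; i < 3; i++) {
		numa_fault(&t, 0, 1);
		numa_fault(&t, 1, 1);
	}
	/* A fault on a page last touched by another task stays shared. */
	last_accessor[5] = 99;
	numa_fault(&t, 5, 2);

	printf("preferred (private only)   : node %d\n", preferred_node(&t, 0));
	printf("preferred (private+shared) : node %d\n", preferred_node(&t, 1));
	return 0;
}

With the toy numbers in main(), both criteria pick node 1; they only diverge
once shared faults on some other node start to dominate the private ones.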

> >
> > > I would think that if tasks have shared pages then moving all the tasks that
> > > share the same pages to the node where those shared pages reside would be
> > > preferred. No?
> >
> > Well no; not if there's only 5 shared pages but 1024 private pages.
>
> Yes, agreed, but should we try to give the shared pages some additional weight?

Yes, because for threads you'll only get 1/n of this on shared pages --
the other threads contend for the same PTE fault. And no, because for
inter-process shared memory they'll each have their own PTEs. And maybe,
because even in the threaded case it's hard to tell how many threads
will actually contend for that one PTE.

Confused enough? :-)
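
To put some back-of-the-envelope numbers on the "yes" and "no" above (the
figures below are made up purely for illustration): with one mm shared by n
threads, each PROT_NONE PTE is faulted once per scan window by whichever
thread touches it first, so an individual thread records roughly 1/n of the
shared-page faults; separate processes mapping the same shared memory each
have their own page tables and each record the full count.

/* Toy model of the 1/n dilution; the numbers are invented for illustration. */
#include <stdio.h>

int main(void)
{
	int shared_pages = 1024;
	int nthreads = 8;

	/*
	 * Threads share one set of page tables: a given PROT_NONE PTE is
	 * faulted once per scan, by whichever thread gets there first, so
	 * each thread records roughly shared_pages / nthreads hinting faults.
	 */
	printf("per-thread shared faults  : ~%d\n", shared_pages / nthreads);

	/*
	 * Separate processes mapping the same shared memory each have their
	 * own PTEs, so each process records all of the shared-page faults.
	 */
	printf("per-process shared faults : %d\n", shared_pages);

	return 0;
}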
