Re: [RFC][PATCH 00/26] sched/numa

From: Peter Zijlstra
Date: Mon Mar 19 2012 - 08:00:02 EST


On Mon, 2012-03-19 at 13:42 +0200, Avi Kivity wrote:
> > Now if you want to be able to scan per-thread, you need per-thread
> > page-tables and I really don't want to ever see that. That will blow
> > memory overhead and context switch times.
>
> I thought of only duplicating down to the PDE level, that gets rid of
> almost all of the overhead.

You still get the significant CR3 cost for thread switches.

[ /me grabs the SDM to find that PDE is what we in Linux call the pmd ]

That'll cut the memory overhead down but also severely impact the
accuracy.

Also, I still don't see how such a scheme would correctly identify
per-cpu memory in guest kernels. While less frequent, it's still very
common to do remote access to per-cpu data. So even if you did page
granularity you'd get a fair amount of pages that are accessed by all
threads (vcpus) in the scan interval, even though they're primarily
accessed by just one.

If you go to pmd level you get even less information.
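
For a rough sense of the numbers, here is a back-of-the-envelope sketch.
It assumes x86-64 with 4 KiB base pages and 512 entries per table level;
the helper names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Assumed x86-64 paging constants: 4 KiB pages, 512 entries per table. */
#define SKETCH_PAGE_SHIFT      12
#define SKETCH_PTRS_PER_TABLE  512

/* Bytes of virtual address space covered by one pte (hypothetical helper). */
static inline uint64_t pte_span_bytes(void)
{
	return 1ULL << SKETCH_PAGE_SHIFT;
}

/* Bytes covered by one pmd entry: a full table of ptes (hypothetical helper). */
static inline uint64_t pmd_span_bytes(void)
{
	return pte_span_bytes() * SKETCH_PTRS_PER_TABLE;
}

/*
 * Number of independent "was this touched?" samples a scan can produce
 * for a region, given the size of the unit being tracked (rounded up).
 */
static inline uint64_t tracking_units(uint64_t region_bytes, uint64_t unit_bytes)
{
	return (region_bytes + unit_bytes - 1) / unit_bytes;
}
```

Under those assumptions one pmd entry covers 2 MiB, so a single remote
touch anywhere in that range marks all 512 pages behind it as shared,
and a 1 GiB region yields 512 samples instead of 262144.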