Re: [tip:sched/numa] sched/numa: Introduce sys_numa_{t,m}bind()

From: Rik van Riel
Date: Fri May 18 2012 - 11:45:50 EST


On 05/18/2012 06:42 AM, tip-bot for Peter Zijlstra wrote:

> Now that we have a NUMA process scheduler, provide a syscall
> interface for finer granularity NUMA balancing. In particular
> this allows setting up NUMA groups of threads and vmas within
> a process.
>
> For this we introduce two new syscalls:
>
> sys_numa_tbind(int tid, int ng_id, unsigned long flags);
>
> Bind a thread to a numa group, query its binding or create a new group:
>
> sys_numa_tbind(tid, -1, 0); // create new group, return new ng_id
> sys_numa_tbind(tid, -2, 0); // returns existing ng_id
> sys_numa_tbind(tid, ng_id, 0); // set ng_id
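
For readers following along, the three calling conventions quoted above can be modelled in user space roughly as follows. This is only a sketch of the argument semantics, not the kernel implementation: mock_numa_tbind(), NG_CREATE, NG_QUERY, MAX_THREADS, the "0 means no group" convention and the -1 error return are all hypothetical names and choices made for illustration.

```c
/*
 * Hypothetical user-space model of the proposed sys_numa_tbind()
 * argument convention. Nothing here is kernel code.
 */
#define NG_CREATE   (-1)  /* assumed name for ng_id == -1: create a group  */
#define NG_QUERY    (-2)  /* assumed name for ng_id == -2: query the group */
#define MAX_THREADS 8     /* arbitrary size of this toy model              */

static int group_of[MAX_THREADS]; /* per-"thread" group id, 0 = no group */
static int next_group_id = 1;     /* next unused group id                */

/* Returns a group id for create/query, 0 on a successful bind, -1 on error. */
static int mock_numa_tbind(int tid, int ng_id, unsigned long flags)
{
    (void)flags;                          /* unused in this model */

    if (tid < 0 || tid >= MAX_THREADS)
        return -1;

    if (ng_id == NG_CREATE) {             /* create new group, return its id */
        group_of[tid] = next_group_id++;
        return group_of[tid];
    }

    if (ng_id == NG_QUERY)                /* return the existing binding */
        return group_of[tid];

    if (ng_id <= 0 || ng_id >= next_group_id)
        return -1;                        /* no such group in this model */

    group_of[tid] = ng_id;                /* bind the thread to the group */
    return 0;
}
```

In this model, one thread creates a group with NG_CREATE and the others join it by passing the returned id; a later NG_QUERY on any of them returns that same id.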

I am not convinced this is the right way forward.

While this may work well for programs written in languages
with pointers, and for virtual machines, I do not see how
e.g. a JVM could provide useful hints to the kernel, because
the Java program running on top has no idea about the
memory addresses of its objects, and the Java language has
no way to hint which thread will be the predominant user
of an object.

I like your code for handling smaller processes in NUMA
systems, but we do need to have a serious discussion on
how to handle processes that do not fit in one node.

The more I think about it, the more Andrea's code looks
like it might be the more flexible way forward.

Another topic to discuss is whether we want lazy
migrate-on-fault, or whether we want to let the program
spend its time running and use another (idle) core to
do the migration in the background.

--
All rights reversed