Re: [PATCH v4] sched/numa: fix unsafe get_task_struct() in task_numa_assign()

From: Kirill Tkhai
Date: Mon Nov 10 2014 - 11:36:46 EST


On Mon, 10/11/2014 at 19:10 +0300, Kirill Tkhai wrote:
> On Mon, 10/11/2014 at 17:03 +0100, Peter Zijlstra wrote:
> > On Fri, Nov 07, 2014 at 10:48:27PM -0500, Sasha Levin wrote:
> > > [ 829.539183] BUG: spinlock recursion on CPU#10, trinity-c594/11067
> > > [ 829.539203] lock: 0xffff880631dd6b80, .magic: dead4ead, .owner: trinity-c594/11067, .owner_cpu: 13
> >
> > Ooh, look at that. CPU#10 vs .owner_cpu: 13 on the _same_ task.
> >
> > One of those again :/
>
> We do not initialize task_struct::numa_preferred_nid for INIT_TASK.
> Isn't that a problem?
>

I mean task_numa_find_cpu(). If cpumask_of_node(env->dst_nid) contains
garbage and the cpu taken from it is beyond the end of the mask, the check

	cpumask_test_cpu(cpu, tsk_cpus_allowed(env->p))

may still be true.

In that case we dereference a wrong rq in task_numa_compare(). It's not an
rq at all. The strange cpu may come from here: it's just an int read from
the wrong memory.

My hypothesis is that the patch below may help:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 826fdf3..a2b4a8a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1376,6 +1376,9 @@ static void task_numa_find_cpu(struct task_numa_env *env,
 {
 	int cpu;
 
+	if (!node_online(env->dst_nid))
+		return;
+
 	for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
 		/* Skip this CPU if the source task cannot migrate */
 		if (!cpumask_test_cpu(cpu, tsk_cpus_allowed(env->p)))
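
To illustrate the idea only, here is a rough user-space model of the guard.
It is not kernel code: struct toy_node, toy_cpu_isset(), toy_find_cpu() and
the TOY_* limits are hypothetical names invented for this sketch, and a
plain 32-bit word stands in for a real cpumask. The point it tries to show
is that an offline (or invalid) node is rejected before its possibly
garbage mask is ever walked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS   16	/* hypothetical CPU count for this model */
#define TOY_MAX_NODES 4		/* hypothetical node count for this model */

/* Toy stand-in for per-node state; not a kernel structure. */
struct toy_node {
	bool     online;
	uint32_t cpumask;	/* bit N set => CPU N belongs to this node */
};

/* Bounds-checked bit test: an out-of-range cpu is never reported as set. */
static bool toy_cpu_isset(int cpu, uint32_t mask)
{
	if (cpu < 0 || cpu >= TOY_NR_CPUS)
		return false;
	return (mask >> cpu) & 1u;
}

/*
 * Return the first CPU of dst_nid that is also in 'allowed', or -1.
 * The early return models the node_online() check from the patch.
 */
static int toy_find_cpu(const struct toy_node *nodes, int dst_nid,
			uint32_t allowed)
{
	int cpu;

	assert(nodes != NULL);

	if (dst_nid < 0 || dst_nid >= TOY_MAX_NODES || !nodes[dst_nid].online)
		return -1;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		/* Skip CPUs that do not belong to the destination node */
		if (!toy_cpu_isset(cpu, nodes[dst_nid].cpumask))
			continue;
		/* Skip this CPU if the task is not allowed to run on it */
		if (!toy_cpu_isset(cpu, allowed))
			continue;
		return cpu;
	}
	return -1;
}
```

In this model, a node that is marked offline yields -1 no matter what bits
are set in its mask, so no bogus cpu number can reach a later per-cpu
lookup.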

