Re: [PATCH] sched: fix another race when reading /proc/sched_debug

From: Ingo Molnar
Date: Tue Dec 16 2008 - 07:24:25 EST



* Li Zefan <lizf@xxxxxxxxxxxxxx> wrote:

> Li Zefan wrote:
> >> i merged it up in tip/master, could you please check whether it's ok?
> >>
> >
> > Sorry, though this patch avoids accessing a half-created cgroup, but I found
> > current code may access a cgroup which has been destroyed.
> >
> > The simplest fix is to take cgroup_lock() before for_each_leaf_cfs_rq.
> >
> > Could you revert this patch and apply the following new one? My box has
> > survived for 16 hours with it applied.
> >
>
> Hi, Ingo
>
> Can we have this bug fixed for 2.6.28 using this patch ? This patch is
> the simplest fix and has been fully tested.

the mutex used by cgroup_lock() is pretty crappy to nest inside the
runqueue lock:

BUG: sleeping function called from invalid context at kernel/mutex7
in_atomic(): 0, irqs_disabled(): 1, pid: 1790, name: cat
2 locks held by cat/1790:
#0: (&p->lock){--..}, at: [<c02a6328>] seq_read+0x25/0x2b8
#1: (tasklist_lock){..--}, at: [<c021f5bc>]
sched_debug_show+0xaa7/0xe90
Call Trace:
[<c0222612>] __might_sleep+0xd6/0xdd
[<c05ff1eb>] mutex_lock_nested+0x1d/0x245
[<c021f88b>] ? sched_debug_show+0xd76/0xe90
[<c0256223>] cgroup_lock+0xf/0x11
[<c021f893>] sched_debug_show+0xd7e/0xe90
[<c0248696>] ? __lock_acquire+0x637/0x69d
[<c028e7cd>] ? check_object+0x111/0x18c
[<c028f435>] ? kmem_cache_alloc+0x70/0xa5
[<c02a6355>] ? seq_read+0x52/0x2b8
[<c0246ee2>] ? trace_hardirqs_on_caller+0x105/0x13d
[<c0246f25>] ? trace_hardirqs_on+0xb/0xd
[<c02a6355>] ? seq_read+0x52/0x2b8
[<c02a63f7>] seq_read+0xf4/0x2b8
[<c02a6303>] ? seq_read+0x0/0x2b8
[<c02c411a>] proc_reg_read+0x60/0x74

so i had to remove your patch.

also, while looking at what cgroup_lock does - it's a trivial wrapper:

void cgroup_lock(void)
{
mutex_lock(&cgroup_mutex);
}

why isnt that done explicitly in all usage sites? If cgroup-unaware code
ever has to take the cgroup lock outside of CONFIG_CGROUPS, that's a code
structure problem.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/