Re: 3.10.16 cgroup_mutex deadlock

From: Shawn Bohrer
Date: Wed Nov 20 2013 - 17:47:54 EST


On Tue, Nov 19, 2013 at 10:55:18AM +0800, Li Zefan wrote:
> > Thanks Tejun and Hugh. Sorry for my late entry in getting around to
> > testing this fix. On the surface it sounds correct; however, I'd like to
> > test this on top of 3.10.* since that is what we'll likely be running.
> > I've tried to apply Hugh's patch above on top of 3.10.19 but it
> > appears there are a number of conflicts. Looking over the changes and
> > my understanding of the problem I believe on 3.10 only the
> > cgroup_free_fn needs to be run in a separate workqueue. Below is the
> > patch I've applied on top of 3.10.19, which I'm about to start
> > testing. If it looks like I botched the backport in any way please
> > let me know so I can test a proper fix on top of 3.10.19.
> >
>
> You didn't move css free_work to the dedicated wq as Tejun's patch does.
> css free_work won't acquire cgroup_mutex, but when destroying a lot of
> cgroups, we can have a lot of css free_work in the workqueue, so I'd
> suggest you also use cgroup_destroy_wq for it.

Well, I didn't move the css free_work, but I did test the patch I
posted on top of 3.10.19 and I am unable to reproduce the lockup, so
it appears my patch was sufficient for 3.10.*. Hopefully we can get
this fix applied and backported into stable.

Thanks,
Shawn