Re: [RFC][PATCH 1/4] sched: Fix a race between __kthread_bind() and sched_setaffinity()

From: Tejun Heo
Date: Fri Aug 07 2015 - 12:10:11 EST


Hello, Peter.

On Fri, Aug 07, 2015 at 05:59:54PM +0200, Peter Zijlstra wrote:
> > So, the problem there is that __kthread_bind() doesn't grab the same
> > lock that the syscall side grabs but workqueue used
> > set_cpus_allowed_ptr() which goes through the rq locking, so as long
> > as the check on syscall side is moved inside rq lock, it should be
> > fine.
>
> Currently neither site uses any lock, and that is what the patch fixes
> (it uses the per-task ->pi_lock instead of the rq->lock, but that is
> immaterial).

Yeap, the PF_NO_SETAFFINITY test on the syscall side should definitely
be moved inside rq->lock.

> What matters though is that you now must hold a scheduler lock while
> setting PF_NO_SETAFFINITY. In order to avoid spreading that knowledge
> around I've taught kthread_bind*() about this and made the workqueue
> code use that API (rather than having the workqueue code take scheduler
> locks).

So, as long as PF_NO_SETAFFINITY is set before the task sets its
affinity to its target holding the rq lock, it should still be safe.
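
To make the ordering argument concrete, here is a rough userspace model
of it, not kernel code. Everything named model_* is made up for
illustration: a pthread mutex stands in for the scheduler lock, a bool
stands in for PF_NO_SETAFFINITY, and an unsigned long stands in for the
cpumask. The only point is that the flag test and the mask update sit in
the same critical section as the code that sets the flag.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; this is not the kernel's task_struct. */
struct model_task {
	pthread_mutex_t lock;		/* plays the role of the scheduler lock */
	unsigned long cpus_allowed;	/* bitmask of permitted CPUs */
	bool no_setaffinity;		/* plays the role of PF_NO_SETAFFINITY */
};

static void model_task_init(struct model_task *t, unsigned long mask)
{
	int ret = pthread_mutex_init(&t->lock, NULL);

	assert(ret == 0);
	(void)ret;
	t->cpus_allowed = mask;
	t->no_setaffinity = false;
}

/* Bind side: set the mask and the flag inside one critical section. */
static void model_bind(struct model_task *t, unsigned long mask)
{
	pthread_mutex_lock(&t->lock);
	t->cpus_allowed = mask;
	t->no_setaffinity = true;
	pthread_mutex_unlock(&t->lock);
}

/*
 * Syscall side: test the flag under the same lock that protects the
 * mask update, so a concurrent model_bind() is seen either entirely
 * or not at all.
 */
static int model_setaffinity(struct model_task *t, unsigned long mask)
{
	int ret = 0;

	pthread_mutex_lock(&t->lock);
	if (t->no_setaffinity)
		ret = -EINVAL;
	else
		t->cpus_allowed = mask;
	pthread_mutex_unlock(&t->lock);

	return ret;
}
```

In this model, a model_setaffinity() call that runs after model_bind()
fails and leaves the mask alone, and one that runs before it has its mask
overwritten by the bind. If the flag test were done before taking the
lock, a bind could slip in between the test and the mask update, which is
the window being discussed here.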

> Hmm.. a better solution. Have the worker thread creation call
> kthread_bind_mask() before attach_to_pool() and have attach_to_pool()
> keep using set_cpus_allowed_ptr(). Less ugly.

Yeah, that works too. About the same effect.

Thanks.

--
tejun