Re: [PATCH 2/4] sched: implement __set_cpus_allowed()

From: Tejun Heo
Date: Mon May 31 2010 - 05:55:58 EST


Hello,

On 05/31/2010 10:01 AM, Peter Zijlstra wrote:
> On Thu, 2010-05-13 at 12:48 +0200, Tejun Heo wrote:
>> Concurrency managed workqueue needs to be able to migrate tasks to a
>> cpu which is online but !active for the following two purposes.
>>
>> p1. To guarantee forward progress during cpu down sequence. Each
>> workqueue which could be depended upon during memory allocation
>> has an emergency worker task which is summoned when a pending work
>> on such workqueue can't be serviced immediately. cpu hotplug
>> callbacks expect workqueues to work during cpu down sequence
>> (usually so that they can flush them), so, to guarantee forward
>> progress, it should be possible to summon emergency workers to
>> !active but online cpus.
>
> If we do the thing suggested in the previous patch, that is move
> clearing active and rebuilding the sched domains until right after
> DOWN_PREPARE, this goes away, right?

Hmmm... yeah, if the usual set_cpus_allowed_ptr() keeps working
throughout CPU_DOWN_PREPARE, this probably goes away. I'll give it a
shot.

>> p2. To migrate back unbound workers when a cpu comes back online.
>> When a cpu goes down, existing workers are unbound from the cpu
>> and allowed to run on other cpus if there still are pending or
>> running works. If the cpu comes back online while those workers
>> are still around, those workers are migrated back and re-bound to
>> the cpu. This isn't strictly required for correctness as long as
>> those unbound workers don't execute works which are newly
>> scheduled after the cpu comes back online; however, migrating back
>> the workers has the advantage of making the behavior more
>> consistent thus avoiding surprises which are difficult to expect
>> and reproduce, and being actually cleaner and easier to implement.
>
> I still don't like this much, if you mark these tasks to simply die when
> the queue is exhausted, and flush the queue explicitly on
> CPU_UP_PREPARE, you should never need to do this.

I don't think flushing from CPU_UP_PREPARE would be a good idea:
there is no guarantee that those works will finish in a short (human
scale) time. However, we can update the cpu_active mask before the
other CPU_UP_PREPARE notifiers are executed so that it's symmetrical
to the cpu down path, and then this problem goes away in exactly the
same way, right?
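
To make the ordering argument concrete, here is a rough user-space toy
model, not kernel code. Every toy_* name is made up for this sketch, and
it assumes (as a simplification) that the usual affinity call simply
refuses a cpu which is not in the active mask. It only shows that the
rebind succeeds when the active bit is set before the workqueue callback
runs and fails when it is set afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code. All toy_* names are hypothetical. */

struct toy_cpu {
	bool active;	/* stands in for the cpu's bit in the cpu_active mask */
};

struct toy_worker {
	bool bound;	/* true once the worker is re-bound to the cpu */
};

struct toy_ctx {
	struct toy_cpu cpu;
	struct toy_worker worker;
};

/* Assumed behavior: the usual affinity call refuses a !active cpu. */
static int toy_set_cpus_allowed(struct toy_cpu *cpu, struct toy_worker *w)
{
	if (!cpu->active)
		return -1;
	w->bound = true;
	return 0;
}

typedef void (*toy_notifier_fn)(struct toy_ctx *ctx);

/* Scheduler-side callback: marks the cpu active. */
static void toy_sched_up_prepare(struct toy_ctx *ctx)
{
	ctx->cpu.active = true;
}

/* Workqueue-side callback: tries to migrate the unbound worker back. */
static void toy_wq_up_prepare(struct toy_ctx *ctx)
{
	(void)toy_set_cpus_allowed(&ctx->cpu, &ctx->worker);
}

/* Run the callbacks in array order; report whether the worker got bound. */
static bool toy_run_up_prepare(const toy_notifier_fn *chain, size_t n)
{
	struct toy_ctx ctx = {
		.cpu = { .active = false },
		.worker = { .bound = false },
	};

	for (size_t i = 0; i < n; i++) {
		assert(chain[i] != NULL);
		chain[i](&ctx);
	}
	return ctx.worker.bound;
}

/* Active bit updated before the other callbacks: rebind succeeds. */
static bool toy_active_first(void)
{
	const toy_notifier_fn chain[] = { toy_sched_up_prepare, toy_wq_up_prepare };

	return toy_run_up_prepare(chain, 2);
}

/* Active bit updated after the other callbacks: rebind is refused. */
static bool toy_active_last(void)
{
	const toy_notifier_fn chain[] = { toy_wq_up_prepare, toy_sched_up_prepare };

	return toy_run_up_prepare(chain, 2);
}
```

In this model toy_active_first() reports the worker as bound and
toy_active_last() does not, which is the whole point of running the
active mask update ahead of the other notifiers.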

Thanks.

--
tejun