Re: get_online_cpus() && workqueues

From: Gautham R Shenoy
Date: Mon Apr 28 2008 - 08:03:59 EST


On Mon, Apr 28, 2008 at 02:56:49PM +0400, Oleg Nesterov wrote:
> On 04/28, Gautham R Shenoy wrote:
> >
> > On Sat, Apr 26, 2008 at 06:43:30PM +0400, Oleg Nesterov wrote:
> > >
> > > Can't we add another nested lock which is dropped right after __cpu_die()?
> > > (in fact I think it could be dropped after __stop_machine_run).
> > >
> > > The new read-lock is get_online_map() (just a random name for now). The only
> > > difference wrt get_online_cpus() is that it doesn't protect against CPU_DEAD,
> > > but most users of get_online_cpus() don't need this; they only need a
> > > stable cpu_online_map and sometimes they need to be sure that some per-cpu
> > > object (say, cpu_workqueue_struct->thread) can't be destroyed under this
> > > lock.
> > >
> > > get_online_map() seems to fit for this, and can be used from work->func().
> > > (actually, I think most users of get_online_cpus() could use the new
> > > helper instead, but this doesn't matter).
> >
> > However, subsystems such as cpufreq require serialization with respect
> > to the whole CPU-Hotplug operation since they do initialization and
> > cleanup pre and post the change of the cpu_online_map.
> > Neither the current code nor this patch helps in such cases,
> > when such subsystems have multithreaded workqueues!
>
> Yes, I see, thanks. Heiko has pointed this out too.
>
> > One of the thoughts I have is to provide an API along the lines of
> > try_get_online_cpus() which will return 1 if there is no CPU Hotplug
> > operation in progress and will return 0 otherwise. In case where
> > a cpu-hotplug operation is in progress, the workitem could simply
> > do nothing other than requeue itself and wait for the cpu-hotplug
> > operation to complete.
>
> Yes, possible, but it is not nice that work->func() can't just use
> get_online_cpus()...
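
To make the try_get_online_cpus() idea quoted above concrete, here is a
rough, single-threaded userspace model. It is not kernel code and not a
proposed patch: struct hotplug_state, struct work_item and work_func()
are hypothetical names made up for illustration, and a real version
would have to take the hotplug lock rather than read a plain flag.

```c
/* Userspace model only. Everything here is a hypothetical sketch. */
#include <assert.h>
#include <stdbool.h>

struct hotplug_state {
    bool hotplug_in_progress; /* set while a (modelled) hotplug op runs */
    int  readers;             /* active read-side references */
};

/*
 * Returns 1 and takes a read-side reference if no hotplug operation
 * is in progress; returns 0 without blocking otherwise.
 */
static int try_get_online_cpus(struct hotplug_state *s)
{
    if (s->hotplug_in_progress)
        return 0;
    s->readers++;
    return 1;
}

static void put_online_cpus(struct hotplug_state *s)
{
    assert(s->readers > 0);
    s->readers--;
}

struct work_item {
    int runs;     /* times the real work body ran */
    int requeues; /* times the item backed off to be requeued */
};

/*
 * Work function: do the work only if the reference was obtained.
 * Returns false when the caller should requeue the item and retry
 * after the hotplug operation has completed.
 */
static bool work_func(struct hotplug_state *s, struct work_item *w)
{
    if (!try_get_online_cpus(s)) {
        w->requeues++;
        return false;
    }
    w->runs++; /* stand-in for work that needs a stable online map */
    put_online_cpus(s);
    return true;
}
```

The point of the model is only that the work item never blocks on the
hotplug lock; it backs off and is retried later, which is also why it
cannot "just use get_online_cpus()".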

Like I said, it depends on what they want to use it for. If it is just
protection against changes to the cpu_online_map, then it's as simple
as using get_online_map(), i.e. the patch you provided.

BTW, the other thing I am concerned about is the
naming. Don't the names get_online_cpus() and get_online_map()
appear very similar? The last thing we want is driver writers getting
confused over which API to use!

>
> > Else, we might want to do something like what slab.c does.
> > It sets the per-cpu work.func of the cpu-going down to NULL in
> > CPU_DOWN_PREPARE.
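
As a rough illustration of the approach described in the quoted lines
above (clearing the per-cpu work function when a CPU is about to go
down), here is a small userspace model. It is a sketch under
assumptions, not the actual slab.c code: the enum values only mirror
the kernel's event names, and struct percpu_work, hotplug_callback(),
start_cpu_work() and run_pending() are hypothetical helpers. It assumes
a NULL func means "do not run this work for that CPU".

```c
/* Userspace model only. Not the real slab.c implementation. */
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Local stand-ins named after the kernel's hotplug events. */
enum hotplug_event { CPU_DOWN_PREPARE, CPU_DOWN_FAILED };

typedef void (*work_fn)(int cpu);

struct percpu_work {
    work_fn func; /* NULL: work is disarmed for this CPU */
    int     runs; /* how many times the work ran */
};

static struct percpu_work reap_work[NR_CPUS];

static void reap(int cpu)
{
    reap_work[cpu].runs++;
}

/* Arm the per-cpu work if it is not armed already. */
static void start_cpu_work(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (reap_work[cpu].func == NULL)
        reap_work[cpu].func = reap;
}

/* Hotplug callback: disarm on DOWN_PREPARE, re-arm if the down fails. */
static void hotplug_callback(enum hotplug_event ev, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    switch (ev) {
    case CPU_DOWN_PREPARE:
        reap_work[cpu].func = NULL;
        break;
    case CPU_DOWN_FAILED:
        start_cpu_work(cpu);
        break;
    }
}

/* Run the work for a CPU if armed. Returns 1 if it ran, 0 if skipped. */
static int run_pending(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (reap_work[cpu].func == NULL)
        return 0;
    reap_work[cpu].func(cpu);
    return 1;
}
```

In this model the work itself never needs a hotplug lock; the hotplug
callback is what stops it from running for the departing CPU.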


>
> Yes, but this is different. Please note also that this particular
> work must not use get_online_cpus(), no matter what changes we can
> make. Otherwise cancel_delayed_work_sync() can deadlock.
>
> What do you think about another patch I sent? I am not happy with it,
> and it certainly uglifies cpu.c, but it is simple...

I am currently testing the patch stack sent
by peterz. Once that's done, I will see if I can integrate this patch
with the previous patches and repost the whole series.

>
> Oleg.

--
Thanks and Regards
gautham