Re: [PATCH] fix rcu vs hotplug race

From: Paul E. McKenney
Date: Thu Jun 26 2008 - 11:27:49 EST


On Tue, Jun 24, 2008 at 01:01:44PM +0200, Ingo Molnar wrote:
>
> * Gautham R Shenoy <ego@xxxxxxxxxx> wrote:
>
> > > hm, not sure - we might just be fighting the symptom and we might
> > > now create a silent resource leak instead. Isn't a full RCU quiescent
> > > state forced (on all CPUs) before a CPU is cleared out of
> > > cpu_online_map? That way the to-be-offlined CPU should never
> > > actually show up in rcp->cpumask.
> >
> > No, this does not happen currently. The rcp->cpumask is always
> > initialized to cpu_online_map&~nohz_cpu_mask when we start a new
> > batch. Hence, before the batch ends, if a cpu goes offline we _can_
> > have a stale rcp->cpumask, till the RCU subsystem has handled its
> > CPU_DEAD notification.
> >
> > Thus for a tiny interval, the rcp->cpumask would contain the offlined
> > CPU. One of the alternatives is probably to handle this using
> > CPU_DYING notifier instead of CPU_DEAD where we can call
> > __rcu_offline_cpu().
> >
> > The warn_on that dhaval was hitting was because of some cpu-offline
> > that was called just before we did a local_irq_save inside call_rcu().
> > But at that time, the rcp->cpumask was still stale, and hence we ended
> > up sending a smp_reschedule() to an offlined cpu. So the check may not
> > create any resource leak.
>
> the check may not - but the problem it highlights might and with the
> patch we'd end up hiding potential problems in this area.
>
> Paul, what do you think about this mixed CPU hotplug plus RCU workload?

RCU most certainly needs to work correctly in the face of arbitrary
sequences of CPU-hotplug events, and should therefore be tested with
arbitrary CPU-hotplug tests. And RCU also most certainly needs to refrain
from issuing spurious warning messages that might over time be ignored,
possibly causing someone to miss a real bug. My concern with this patch
is in the second area, that of spurious warnings.
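
To make the window Gautham describes concrete, here is a minimal userspace
sketch, not kernel code. It models rcp->cpumask as a plain 32-bit mask
snapshotted from the online mask at batch start, and shows how an offlined
CPU lingers in that snapshot until the CPU_DEAD handling clears it. All of
the names (cpu_mask, rcu_ctrl_model, the model_* helpers) are hypothetical,
and the online-mask filter in model_resched_targets() is only an assumption
about the kind of check under discussion, since the patch itself is not
quoted in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU (CPUs 0..31), not the kernel's cpumask_t. */
typedef uint32_t cpu_mask;

struct rcu_ctrl_model {
	cpu_mask cpumask;	/* CPUs that still owe a quiescent state for this batch */
};

/* Batch start: snapshot the online CPUs, excluding those in nohz idle. */
static void model_start_batch(struct rcu_ctrl_model *rcp,
			      cpu_mask online, cpu_mask nohz)
{
	rcp->cpumask = online & ~nohz;
}

/* CPUs still in the batch snapshot that are no longer online (the stale window). */
static cpu_mask model_stale_cpus(const struct rcu_ctrl_model *rcp,
				 cpu_mask online)
{
	return rcp->cpumask & ~online;
}

/*
 * Assumed form of the check: only poke CPUs that are both in the batch
 * snapshot and currently online, and never the calling CPU itself.
 */
static cpu_mask model_resched_targets(const struct rcu_ctrl_model *rcp,
				      cpu_mask online, int self)
{
	return rcp->cpumask & online & ~((cpu_mask)1 << self);
}

/* Models the CPU_DEAD handling: drop the dead CPU from the batch snapshot. */
static void model_offline_cpu(struct rcu_ctrl_model *rcp, int cpu)
{
	rcp->cpumask &= ~((cpu_mask)1 << cpu);
}
```

In this model, model_stale_cpus() is nonzero exactly during the interval
between the CPU leaving the online mask and model_offline_cpu() running,
which is the interval in which an unfiltered reschedule request would be
sent to an offlined CPU.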

Not sure I answered the actual question, though...

Thanx, Paul