Re: get_online_cpus() from a preemptible() context (bug?)

From: James Morse
Date: Wed Nov 08 2017 - 11:09:08 EST


Hi Peter,

On 06/11/17 21:07, Peter Zijlstra wrote:
> On Mon, Nov 06, 2017 at 06:51:35PM +0000, James Morse wrote:
>>> If you look at percpu_down_read(), you'll note it'll disable preemption
>>> before calling __percpu_down_read().
>>
>> Yes, this is how __percpu_down_read() protects the combination of its fast/slow
>> paths.
>>
>> But next percpu_down_read() calls preempt_enable(), I can't see what stops us
>> migrating before percpu_up_read() preempt_disable()s to call __this_cpu_dec(),
>> which now affects a different variable.
>>
>
> Ah, so the two operations that comment talks about are:
>
>     percpu_down_read_preempt_disable()
>       preempt_disable();
> 1)    __this_cpu_inc(*sem->read_count);
>       if (unlikely(!rcu_sync_is_idle(&sem->rss)))
>         __percpu_down_read()
>           smp_mb()
>           if (likely(!smp_load_acquire(&sem->readers_block))) // false
>           __percpu_up_read()
>             smp_mb()
> 2)          __this_cpu_dec(*sem->read_count);
>             rcuwait_wake_up(&sem->writer);
>           preempt_enable_no_resched();
>
> If you want more detail on this, I'll actually have to go think :-)

I think this was the answer to a much smarter question than mine!

I've tried (and failed) to break it instead. To answer my own question:

I thought this was potentially broken because the __this_cpu_{inc,dec}() out in
{get,put}_online_cpus() will operate on different per-cpu read_count variables
if we migrate between the two calls. (This is not the pair above.)

This isn't a problem, as the only thing that reads read_count is
readers_active_check(), which per_cpu_sum()s all the per-cpu values together
before comparing against zero. As they are all unsigned ints, the sum relies on
unsigned wrap-around to do the right thing: an increment on one CPU and the
matching decrement on another still cancel out. This even works if a CPU holding
a vital part of the read_count is offline, as per_cpu_sum() uses
for_each_possible_cpu().
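
For the archive, here is a minimal userspace model of that argument. It is a
sketch, not the kernel code: struct model_sem, NR_POSSIBLE_CPUS and the
model_*() helpers are names made up for illustration, and memory barriers,
preemption and the writer side are left out entirely. It only shows that an
increment on one CPU's slot and a decrement on another's still sum to zero.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_POSSIBLE_CPUS 4	/* hypothetical number of possible CPUs */

/* Stand-in for the per-cpu read_count: one slot per possible CPU. */
struct model_sem {
	unsigned int read_count[NR_POSSIBLE_CPUS];
};

/* Models the reader-side increment, executed on 'cpu'. */
static void model_down_read(struct model_sem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < NR_POSSIBLE_CPUS);
	sem->read_count[cpu]++;
}

/* Models the reader-side decrement, possibly on a different CPU. */
static void model_up_read(struct model_sem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < NR_POSSIBLE_CPUS);
	/* May wrap from 0 to UINT_MAX; this is well defined for unsigned. */
	sem->read_count[cpu]--;
}

/*
 * Models the writer's check: sum every possible CPU's slot, online or
 * not, and report true when no readers remain.
 */
static bool model_readers_active_check(const struct model_sem *sem)
{
	unsigned int sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++)
		sum += sem->read_count[cpu];	/* wraps modulo 2^N */

	return sum == 0;
}
```

Taking the lock on CPU 0 and releasing it on CPU 3 leaves slot 0 at 1 and
slot 3 at UINT_MAX, yet the sum wraps back to zero, so the check still
reports that no readers are active.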


Thanks!

James