Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

From: Prateek Sood
Date: Mon Dec 03 2018 - 14:36:23 EST


On 12/03/2018 12:08 PM, Davidlohr Bueso wrote:
> On 2018-11-30 07:10, Prateek Sood wrote:
>> Consider a scenario where the cpu_hotplug_lock percpu_rw_semaphore has
>> already been acquired for reading by P1 using percpu_down_read().
>>
>> P1 is now in the path of releasing cpu_hotplug_lock and P2 is in the
>> process of acquiring it.
>>
>> P1                                        P2
>> percpu_up_read() path                     percpu_down_write() path
>>
>>                                           rcu_sync_enter() //gp_state=GP_PASSED
>>
>> rcu_sync_is_idle() //returns false        down_write(rw_sem)
>>
>> __percpu_up_read()
>>
>> [L] task = rcu_dereference(w->task) //NULL
>>
>> smp_rmb()                                 [S] w->task = current
>>
>>                                           smp_mb()
>>
>>                                           [L] readers_active_check() //fails
>>                                           schedule()
>>
>> [S] __this_cpu_dec(read_count)
>>
>> Since the load of task can return NULL, this can lead to a missed
>> wakeup in rcuwait_wake_up(). The above sequence violates the following
>> constraint in rcuwait_wake_up():
>>
>>      WAIT                WAKE
>> [S] tsk = current     [S] cond = true
>>     MB (A)                MB (B)
>> [L] cond              [L] tsk
>>
>
> Hmm yeah we don't want rcuwait_wake_up() to get hoisted over the __this_cpu_dec(read_count). The smp_rmb() does not make sense to me here in the first place. Did you run into this scenario by code inspection or did you actually hit the issue?
>
> Thanks,
> Davidlohr

I looked into an issue where the CPU hotplug code path was unable
to acquire cpu_hotplug_lock in write mode, while a reader was
waiting for the hotplug path to release percpu_rw_semaphore->rw_sem
so that it could acquire cpu_hotplug_lock.
This resulted in a deadlock.

From code inspection as well, the code does not seem to adhere to
the arm64 smp_rmb() constraint, which guarantees load/load and
load/store ordering but not the store/load ordering needed here.
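
For illustration only, here is a minimal userspace sketch of the
WAIT/WAKE constraint quoted above. It uses C11 atomics and pthreads
instead of the kernel primitives. All names in it (tsk, cond, waiter,
waker, run_once, count_missed) are made up for this sketch and are not
the kernel code. It assumes that a seq_cst fence is a fair stand-in for
smp_mb(). With a full fence on both sides, at least one side must
observe the other's store, so the "both loads stale" outcome (the
missed wakeup) cannot happen.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins: tsk plays w->task (0 == NULL), cond the wait condition. */
static atomic_int tsk;
static atomic_int cond;

/* Written by each thread, read only after pthread_join(). */
static int waiter_saw_cond;
static int waker_saw_tsk;

static void *waiter(void *arg)
{
	(void)arg;
	atomic_store_explicit(&tsk, 1, memory_order_relaxed);   /* [S] tsk = current */
	atomic_thread_fence(memory_order_seq_cst);               /* MB (A) */
	waiter_saw_cond =
		atomic_load_explicit(&cond, memory_order_relaxed); /* [L] cond */
	return NULL;
}

static void *waker(void *arg)
{
	(void)arg;
	atomic_store_explicit(&cond, 1, memory_order_relaxed);  /* [S] cond = true */
	atomic_thread_fence(memory_order_seq_cst);               /* MB (B) */
	waker_saw_tsk =
		atomic_load_explicit(&tsk, memory_order_relaxed);  /* [L] tsk */
	return NULL;
}

/* Runs one WAIT/WAKE race; returns true if both sides read stale values. */
static bool run_once(void)
{
	pthread_t a, b;
	int rc;

	atomic_store(&tsk, 0);
	atomic_store(&cond, 0);

	rc = pthread_create(&a, NULL, waiter, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, waker, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* Missed wakeup: waiter saw cond == 0 and waker saw tsk == 0. */
	return !waiter_saw_cond && !waker_saw_tsk;
}

/* Repeats the race and counts how many runs would have missed the wakeup. */
static int count_missed(int iterations)
{
	int missed = 0;

	for (int i = 0; i < iterations; i++)
		if (run_once())
			missed++;
	return missed;
}
```

Calling count_missed() from a small test program (built with -pthread)
should always return 0 with the fences as written. Weakening MB (B) to
an acquire fence, which is roughly what smp_rmb() provides, removes that
guarantee, although whether a miss actually shows up depends on the
hardware and on timing.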


Thanks,
Prateek

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project