Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
From: Kohli, Gaurav
Date: Tue Jun 05 2018 - 07:13:56 EST
Hi Peter,
As mentioned in my last mail, we are still seeing the issue with the latest
approach. Below is the suspected race, as mentioned earlier:
Controller Thread                               CPUHP Thread
takedown_cpu
kthread_park
kthread_parkme
Set KTHREAD_SHOULD_PARK
                                                smpboot_thread_fn
                                                set Task interruptible
wake_up_process
  if (!(p->state & state))
          goto out;
                                                kthread_parkme
                                                SET TASK_PARKED
                                                schedule
                                                raw_spin_lock(&rq->lock)
ttwu_remote
waiting for __task_rq_lock
                                                context_switch
                                                finish_lock_switch
                                                Case TASK_PARKED
                                                kthread_park_complete
SET Running
So it seems the issue is still there with the latest mentioned fix,
"kthread, sched/wait: Fix kthread_parkme() completion issue".
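
The interleaving can be replayed in user space as a rough sketch. This is not kernel code: struct task_model, replay_park_race() and the SK_* values are made-up stand-ins, and the model assumes the wakeup, once it has passed the (p->state & state) check, later stores the running state unconditionally.

```c
#include <stdbool.h>

/* Illustrative stand-ins for task state bits; not the kernel's definitions. */
enum {
	SK_RUNNING         = 0x00,
	SK_INTERRUPTIBLE   = 0x01,
	SK_UNINTERRUPTIBLE = 0x02,
	SK_PARKED          = 0x40,
	SK_NORMAL          = SK_INTERRUPTIBLE | SK_UNINTERRUPTIBLE,
};

/* Hypothetical model of the cpuhp thread; the field names are made up. */
struct task_model {
	unsigned int state;
	bool should_park;     /* models KTHREAD_SHOULD_PARK */
	bool park_completed;  /* models kthread_park_complete() having run */
};

/* Replay the steps in the order of the diagram; return the final state. */
static unsigned int replay_park_race(struct task_model *t)
{
	bool wakeup_in_flight;

	/* controller: kthread_park() sets the park flag */
	t->should_park = true;

	/* cpuhp: smpboot_thread_fn() marks itself interruptible */
	t->state = SK_INTERRUPTIBLE;

	/* controller: wake_up_process() passes the (p->state & state) check */
	wakeup_in_flight = (t->state & SK_NORMAL) != 0;

	/* cpuhp: kthread_parkme() sets the parked state and schedules out */
	if (t->should_park)
		t->state = SK_PARKED;

	/* cpuhp: finish_lock_switch() sees the parked state, completes the park */
	if (t->state == SK_PARKED)
		t->park_completed = true;

	/* controller: gets the rq lock and finishes the wakeup */
	if (wakeup_in_flight)
		t->state = SK_RUNNING;

	return t->state;
}
```

Starting from a zeroed struct task_model, replay_park_race() ends with park_completed set while the state is SK_RUNNING rather than SK_PARKED, which is the inconsistent combination the diagram describes.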
Regards
Gaurav
On 5/7/2018 4:53 PM, Kohli, Gaurav wrote:
Corrected the formatting, sorry for the spam.
Hi Peter,
We have tested with the new patch and are still seeing the same issue. In these
dumps we don't have debug traces, but from code review it seems a race
still exists. Can you please check it once:
Controller Thread                               CPUHP Thread
takedown_cpu
kthread_park
kthread_parkme
Set KTHREAD_SHOULD_PARK
                                                smpboot_thread_fn
                                                set Task interruptible
wake_up_process
                                                kthread_parkme
                                                SET TASK_PARKED
                                                schedule
                                                raw_spin_lock(&rq->lock)
                                                context_switch
                                                finish_lock_switch
                                                Case TASK_PARKED
                                                kthread_park_complete
SET TASK_INTERRUPTIBLE
We are also seeing the same warning during unpark of the cpuhp thread from the controller:
        if (!wait_task_inactive(p, state)) {
                WARN_ON(1);
                return;
        }
[  325.065893] [<ffffff8920ed0200>] kthread_unpark+0x80/0xd8
[  325.065902] [<ffffff8920eab754>] bringup_cpu+0xa0/0x12c
[  325.065910] [<ffffff8920eaae90>] cpuhp_invoke_callback+0xb4/0x5c8
[  325.065917] [<ffffff8920eabd98>] cpuhp_up_callbacks+0x3c/0x154
[  325.065924] [<ffffff8920ead220>] _cpu_up+0x134/0x208
[  325.065931] [<ffffff8920ead45c>] do_cpu_up+0x168/0x1a0
[  325.065938] [<ffffff8920ead4b8>] cpu_up+0x24/0x30
[  325.065948] [<ffffff89215b1408>] cpu_subsys_online+0x20/0x2c
[  325.065956] [<ffffff89215aac64>] device_online+0x70/0xb4
[  325.065962] [<ffffff89215aad78>] online_store+0xd0/0xdc
[  325.065971] [<ffffff89215a7424>] dev_attr_store+0x40/0x54
[  325.065982] [<ffffff89210d8a98>] sysfs_kf_write+0x5c/0x74
[  325.065988] [<ffffff89210d7b9c>] kernfs_fop_write+0xcc/0x1ec
[  325.065999] [<ffffff8921049288>] vfs_write+0xb4/0x1d0
[  325.066006] [<ffffff892104a858>] SyS_write+0x60/0xc0
[  325.066014] [<ffffff8920e83770>] el0_svc_naked+0x24/0x28
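
A rough user-space sketch of why this check would warn, assuming wait_task_inactive() returns 0 when it is asked to match a state the task is not in. The names unpark_task_model, wait_inactive_model() and unpark_would_warn(), and the ST_* values, are made up for illustration and are not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative state values; not the kernel's definitions. */
enum { ST_RUNNING = 0x00, ST_PARKED = 0x40 };

/* Hypothetical stand-in for the task being unparked. */
struct unpark_task_model {
	unsigned int state;
};

/* Models only the state-match part of wait_task_inactive(). */
static unsigned long wait_inactive_model(const struct unpark_task_model *p,
					 unsigned int match_state)
{
	if (match_state && p->state != match_state)
		return 0;	/* task is not in the state the caller expects */
	return 1;
}

/* True when the unpark path in this model would reach WARN_ON(1). */
static bool unpark_would_warn(const struct unpark_task_model *p)
{
	return !wait_inactive_model(p, ST_PARKED);
}
```

In this model a task left in ST_RUNNING makes unpark_would_warn() return true, while a task that is really in ST_PARKED does not.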
And after this, the same crash occurred:
[  325.521307] [<ffffff8920ed4aac>] smpboot_thread_fn+0x26c/0x2c8
[  325.527295] [<ffffff8920ecfb24>] kthread+0xf4/0x108
I will add more ftrace debugging to check exactly what is going on.
Regards
Gaurav
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.