Re: [PATCH 0/8] sched: Fix hot-unplug regressions

From: Paul E. McKenney
Date: Sat Jan 16 2021 - 12:09:49 EST


On Sat, Jan 16, 2021 at 04:25:58PM +0100, Peter Zijlstra wrote:
> On Sat, Jan 16, 2021 at 12:30:33PM +0100, Peter Zijlstra wrote:
> > Hi,
> >
> > These patches (no longer 4) seem to fix all the hotplug regressions as per
> > nearly 100 18*SRCU-P runs over-night.
> >
> > I did clean up the patches, so possibly I wrecked it again. I've started new
> > runs and will again leave them running over-night.
>
> Hurph... I've got one splat from this version, one I've not seen before:
>
> [ 68.712848] Dying CPU not properly vacated!
> ...
> [ 68.744448] CPU1 enqueued tasks (2 total):
> [ 68.745018] pid: 14, name: rcu_preempt
> [ 68.745557] pid: 18, name: migration/1
>
> Paul, rcu_preempt, is from rcu_spawn_gp_kthread(), right? Afaict that
> doesn't even have affinity.. /me wonders HTH that ended up on the
> runqueue so late.

Yes, rcu_preempt is from rcu_spawn_gp_kthread(), and you are right that
the kernel code does not bind it anywhere. If this is rcutorture,
there isn't enough of a userspace to do the binding there, either.
Wakeups for the rcu_preempt task can happen in odd places, though.

Grasping at straws... Would Frederic's series help? This is in
-rcu here:

cfd941c rcu/nocb: Detect unsafe checks for offloaded rdp
028d407 rcu: Remove superfluous rdp fetch
38e216a rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers
53775fd rcu/nocb: Perform deferred wake up before last idle's need_resched() check
1fbabce rcu/nocb: Trigger self-IPI on late deferred wake up before user resume
2856844 entry: Explicitly flush pending rcuog wakeup before last rescheduling points
4d959df sched: Report local wake up on resched blind zone within idle loop
2617331 entry: Report local wake up on resched blind zone while resuming to user
79acd12 timer: Report ignored local enqueue in nohz mode

I have been including these in all of my tests of your patches.

Thanx, Paul