Re: How to turn scheduler tick on for current nohz_full CPU?
From: Paul E. McKenney
Date: Sun Aug 18 2019 - 22:20:53 EST

On Tue, Jul 30, 2019 at 10:36:37AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 30, 2019 at 06:43:10PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jul 29, 2019 at 03:32:38PM -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 24, 2019 at 06:12:43PM -0700, Paul E. McKenney wrote:
> > >
> > > The patch below (which includes your patch) does help considerably.
> > > However, it does have some shortcomings:
> > >
> > > 1.  Adds an atomic operation (albeit a cache-local one) to
> > >     the scheduler fastpath. One approach would be to have
> > >     a way of testing this bit and clearing it only if set.
> > >
> > >     Another approach would be to instead clear it on the
> > >     transition to nohz_full userspace or to idle.
> >
> > Well, the latter would be costly as it is going to restart the tick on every
> > user -> kernel transition.
>
> You lost me on this one. I would be turning off RCU's request to
> maintain the tick on transition to nohz_full userspace or to idle.
> Why would the tick get turned on by a later user->kernel transition?
>
> > > 2.  There are a lot of other places in the kernel that are in
> > >     need of this bit being set. I am therefore considering making
> > >     multi_cpu_stop() or its callers set this bit on all CPUs upon
> > >     entry and clear it upon exit. While in this state, it is
> > >     likely necessary to disable clearing this bit. Or it would
> > >     be necessary to make multi_cpu_stop() repeat clearing the bit
> > >     every so often.
> > >
> > >     As it stands, I have CPU hotplug removal operations taking
> > >     more than 400 seconds.
> > >
> > > 3.  It was tempting to ask for this bit to be tracked on a per-task
> > >     basis, but from what I can see that adds at least as much
> > >     complexity as it removes.
> >
> > Yeah I forgot to answer, you can use tick_dep_set_task() for that.
>
> Ah, good, that would remove my need to clear things on the scheduler
> fastpaths. My guess is that I use both the per-CPU and the per-task
> variant in different places, but testing will tell! ;-)

And after a few bug fixes (one key fix for an embarrassing bug from Joel),
this now passes a 14-hour rcutorture run on the scenario that used to get
RCU CPU stall warnings once or twice per hour. It is still not material
for the upcoming merge window, though.

Thank you, everyone!

Thanx, Paul
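For readers following the "test this bit and clear it only if set" idea
from item 1 above, here is a minimal userspace sketch of that pattern
using C11 atomics. It is not kernel code and not the patch under
discussion: the dep_mask type and the dep_*() helpers are hypothetical
names made up for illustration, and the clear counter exists only so the
effect can be observed. The point is that the common case (bit already
clear) costs a plain load rather than an atomic read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU tick-dependency mask. */
typedef struct {
	atomic_uint mask;
	unsigned long atomic_clears;	/* atomic RMW clears issued (illustration only) */
} dep_mask;

static void dep_init(dep_mask *d)
{
	atomic_init(&d->mask, 0u);
	d->atomic_clears = 0;
}

/* Request the tick by setting one dependency bit. */
static void dep_set(dep_mask *d, unsigned int bit)
{
	assert(bit < 32);
	atomic_fetch_or_explicit(&d->mask, 1u << bit, memory_order_release);
}

/* Report whether the given dependency bit is currently set. */
static bool dep_test(dep_mask *d, unsigned int bit)
{
	assert(bit < 32);
	return atomic_load_explicit(&d->mask, memory_order_acquire) & (1u << bit);
}

/*
 * Fastpath-friendly clear: do a relaxed load first and issue the atomic
 * read-modify-write only when the bit is actually set.
 * Returns true if an atomic clear was issued.
 */
static bool dep_clear_if_set(dep_mask *d, unsigned int bit)
{
	assert(bit < 32);
	unsigned int m = 1u << bit;

	if (!(atomic_load_explicit(&d->mask, memory_order_relaxed) & m))
		return false;	/* common case: no atomic operation at all */

	atomic_fetch_and_explicit(&d->mask, ~m, memory_order_release);
	d->atomic_clears++;
	return true;
}
```

If a concurrent dep_set() lands just after the relaxed load sees the bit
clear, the clear is simply skipped and the bit stays set, which is the
same outcome as an unconditional clear that ran slightly earlier.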