Re: [PATCH 11/32] nohz/cpuset: Don't turn off the tick if rcu needs it

From: Frederic Weisbecker
Date: Wed Mar 28 2012 - 08:39:19 EST


On Tue, Mar 27, 2012 at 05:21:34PM +0200, Gilad Ben-Yossef wrote:
> On Thu, Mar 22, 2012 at 6:18 PM, Christoph Lameter <cl@xxxxxxxxx> wrote:
> > On Thu, 22 Mar 2012, Gilad Ben-Yossef wrote:
> >
> >> > Is there any way for userspace to know that the tick is not off yet due to
> >> > this? It would make sense for us to have a busy loop in user space that
> >> > waits until the OS has completed all processing if that avoids future
> >> > latencies for the application.
> >> >
> >>
> >> I previously suggested having the user register to receive a signal when
> >> the tick is turned off. Since the tick is always turned off while the user
> >> task is the current task by design, *I think* you can simply mark the
> >> signal pending when you turn the tick off.
> >
> > Ok that sounds good. You would define a new signal for this?
> >
>
> My gut instinct is to let the process register a specific signal (probably
> in the RT range) that it wants to receive when the tick goes off and/or on.

Note the signal itself could trigger an event that could restart the tick.
Calling call_rcu() is sufficient for that. We can probably optimize that
one day by assigning another CPU to handle the callbacks of a tickless
CPU but for now...
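
To make the userspace side of that proposal concrete, here is a rough
sketch, not a real interface. The kernel registration step does not exist
anywhere in this thread, so it is left out entirely; the signal number and
the function names (tick_off_signo, tick_notify_prepare, tick_notify_wait)
are made up for illustration. The sketch only shows a task blocking an RT
signal and collecting it synchronously with sigtimedwait() instead of
running an asynchronous handler. It does not address the caveat above that
delivering the signal may itself restart the tick.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <time.h>

/* Hypothetical choice: nothing in the thread names a signal number. */
int tick_off_signo(void)
{
	return SIGRTMIN + 1;
}

/*
 * Block the notification signal so that it stays pending until the
 * task collects it synchronously. Returns 0 on success, -1 on error.
 */
int tick_notify_prepare(int signo)
{
	sigset_t set;

	assert(signo >= SIGRTMIN && signo <= SIGRTMAX);
	sigemptyset(&set);
	sigaddset(&set, signo);
	return sigprocmask(SIG_BLOCK, &set, NULL);
}

/*
 * Wait up to timeout_ms for the notification.
 * Returns the signal number, or -1 on timeout or error.
 */
int tick_notify_wait(int signo, long timeout_ms)
{
	sigset_t set;
	siginfo_t info;
	struct timespec ts = {
		.tv_sec = timeout_ms / 1000,
		.tv_nsec = (timeout_ms % 1000) * 1000000L,
	};

	sigemptyset(&set);
	sigaddset(&set, signo);
	return sigtimedwait(&set, &info, &ts);
}
```

An application would call tick_notify_prepare() after its prep work and
then tick_notify_wait() before entering its low latency phase. Without any
kernel support, the only way to exercise this is for the task to raise()
the signal itself as a stand-in for the kernel.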

>
> > So we would startup the application. App will do all prep work (memory
> > allocation, device setup etc etc) and then wait for the signal to be
> > received. After that it would enter the low latency processing phase.
> >
> > Could we also get a signal if something disrupts the peace and switches
> > the timer interrupt on again?
> >
>
> I think you'll have to, since once you have the tick turned off there is no
> guarantee that it won't get turned on by a timer scheduling a task or an IPI.

The problem with this scheme is that if the task is running with the
guarantee that nothing is going to disturb it (it assumes so when it
is notified that the timer is stopped), can it seriously recover from
the fact the timer has been restarted once it gets notified about it?

I have a hard time imagining that. It's like an RT task running a
critical part that suddenly receives a notification from the kernel that
says "what's up dude? hey by the way you're not real time anymore" :)
How are we recovering from that?

Maybe instead of focusing on these notifications, we should try hard to
shut down the tick before we reach userspace: delegate RCU work
to another CPU, avoid needless IPIs, avoid needless timer list timers, etc...
Fix those things one by one such that we can configure things to the point we
get closer to a guarantee of CPU isolation.

Does that sound reasonable?
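
For what it's worth, one way to check from userspace how close a given
configuration gets to that, without any new kernel notification, is a busy
loop along the lines Christoph mentioned. The sketch below is only an
illustration under my own assumptions: the function names (now_ns,
max_gap_ns) are made up, and a large gap between two consecutive clock
reads is merely a hint that something (tick, IPI, timer) interrupted the
task. It cannot tell which one it was.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Spin for roughly duration_ns and return the largest gap observed
 * between two consecutive clock reads, in nanoseconds.
 */
uint64_t max_gap_ns(uint64_t duration_ns)
{
	uint64_t start, prev, worst = 0;

	assert(duration_ns > 0);
	start = prev = now_ns();

	for (;;) {
		uint64_t cur = now_ns();

		if (cur - prev > worst)
			worst = cur - prev;
		prev = cur;
		if (cur - start >= duration_ns)
			break;
	}
	return worst;
}
```

Running max_gap_ns() on the isolated CPU before and after each fix would
give a crude number to compare, which fits the "one by one" approach better
than a signal the task cannot really act upon.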

