Re: [ANNOUNCE] 3.8-rc6-nohz4

From: Frederic Weisbecker
Date: Fri Feb 08 2013 - 10:51:36 EST


2013/2/7 Ingo Molnar <mingo@xxxxxxxxxx>:
>
> * Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>
>> 2013/2/7 Ingo Molnar <mingo@xxxxxxxxxx>:
>> >
>> > * Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>> >
>> >> I'll reply to this as I come up with comments.
>> >>
>> >> First thing is, don't call it NO_HZ_FULL. A better name would
>> >> be NO_HZ_CPU. I would like to reserve NO_HZ_FULL when we
>> >> totally remove jiffies :-)
>> >
>> > I don't think we want yet another config option named in a
>> > weird way.
>> >
>> > What we want instead is to just split NO_HZ up into its
>> > conceptual parts:
>> >
>> > CONFIG_NO_HZ_IDLE
>>
>> Renaming CONFIG_NO_HZ to CONFIG_NO_HZ_IDLE is something I
>> considered. I was just worried about this option being present
>> in many defconfig.
>
> I don't think renaming it is an option - it's present not just
> in defconfigs, but in various distro configs, etc.
>
> But we can add new config variables and use the existing
> CONFIG_NO_HZ value to set their default values.

Sure.
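
For illustration only, here is a minimal Kconfig sketch of that idea.
Nothing in it comes from an actual patch: NO_HZ_IDLE is the name suggested
above, NO_HZ_FULL is the name still being debated, and the prompt strings,
the dependency and the "default n" are assumptions.

```kconfig
# The existing NO_HZ symbol stays as it is, so defconfigs and distro
# configs that already set it keep working.

config NO_HZ_IDLE
	bool "Stop the tick on idle CPUs (hypothetical prompt)"
	# Inherit whatever the existing option was set to.
	default NO_HZ

config NO_HZ_FULL
	bool "Stop the tick on busy CPUs as well (hypothetical prompt)"
	depends on NO_HZ_IDLE
	# Assumption: the new, more invasive behaviour stays opt-in.
	default n
```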

>> Note on my tree I stop the tick on both rings. I believe that
>> restarting the tick on kernel entry isn't something we should
>> seriously consider. It would be a costly operation that may
>> make things worse. And in fact there is no big difference.
>> Just kernelspace has more opportunities to be disturbed (RCU
>> IPIs, async timer/work scheduled by the kernel, etc...) and
>> get its tick restarted sometimes.
>
> Ok.
>
> Could we just simplify things and make this an unconditional
> option of NO_HZ? Any reason why we'd want to make this
> configurable, other than debugging?
>
> I'm worried about the proliferation of not easily separable
> config options. We already have way too many timer and scheduler
> options to begin with.

Like Steve said, this is for overhead reasons. Syscalls use the slow
path, so that's OK. But we add a callback to every exception, irq
entry/exit, scheduler context switch, signal handling path, and user and
kernel preemption point. All of this could be reduced with static keys,
but even that doesn't make me comfortable with the idea.
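
To make the overhead argument concrete, here is a rough userspace sketch of
a hook that sits on every entry/exit path and is gated by a flag. It is not
kernel code and does not use the real static key API: a plain bool stands in
for the key, so the disabled case still costs a load and a branch, whereas a
real static key patches the branch out. All names (tick_hook, fake_irq,
run_irqs) are made up for the example.

```c
#include <stdbool.h>

/* Stand-in for a static key. In the kernel this would be a patched jump
 * label, not a variable tested at run time. */
static bool tick_hooks_enabled;

/* Counts how often the (hypothetical) callback really ran. */
static unsigned long hook_calls;

/* Hypothetical callback body, e.g. user/kernel transition bookkeeping. */
static void tick_hook_slowpath(void)
{
	hook_calls++;
}

/* What every instrumented site would call. When the feature is off, the
 * remaining cost is the test below, paid at each site. */
static inline void tick_hook(void)
{
	if (tick_hooks_enabled)
		tick_hook_slowpath();
}

/* One simulated interrupt: a hook on entry and another one on exit. */
static void fake_irq(void)
{
	tick_hook();	/* irq entry */
	/* ... the handler body would run here ... */
	tick_hook();	/* irq exit */
}

/* Simulate n interrupts and return how many callbacks actually ran. */
unsigned long run_irqs(bool enabled, int n)
{
	tick_hooks_enabled = enabled;
	hook_calls = 0;
	for (int i = 0; i < n; i++)
		fake_irq();
	return hook_calls;
}
```

Calling run_irqs(false, n) runs no callbacks, but each simulated interrupt
still goes through the check twice; run_irqs(true, n) runs 2 * n of them.
The same pattern would repeat for exceptions, context switches, signal
handling and preemption points.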

Moreover, for now this is only going to be used for extreme use cases
such as real time and HPC. If we really have to merge this into an
all-in-one nohz Kconfig option, I suggest we wait for the feature to
mature a bit and prove that it is useful beyond those specialized
workloads, and also that its overhead when switched off is not
significant.