Re: [PATCH RFC nohz_full 2/7] nohz_full: Add rcu_dyntick data for scalable detection of all-idle state

From: Frederic Weisbecker
Date: Sun Aug 04 2013 - 21:26:59 EST


On Fri, Jul 26, 2013 at 04:19:19PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
>
> This commit adds fields to the rcu_dyntick structure that are used to
> detect idle CPUs. These new fields differ from the existing ones in
> that the existing ones consider a CPU executing in user mode to be idle,
> where the new ones consider CPUs executing in user mode to be busy.
> The handling of these new fields is otherwise quite similar to that for
> the existing fields. This commit also adds the initialization required
> for these fields.
>
> So, why is usermode execution treated differently, with RCU considering
> it a quiescent state equivalent to idle, while in contrast the new
> full-system idle state detection considers usermode execution to be
> non-idle?
>
> It turns out that although one of RCU's quiescent states is usermode
> execution, it is not a full-system idle state. This is because the
> purpose of the full-system idle state is not RCU, but rather determining
> when accurate timekeeping can safely be disabled. Whenever accurate
> timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one
> CPU must keep the scheduling-clock tick going. If even one CPU is
> executing in user mode, accurate timekeeping is required, particularly for
> architectures where gettimeofday() and friends do not enter the kernel.
> Only when all CPUs are really and truly idle can accurate timekeeping be
> disabled, allowing all CPUs to turn off the scheduling clock interrupt,
> thus greatly improving energy efficiency.
>
> This naturally raises the question "Why is this code in RCU rather than in
> timekeeping?", and the answer is that RCU has the data and infrastructure
> to efficiently make this determination.
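
The distinction drawn above can be sketched in a few lines of C. This is
only an illustration under stated assumptions: the enum and the three
helper functions are hypothetical names invented for this sketch, not
the rcu_dyntick fields the patch adds, and the linear scan stands in for
the scalable detection that the patch series is actually about.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU execution states; not the kernel's actual types. */
enum cpu_state { CPU_IDLE, CPU_USER, CPU_KERNEL };

/* RCU's view: idle and usermode execution are both quiescent states. */
static bool rcu_quiescent(enum cpu_state s)
{
	return s == CPU_IDLE || s == CPU_USER;
}

/* Full-system-idle view: usermode is busy, only true idle counts. */
static bool sysidle_cpu_idle(enum cpu_state s)
{
	return s == CPU_IDLE;
}

/*
 * Accurate timekeeping may be disabled only when every CPU is truly idle.
 * A naive O(ncpus) scan, used here purely to show the predicate.
 */
static bool full_system_idle(const enum cpu_state *cpus, size_t ncpus)
{
	for (size_t i = 0; i < ncpus; i++)
		if (!sysidle_cpu_idle(cpus[i]))
			return false;
	return true;
}
```

With one CPU idle and another in user mode, rcu_quiescent() holds for
both CPUs, yet full_system_idle() returns false, so at least one CPU
would have to keep the scheduling-clock tick going.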

Right, and it's somewhat disturbing that this code lives in RCU, but yeah,
the infrastructure is there.

It would perhaps be neater to have a specific RCU flavour for which the
only quiescent state is when the system is fully idle. But like you said,
iterating over another RCU flavour adds overhead, whereas we can piggyback
on the traditional RCU flavour since it often has callbacks to handle
anyway. Too bad.

Anyway, Acked-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>

Thanks.