Re: [PATCH RFC nohz_full v2 6/7] nohz_full: Add full-system-idle state machine

From: Frederic Weisbecker
Date: Mon Jul 01 2013 - 17:38:43 EST


On Fri, Jun 28, 2013 at 01:10:21PM -0700, Paul E. McKenney wrote:
> +/*
> + * Check to see if the system is fully idle, other than the timekeeping CPU.
> + * The caller must have disabled interrupts.
> + */
> +bool rcu_sys_is_idle(void)

Where is this function called? I can't find any caller in the patchset.

> +{
> + static struct rcu_sysidle_head rsh;
> + int rss = ACCESS_ONCE(full_sysidle_state);
> +
> + WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu);
> +
> + /* Handle small-system case by doing a full scan of CPUs. */
> + if (nr_cpu_ids <= RCU_SYSIDLE_SMALL && rss < RCU_SYSIDLE_FULL) {
> + int cpu;
> + bool isidle = true;
> + unsigned long maxj = jiffies - ULONG_MAX / 4;
> + struct rcu_data *rdp;
> +
> + /* Scan all the CPUs looking for nonidle CPUs. */
> + for_each_possible_cpu(cpu) {
> + rdp = per_cpu_ptr(rcu_sysidle_state->rda, cpu);
> + rcu_sysidle_check_cpu(rdp, &isidle, &maxj);
> + if (!isidle)
> + break;
> + }
> + rcu_sysidle_report(rcu_sysidle_state, isidle, maxj);
> + rss = ACCESS_ONCE(full_sysidle_state);
> + }
> +
> + /* If this is the first observation of an idle period, record it. */
> + if (rss == RCU_SYSIDLE_FULL) {
> + rss = cmpxchg(&full_sysidle_state,
> + RCU_SYSIDLE_FULL, RCU_SYSIDLE_FULL_NOTED);
> + return rss == RCU_SYSIDLE_FULL;
> + }
> +
> + smp_mb(); /* ensure rss load happens before later caller actions. */
> +
> + /* If already fully idle, tell the caller (in case of races). */
> + if (rss == RCU_SYSIDLE_FULL_NOTED)
> + return true;
> +
> + /*
> + * If we aren't there yet, and a grace period is not in flight,
> + * initiate a grace period. Either way, tell the caller that
> + * we are not there yet.
> + */
> + if (nr_cpu_ids > RCU_SYSIDLE_SMALL &&
> + !rcu_gp_in_progress(rcu_sysidle_state) &&
> + !rsh.inuse && xchg(&rsh.inuse, 1) == 0)
> + call_rcu(&rsh.rh, rcu_sysidle_cb);

So this starts an RCU/RCU_preempt grace period to force the global idle
detection.
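
To check that I read the guard in front of call_rcu() correctly, here is
a minimal userspace sketch of it using C11 atomics. It is not the kernel
code: struct sysidle_head, sysidle_try_queue() and sysidle_cb_done() are
made-up names, the nr_cpu_ids check is left out, and I am assuming that
the callback is what clears ->inuse again, which is not visible in the
quoted hunk.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rcu_sysidle_head; not the kernel type. */
struct sysidle_head {
	atomic_int inuse;	/* nonzero while a callback is queued */
	int queued;		/* how many times we "called call_rcu()" */
};

/*
 * Try to queue the callback. The plain load is a cheap early-out;
 * the exchange is what actually picks a single winner if two callers
 * race past the load.
 */
static bool sysidle_try_queue(struct sysidle_head *h, bool gp_in_progress)
{
	if (gp_in_progress)
		return false;	/* a grace period is already in flight */
	if (atomic_load(&h->inuse) != 0)
		return false;	/* callback already queued */
	if (atomic_exchange(&h->inuse, 1) != 0)
		return false;	/* lost the race to another caller */
	h->queued++;		/* stands in for call_rcu(&rsh.rh, ...) */
	return true;
}

/* Assumption: the callback releases the head once the grace period ends. */
static void sysidle_cb_done(struct sysidle_head *h)
{
	atomic_store(&h->inuse, 0);
}
```

With this, a second attempt while the first callback is still pending
returns false, and queueing becomes possible again only after
sysidle_cb_done() has run.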

Would it make sense to create a new RCU flavour for this purpose instead?
Its only per-CPU quiescent state would be the timekeeping CPU's tick
(from rcu_check_callbacks()). The other CPUs would only complete their
QS requests through extended quiescent states, i.e. only the timekeeping
CPU is burdened.

This way you can enqueue a callback that is executed at the end of the
grace period for that flavour, and that callback can help drive the
state machine somehow.
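
Roughly what I have in mind, as a userspace toy model rather than kernel
code. Every name below (toy_flavour, toy_tick(), toy_enter_eqs(), ...)
is made up for illustration, and the model is simplified: a CPU that is
already in an extended quiescent state when the grace period starts
still has to be reported explicitly here.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical model of the proposed flavour; not an existing RCU API. */
struct toy_flavour {
	int timekeeper;			/* CPU that owns the tick */
	bool gp_active;			/* a grace period is in flight */
	bool qs[TOY_NR_CPUS];		/* per-CPU quiescent state seen */
	void (*cb)(void *arg);		/* runs when the grace period ends */
	void *arg;
};

/* Start a grace period and remember the callback to run at its end. */
static void toy_start_gp(struct toy_flavour *f, void (*cb)(void *), void *arg)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		f->qs[cpu] = false;
	f->cb = cb;
	f->arg = arg;
	f->gp_active = true;
}

/* End the grace period once every CPU has reported a quiescent state. */
static void toy_try_end_gp(struct toy_flavour *f)
{
	if (!f->gp_active)
		return;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (!f->qs[cpu])
			return;
	f->gp_active = false;
	f->cb(f->arg);
}

/* A tick counts as a quiescent state on the timekeeping CPU only. */
static void toy_tick(struct toy_flavour *f, int cpu)
{
	if (cpu != f->timekeeper)
		return;		/* other CPUs are not burdened */
	f->qs[cpu] = true;
	toy_try_end_gp(f);
}

/* Any other CPU reports only by entering an extended quiescent state. */
static void toy_enter_eqs(struct toy_flavour *f, int cpu)
{
	if (cpu < 0 || cpu >= TOY_NR_CPUS || cpu == f->timekeeper)
		return;
	f->qs[cpu] = true;
	toy_try_end_gp(f);
}

/* Example callback: advance a (hypothetical) state machine by one step. */
static void toy_advance(void *arg)
{
	int *steps = arg;

	(*steps)++;
}
```

In this model the callback runs only after the timekeeping CPU has
ticked and every other CPU has passed through an extended quiescent
state, so each invocation of toy_advance() is one observation that the
rest of the system went idle.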

But maybe that's not a good idea, because it adds some overhead to
any code that uses for_each_rcu_flavour().


> + return false;
> }
>
> /*
> @@ -2494,6 +2734,21 @@ static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq)
> {
> }
>
> +static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
> + unsigned long *maxj)
> +{
> +}
> +
> +static bool is_sysidle_rcu_state(struct rcu_state *rsp)
> +{
> + return false;
> +}
> +
> +static void rcu_sysidle_report(struct rcu_state *rsp, int isidle,
> + unsigned long maxj)
> +{
> +}
> +
> static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
> {
> }
> --
> 1.8.1.5
>