Re: [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF fromboot-time parameter

From: Josh Triplett
Date: Fri Jun 15 2012 - 17:43:20 EST


On Fri, Jun 15, 2012 at 02:05:56PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
>
> Although making RCU_FANOUT_LEAF a kernel configuration parameter rather
> than a fixed constant makes it easier for people to decrease cache-miss
> overhead for large systems, it is of little help for people who must
> run a single pre-built kernel binary.
>
> This commit therefore allows the value of RCU_FANOUT_LEAF to be
> increased (but not decreased!) via a boot-time parameter named
> rcutree.rcu_fanout_leaf.
>
> Reported-by: Mike Galbraith <efault@xxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> ---
> Documentation/kernel-parameters.txt | 4 ++
> kernel/rcutree.c | 97 ++++++++++++++++++++++++++++++-----
> kernel/rcutree.h | 23 +++++----
> kernel/rcutree_plugin.h | 4 +-
> kernel/rcutree_trace.c | 2 +-
> 5 files changed, 104 insertions(+), 26 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index c45513d..88bd3ef 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2367,6 +2367,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> Set maximum number of finished RCU callbacks to process
> in one batch.
>
> + rcutree.fanout_leaf= [KNL,BOOT]
> + Set maximum number of finished RCU callbacks to process
> + in one batch.

Copy-paste problem.

> rcutree.qhimark= [KNL,BOOT]
> Set threshold of queued
> RCU callbacks over which batch limiting is disabled.
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 0da7b88..a151184 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -60,17 +60,10 @@
>
> /* Data structures. */
>
> -static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
> +static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];

I assume that the requirement to only increase the fanout and never
decrease it comes from the desire to not increase the sizes of all of
these arrays to MAX_RCU_LVLS?

> +/*
> + * Compute the rcu_node tree geometry from kernel parameters. This cannot
> + * replace the definitions in rcutree.h because those are needed to size
> + * the ->node array in the rcu_state structure.
> + */
> +static void __init rcu_init_geometry(void)
> +{
> + int i;
> + int j;
> + int n = NR_CPUS;
> + int rcu_capacity[MAX_RCU_LVLS + 1];
> +
> + /* If the compile-time values are accurate, just leave. */
> + if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF)
> + return;
> +
> + /*
> + * Compute number of nodes that can be handled an rcu_node tree
> + * with the given number of levels. Setting rcu_capacity[0] makes
> + * some of the arithmetic easier.
> + */
> + rcu_capacity[0] = 1;
> + rcu_capacity[1] = rcu_fanout_leaf;
> + for (i = 2; i <= MAX_RCU_LVLS; i++)
> + rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;
> +
> + /*
> + * The boot-time rcu_fanout_leaf parameter is only permitted
> + * to increase the leaf-level fanout, not decrease it. Of course,
> + * the leaf-level fanout cannot exceed the number of bits in
> + * the rcu_node masks. Finally, the tree must be able to accommodate
> + * the configured number of CPUs. Complain and fall back to the
> + * compile-timer values if these limits are exceeded.

Typo: s/timer/time/

> + */
> + if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
> + rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
> + n > rcu_capacity[4]) {

4 seems like a magic number here; did you mean MAX_RCU_LVLS or similar?

Also, why have n as a variable when it never changes?

> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -42,28 +42,28 @@
> #define RCU_FANOUT_4 (RCU_FANOUT_3 * CONFIG_RCU_FANOUT)
>
> #if NR_CPUS <= RCU_FANOUT_1
> -# define NUM_RCU_LVLS 1
> +# define RCU_NUM_LVLS 1

I assume you made this change to make it easier to track down all the
uses of the macro to change them; however, having now done so, the
change itself seems rather gratuitous, and inconsistent with the other
macros. Would you consider changing it back?

> +extern int rcu_num_lvls;
> +extern int rcu_num_nodes;

Given the above, you might also want to change these for consistency.

Also, have you checked the various loops using these variables to figure
out if GCC emits less optimal code now that it can't rely on a
compile-time constant? I don't expect it to make much of a difference,
but it seems worth checking.

You might also consider marking these as __read_mostly, at a minimum.

> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 2411000..e9b44c3 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -68,8 +68,10 @@ static void __init rcu_bootup_announce_oddness(void)
> printk(KERN_INFO "\tAdditional per-CPU info printed with stalls.\n");
> #endif
> #if NUM_RCU_LVL_4 != 0
> - printk(KERN_INFO "\tExperimental four-level hierarchy is enabled.\n");
> + printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");

This change seems entirely unrelated to this patch. Seems simple enough
to split it into a separate one-line patch ("Mark four-level hierarchy
as no longer experimental").

> #endif
> + if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
> + printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout.\n");

You might consider printing rcu_fanout_leaf in this message.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/