Re: [GIT PULL] scheduler fixes

From: Ingo Molnar
Date: Mon May 18 2009 - 15:03:53 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, 18 May 2009, Ingo Molnar wrote:
> > >
> > > hm - i've Cc:-ed Jeff & Viro. The background is that Sparse and LLVM
> > > barfed on the current construct and Al strongly advocated this
> > > solution, see:
> >
> > I know the background.
> >
> > Did you read my email?
> >
> > Did you see my one-line patch that fixes the same problem WITHOUT
> > THE INSANITY?
>
> Yes, i even proposed that very patch in the original discussion, in
> my first reply:
>
> http://lkml.org/lkml/2009/5/8/378

Something like the patch below. It also fixes ->span[], which has a
similar problem.

But ... I think this really needs further cleanups: either go fully
static, or go fully dynamic.

I'd suggest we go fully dynamic: the static structures are never
used directly anyway; they are pointer-ized during sched domain
setup.

The reason for this duality is allocation bootstrap: there's no
generic early-capable allocator that allocates something and
switches from bootmem to kmalloc transparently once the SLAB has
been set up.

It would be nice if bootmem_alloc() were extended with such
properties: if SLAB is up (and bootmem is down), it would return
kmalloc(GFP_KERNEL) memory buffers.

Ingo

-------------->
Subject: sched: properly define the sched_group::cpumask and sched_domain::span fields
From: Ingo Molnar <mingo@xxxxxxx>

Properly document the variable-size structure tricks we are doing
wrt. struct sched_group and struct sched_domain, and use the field[0]
GCC extension instead of defining a VLA.

Don't use unions for this, as pointed out by Linus.

This also un-confuses Sparse and LLVM.

Reported-by: Jeff Garzik <jeff@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
LKML-Reference: <alpine.LFD.2.01.0905180850110.3301@xxxxxxxxxxxxxxxxxxxxx>
Not-Yet-Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
include/linux/sched.h | 25 ++++++++++++++++++++++---
kernel/sched.c | 5 +++--
2 files changed, 25 insertions(+), 5 deletions(-)

Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -846,7 +846,17 @@ struct sched_group {
*/
u32 reciprocal_cpu_power;

- unsigned long cpumask[];
+ /*
+ * The CPUs this group covers.
+ *
+ * NOTE: this field is variable length. (Allocated dynamically
+ * by attaching extra space to the end of the structure,
+ * depending on how many CPUs the kernel has booted up with)
+ *
+ * It can also be embedded into static data structures at build
+ * time. (See 'struct static_sched_group' in kernel/sched.c)
+ */
+ unsigned long cpumask[0];
};

static inline struct cpumask *sched_group_cpus(struct sched_group *sg)
@@ -932,8 +942,17 @@ struct sched_domain {
char *name;
#endif

- /* span of all CPUs in this domain */
- unsigned long span[];
+ /*
+ * Span of all CPUs in this domain.
+ *
+ * NOTE: this field is variable length. (Allocated dynamically
+ * by attaching extra space to the end of the structure,
+ * depending on how many CPUs the kernel has booted up with)
+ *
+ * It can also be embedded into static data structures at build
+ * time. (See 'struct static_sched_domain' in kernel/sched.c)
+ */
+ unsigned long span[0];
};

static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -8049,8 +8049,9 @@ int sched_smt_power_savings = 0, sched_m

/*
* The cpus mask in sched_group and sched_domain hangs off the end.
- * FIXME: use cpumask_var_t or dynamic percpu alloc to avoid wasting space
- * for nr_cpu_ids < CONFIG_NR_CPUS.
+ *
+ * ( See the comments in include/linux/sched.h:struct sched_group
+ * and struct sched_domain. )
*/
struct static_sched_group {
struct sched_group sg;
--