Re: [git pull] tracing fixes

From: Ingo Molnar
Date: Fri Jul 18 2008 - 06:36:28 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> > > CFLAGS_REMOVE_sched_clock.o = -pg
> > > +CFLAGS_REMOVE_sched.o = -mno-spe -pg
> > > endif
> > >
> >
> > Ingo,
> >
> > Why not trace the scheduler functions? I found a lot of useful
> > information from seeing what functions are being called (namely the
> > latencies caused by the fair scheduler balancing). Not being able to
> > trace sched.c seems to keep a lot of useful data from being accessed.
>
> i agree in general, but it was causing lockups with:
>
> http://redhat.com/~mingo/misc/config-Thu_Jul_17_13_34_52_CEST_2008
>
> note the MAXSMP in the config which sets NR_CPUS to 4096:
>
> CONFIG_NR_CPUS=4096
>
> our randconfig testing stumbled on it. That is a debug helper to "tune
> up the kernel for as large systems as possible" and can bring in
> regressions not normally seen.

ok, figured it out today: the lockups were caused by the NMI watchdog
together with missing NMI protection in cpu_clock(). I've reactivated the
topic that solves this problem area and it all works fine now.
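
to illustrate the class of problem (a user-space model only, not the
actual patch): if a clock read takes a lock, and an NMI fires on the same
CPU while that lock is held, an NMI-side read that spins on the lock can
never make progress. One possible form of protection is to skip the lock
in NMI context and return the last published value. All names below
(clock_data, in_nmi_ctx, read_clock and friends) are hypothetical
stand-ins, not kernel APIs:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of per-CPU clock state; not a kernel structure. */
struct clock_data {
	atomic_flag lock;	/* models the lock guarding the clock */
	uint64_t clock;		/* last published clock value */
};

/* Models "are we currently in NMI context?" on this CPU. */
static bool in_nmi_ctx;

static void clock_init(struct clock_data *cd, uint64_t start)
{
	atomic_flag_clear(&cd->lock);
	cd->clock = start;
}

static void clock_lock(struct clock_data *cd)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&cd->lock,
						 memory_order_acquire))
		;
}

static void clock_unlock(struct clock_data *cd)
{
	atomic_flag_clear_explicit(&cd->lock, memory_order_release);
}

/*
 * Return a monotonic clock value. In (simulated) NMI context no lock is
 * taken: the caller gets the last published value, which may be stale
 * but can never deadlock against an interrupted lock holder.
 */
static uint64_t read_clock(struct clock_data *cd, uint64_t now)
{
	uint64_t val;

	if (in_nmi_ctx)
		return cd->clock;

	clock_lock(cd);
	if (now > cd->clock)
		cd->clock = now;	/* never move backwards */
	val = cd->clock;
	clock_unlock(cd);

	return val;
}

/* Simulate an NMI arriving while the same CPU already holds the lock. */
static uint64_t simulate_nmi_during_update(struct clock_data *cd,
					   uint64_t now)
{
	uint64_t seen;

	assert(!in_nmi_ctx);		/* nested NMIs are not modelled */

	clock_lock(cd);			/* normal context is mid-update */
	in_nmi_ctx = true;		/* the "NMI" fires here */
	seen = read_clock(cd, now);	/* must return instead of spinning */
	in_nmi_ctx = false;
	clock_unlock(cd);

	return seen;
}
```

without the in_nmi_ctx check, the read_clock() call inside
simulate_nmi_during_update() would spin forever, which is the kind of
lockup the watchdog can trigger.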

the sched.o change probably made a difference just because it reduced
the cross section between the NMI watchdog and the scheduler, making
lockups less likely during the ftrace self-test. I'll revert it once the
tracing/nmisafe topic is upstream.

Ingo