Re: [PATCH v7 0/8] Tunable sched_mc_power_savings=n

From: Ingo Molnar
Date: Tue Dec 30 2008 - 02:21:01 EST



* Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:

> * Ingo Molnar <mingo@xxxxxxx> [2008-12-30 07:21:39]:
>
> >
> > * Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > > > KERNBENCH Runs: make -j4 on a x86 8 core, dual socket quad core cpu
> > > > > package system
> > > > >
> > > > > SchedMC  Run Time  Package Idle    Energy   Power
> > > > > 0        81.68     52.83% 54.71%   1.00x J  1.00y W
> > > > > 1        80.70     36.62% 70.11%   0.95x J  0.96y W
> > > > > 2        74.95     19.53% 85.92%   0.90x J  0.98y W
> >
> > > > Your result is very interesting.
> > > > Level 2 is both faster and more power-efficient.
> > > >
> > > > What is the major contributor to the shorter run time at level 2?
> > > > I think cache bouncing costs less time than before.
> > > > Is that right?
> > >
> > > Yes, correct
> >
> > yes, i too noticed that runtime improved so dramatically: +7.5% on
> > kernbench is a _very_ big deal.
> >
>
> Yes, it is, I think one way to identify it on the x86 box would be to
> use the performance counter patches, I'll see if I can get it working.

yeah, good idea, and it's easy to get perf-counters working: just boot
tip/master on a Core2 (or later) Intel CPU [see below about how to use
perfcounters on other CPUs], and pick up timec.c:

http://redhat.com/~mingo/perfcounters/timec.c

then run a kernel build like this:

$ ./timec -e -5,-4,-3,0,1,2,3,4,5 make -j16 bzImage

that's all. You should get an array of metrics like this:

[...]
Kernel: arch/x86/boot/bzImage is ready (#28)

Performance counter stats for 'make':

 628315.871980  task clock ticks  (msecs)

         42330  CPU migrations    (events)
        124980  context switches  (events)
      18698292  pagefaults        (events)
 1351875946010  CPU cycles        (events)
 1121901478363  instructions      (events)
   10654788968  cache references  (events)
     633581867  cache misses      (events)
  247508675357  branches          (events)
   21567244144  branch misses     (events)

Wall-clock time elapsed: 118348.109066 msecs
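
[ When comparing runs across sched_mc levels, ratios tend to be easier to
read than raw counts. The following is a minimal sketch, not part of
timec, that derives a few common ones from the sample output above. The
dictionary keys and the ratio() helper are hypothetical names chosen for
illustration; only the numbers come from the output. ]

```python
# Counter values copied from the sample timec output above.
# The key names are illustrative, not something timec emits.
counters = {
    "cpu_migrations": 42330,
    "context_switches": 124980,
    "cpu_cycles": 1351875946010,
    "instructions": 1121901478363,
    "cache_references": 10654788968,
    "cache_misses": 633581867,
    "branches": 247508675357,
    "branch_misses": 21567244144,
}


def ratio(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    return numerator / denominator if denominator else None


# Instructions retired per CPU cycle.
ipc = ratio(counters["instructions"], counters["cpu_cycles"])
# Fraction of cache references that missed.
cache_miss_rate = ratio(counters["cache_misses"], counters["cache_references"])
# Fraction of branches that were mispredicted.
branch_miss_rate = ratio(counters["branch_misses"], counters["branches"])
# CPU migrations relative to context switches.
migrations_per_switch = ratio(
    counters["cpu_migrations"], counters["context_switches"]
)

print(f"instructions per cycle:        {ipc:.3f}")
print(f"cache miss rate:               {cache_miss_rate:.2%}")
print(f"branch miss rate:              {branch_miss_rate:.2%}")
print(f"migrations per context switch: {migrations_per_switch:.3f}")
```

[ Running the same calculation on the output for each sched_mc level
would show whether the migration and cache-miss ratios move along with
the run time. ]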

I'd guess the CPU migrations, context-switches, cycles+instructions
counters are the most interesting ones. I.e. on a Core2 a good metric is
probably:

$ ./timec -e -5,-4,0,1 make -j16 bzImage

[ If you see any weirdnesses in the numbers, then you should first suspect
some perf-counters bug. Please report any problems to us! ]

If your systems are not Core2 based then patches to extend perfcounters
support to those CPUs are welcome as well ;-)

NOTE: the sw counters (migrations, per task cost) are available on all
CPUs. I.e. you can always do:

$ ./timec -e -5,-4,-3 make -j16 bzImage

regardless of CPU type.

And note that the wall-clock and task-clock results in this case will be
far more accurate than normal 'time make -j16 bzImage' output.
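
[ One thing the two clocks give you together is a rough measure of how
many CPUs the build kept busy. A minimal sketch, assuming the task clock
figure is CPU time summed over all tasks of the workload (that is my
reading of the sample output, not something the output states), and using
a hypothetical helper name: ]

```python
# Values copied from the sample timec output earlier in this mail.
task_clock_msecs = 628315.871980
wall_clock_msecs = 118348.109066


def effective_parallelism(task_msecs, wall_msecs):
    """Average number of CPUs kept busy: summed task time over elapsed time.

    Assumes task_msecs is CPU time accumulated across all tasks.
    """
    if wall_msecs <= 0:
        raise ValueError("wall-clock time must be positive")
    return task_msecs / wall_msecs


busy_cpus = effective_parallelism(task_clock_msecs, wall_clock_msecs)
print(f"average CPUs busy: {busy_cpus:.2f}")
```

[ Under that assumption the sample run kept a bit more than five CPUs
busy on average. ]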

Ingo