Re: [PATCH 03/13] jump label v9: x86 support

From: Ingo Molnar
Date: Fri Jun 11 2010 - 04:13:28 EST



* Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

> > > A much better way to get smaller kernel images is to do more __cold
> > > annotations for slow paths. Newer gcc will then simply only do -Os for
> > > these functions.
> >
> > That's an opt-in method and we cannot reach the kinds of 30% code size
> > reductions that -Os can achieve. Most code in the kernel is not cache-hot,
> > even on microbenchmarks.
>
> Maybe, maybe not. But yes it can be approached from both ways.

You don't seem to have understood my point: there's a big difference between an
opt-in and an opt-out model.

What you are arguing for is a 'bloaty code generator by default' model and
that model sucks.

Trying to achieve reductions via opt-in, by marking functions with a 'please
reduce it' __cold marker, is a losing battle: most new kernel code is 'cold'
and should be reduced, yet most new code does not (and will not) come with
__cold markers.

The proper model is to assume that everything should be conservatively
size-reduced (because, almost by definition, 90% of new kernel code should
stay small and should stay out of the way), and where benchmarks+importance
prove it, we can allow bloatier code generation via __hot.

Important codepaths can get __hot annotations just as much as they are
receiving 'inline' optimizations and other kinds of hand-tuning attention.
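
To make the two annotation styles concrete, here is a minimal user-space
sketch, assuming GCC's 'hot' and 'cold' function attributes. The macro
definitions are local stand-ins and the function names (sum_bytes,
report_bad_length) are made up for illustration; none of this is taken from
the patch series under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for illustration; GCC provides the underlying attributes. */
#ifndef __hot
#define __hot __attribute__((hot))
#endif
#ifndef __cold
#define __cold __attribute__((cold))
#endif

/* Hypothetical slow path: rarely executed, so compact code is preferred. */
static __cold int report_bad_length(size_t len)
{
	(void)len;
	return -1;
}

/* Hypothetical fast path: the one place where larger, faster code is OK. */
static __hot int sum_bytes(const unsigned char *buf, size_t len)
{
	int sum = 0;
	size_t i;

	assert(buf != NULL);

	if (len == 0)
		return report_bad_length(len);

	for (i = 0; i < len; i++)
		sum += buf[i];

	return sum;
}
```

In an opt-out model only sum_bytes() would need an annotation, and only
after measurements justify it; everything else stays compact by default.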

> Personally I would prefer to simply write less bloated code to get code
> reductions. Simpler code is often faster too.

You are posing this as an if-else choice, while in reality both should be
done: the best result is to write simpler/faster code _and_ to have a
compact-by-default code generator too ...

> > A much better model would be to actively mark hot codepaths with a __hot
> > attribute instead. Then the code size difference can be considered on a
> > case by case basis.
>
> Yes that works too for those who still use -Os.
>
> e.g. marking the scheduler and a few mm hot paths this way would certainly
> make sense.

Possibly, but not without substantiating the rather vague statements you have
made so far.

If you are sending such per-function annotation patches then you need to come
up with actual hard numbers as well. One convenient way to measure such things
is a before/after "perf stat --repeat" run - as the noise estimates can be
compared and we can see whether there's a provable effect. (And, of course,
disassembly of GCC suckage is helpful as well.)

Thanks,

Ingo