Re: [ANNOUNCE] "Fast Kernel Headers" Tree -v2

From: Ingo Molnar
Date: Wed Jan 19 2022 - 07:31:49 EST



* Arnd Bergmann <arnd@xxxxxxxx> wrote:

> > I tried to avoid as many low level headers as possible from the main
> > types headers - and the get_order() functionality also brings in bitops
> > definitions, which I'm still hoping to be able to reduce from its
> > current ~95% utilization in a distro kernel ...
>
> Agreed, I think reducing bitops.h and atomic.h usage is fairly important,
> I think these are even bigger on arm64 than on x86.

So what I'm using for 'header complexity metrics' is rather simple: passing
-P -H to the preprocessor (comments stripped, no line-markers generated,
included headers listed via -H), and then counting lines.

Line-markers should *probably* remain, because the real build is generating
them too - but I wanted to gain a crude & easily available metric to
measure 'first-pass parsing complexity'. That's where I think most of the
header bloat is concentrated: later passes don't really get any of the
unused header definitions passed along. (But maybe this is an invalid
assumption, because compiler warnings do get generated by later passes, and
they are generated for mostly-unused header inlines too.)

If we include comments & line-markers then the bloat goes up by another
~2x:

kepler:~/mingo.tip.git> ./st include/linux/sched.h
#include <linux/sched.h> | LOC: 2,186 | headers: 118
kepler:~/mingo.tip.git> ./st include/linux/sched.h
#include <linux/sched.h> | LOC: 4,092 | headers: 0


> > We could add <linux/page_api.h> as well, as a standardized header. We
> > already have page_types.h and get_order() is a page types API.
>
> More generally speaking, do you have a plan for how to document which
> header to include for getting a particular symbol that is provided by a
> header we don't want to include directly? I think iwyu has a particular
> notation for it, but when I looked at using that in 2020 I decided it
> wouldn't scale to the size of the kernel. I did my own shell script with
> a long list of regex patterns, but I'm not convinced about that approach
> either.

Yeah, I don't think we should do much that hurts general usability of
headers: each symbol has a primary "natural" header, and .c code and other
headers are encouraged but not strictly required to include that.

Thanks,

Ingo