Re: Rusty's module talk at the Kernel Summit

From: Keith Owens (kaos@ocs.com.au)
Date: Wed Jul 03 2002 - 07:27:33 EST


On Wed, 3 Jul 2002 00:31:35 -0700,
"Adam J. Richter" <adam@yggdrasil.com> wrote:
>On Wed, 03 Jul 2002 15:01:53 +1000, Keith Owens <kaos@ocs.com.au> wrote:
>>Agreed, so let's look at some real figures. The tar ball below contains
>
>> A patch against kernel 2.5.24 to use init sections for module code
>> and data.
>
>> A patch against modutils 2.4.16 to disable error checks. We are not
>> loading the modules, just getting data about their size.
>
>> A Perl script to read the output from the patched insmod and work out
>> what would be saved by discarding init sections.
>
>> Two reports from running the script against 2.5.24 with everything
>> that will build as a module. One report is from discarding both code
>> and data.init, the other report is discarding just data.init.
>
> Cool. Out of curiosity, is there some reason you need a
>patched version of modutils for extracting this information, rather
>than reading the output of "objdump --section-headers"?

It was easier and more accurate to patch insmod to ignore errors than
to replicate all of insmod's processing in another program, especially
since insmod adds data to the module as it is loaded and that data does
not appear in objdump -h.

>>The total saving over all 2.5.24 modules is 4% of the total module
>>sizes, rounded to page boundaries.
>
> As individual space optimizations go, 4% is respectable,
>especially for something that has no cost

It does have a cost. Getting 4% requires arch dependent code to
handle all the tables that are affected by partial text removal. I can
get 2% for nothing by discarding data.init. Discarding text.init is a
lot harder.
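
To make the "rounded to page boundaries" point concrete, here is a rough
sketch of the arithmetic. It is not the Perl script from the tar ball; the
4 KiB page size, the section names and all sizes are assumptions invented
for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on i386


def round_up_to_page(size, page_size=PAGE_SIZE):
    """Round a byte count up to the next page boundary."""
    return (size + page_size - 1) // page_size * page_size


def saving(sections, discard):
    """Return (bytes_before, bytes_after), both rounded up to whole pages.

    sections: dict mapping section name -> size in bytes
    discard:  set of section names that would be freed after init
    """
    total = sum(sections.values())
    kept = sum(size for name, size in sections.items() if name not in discard)
    return round_up_to_page(total), round_up_to_page(kept)


# Hypothetical module; the sizes are made up.
example = {".text": 9000, ".text.init": 1500, ".data": 2000, ".data.init": 600}

before, after_data_only = saving(example, {".data.init"})
_, after_both = saving(example, {".data.init", ".text.init"})
```

In this made-up case, discarding only .data.init frees no pages at all,
because the module still occupies the same number of pages afterwards.
A saving only shows up when the discarded bytes cross a page boundary.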

>>I don't see that the complexity required to adjust the arch dependent
>>tables is worth the small saving.
>
> I don't follow you. Right now, I don't think one would have
>to write any new kernel code to load init sections and the non-init
>sections as two separate kernel modules, but perhaps I'm
>missing something.

The problem is the partial removal of code when there are tables that
point to _all_ the code. Partial code removal requires a lot of work
to find and correct every table that refers to the removed code. To
make it worse, the tables are arch specific. Most architectures have
__ex_table, with a different format for each arch. Some have unwind
data, always in an arch dependent format. MIPS has dbe.

Data is not referenced by any of these tables, so a partial discard of
data is easy, with no side effects to worry about.
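
As a rough illustration of the kind of adjustment involved, the sketch below
prunes a toy exception table after a range of text has been discarded. The
two-field entry (faulting instruction, fixup) is loosely modelled on the
i386 layout; the names ExEntry and prune_ex_table and all addresses are
hypothetical, and real tables differ per arch.

```python
from typing import NamedTuple


class ExEntry(NamedTuple):
    insn: int   # address of the instruction that may fault
    fixup: int  # address to resume at if it does


def prune_ex_table(table, start, end):
    """Drop entries whose insn or fixup lies in the discarded range [start, end)."""
    def gone(addr):
        return start <= addr < end

    return [e for e in table if not (gone(e.insn) or gone(e.fixup))]


# Toy module: main text at 0x1000-0x1fff, init text at 0x2000-0x2fff.
table = [
    ExEntry(0x1000, 0x1800),  # main -> main, survives
    ExEntry(0x2010, 0x2080),  # init -> init, must go
    ExEntry(0x1100, 0x2090),  # main insn with a fixup in init, must go
]

pruned = prune_ex_table(table, 0x2000, 0x3000)
```

This only covers one table in one format. The same work would have to be
repeated for every table that refers to code, on every architecture.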

BTW, this problem exists for removal of __init code from the kernel as
well. The only reason it does not bite us for kernel __init is that
the freed area is not reused for executable code; it is used for
kmalloc, so there is no ambiguity caused by the dangling table data.
With modules there is a distinct risk that the freed code area would be
reused for another module.
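
A minimal sketch of that risk, assuming a simplified lookup that consults
every registered table by faulting address. The function name, the tables
and the addresses are all hypothetical; this is not the kernel's actual
search code.

```python
def search_ex_table(table, fault_addr):
    """Return the fixup address for fault_addr, or None if there is no entry."""
    for insn, fixup in table:
        if insn == fault_addr:
            return fixup
    return None


# Module A left an entry behind for an instruction in its discarded init text.
stale_table_a = [(0x5010, 0x5040)]  # both addresses were in A's init text

# The freed range 0x5000-0x5fff is then reused for module B's code.
# B has no recovery entry for its own instruction at 0x5010.
table_b = []

fault = 0x5010
own = search_ex_table(table_b, fault)                    # no entry in B
wrong = search_ex_table(stale_table_a + table_b, fault)  # A's stale fixup
```

Here a fault in B that should have had no recovery entry picks up A's
dangling entry and would resume at an address that now belongs to B.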

>>I looked at that several years ago and discarded the idea. There may
>>be references from the init code/data to the main code/data. Those
>>references cannot be resolved until the second module has known
>>addresses, which requires insmod to keep track of two modules at once
>>before either can be loaded.
>
> I do not understand how this is a problem. As far as I know,
>there is nothing preventing one from doing two create_module calls
>followed by two init_module calls, so there should be no problem
>allocating the kernel modules. The init module would be loaded first,
>and would not run any initialization routine. So, both modules would
>be in kernel memory before any code was run.

It makes insmod much more complicated: it has to load two modules in
parallel with unresolved references in both directions. I have seen
modules with init code that refers to data and rodata (init -> main)
and modules with references from rodata to init (main -> init).
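
The two-way dependency can be sketched as follows. This is a toy model, not
insmod's relocation code: the resolve() helper, the "init"/"main" labels and
the load addresses are all hypothetical.

```python
def resolve(relocs, bases):
    """Turn cross-part references into absolute addresses.

    relocs: list of (from_part, to_part, offset_in_to_part)
    bases:  dict part -> load address; every referenced part must be present
    """
    resolved = []
    for from_part, to_part, offset in relocs:
        if to_part not in bases:
            raise KeyError(f"{from_part} -> {to_part}: target not allocated yet")
        resolved.append((from_part, bases[to_part] + offset))
    return resolved


relocs = [
    ("init", "main", 0x20),  # init code touching main data or rodata
    ("main", "init", 0x08),  # rodata in main pointing into init
]

# With only one part allocated, the references cannot all be completed.
try:
    resolve(relocs, {"init": 0xC8000000})
    partial_ok = True
except KeyError:
    partial_ok = False

# Once both addresses are known, everything resolves in one pass.
both = resolve(relocs, {"init": 0xC8000000, "main": 0xC8010000})
```

Neither part can be finalised on its own, which is why the loader would
have to hold both in hand until both load addresses are known.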

> As I understand it, __ex_table is just for copy_{to,from}_user,
>which would almost never be done from init sections

__ex_table is used for any code that requires recovery: mainly
copy_{to,from}_user, but not exclusively.

>The core kernel already deals with the same issue.

It does not. There is no code to adjust any tables after discarding
kernel __init sections. We rely on the fact that the discarded kernel
area is not reused for executable text.
