Userland Kernel Download Tool

From: dg50@daimlerchrysler.com
Date: Thu Jan 13 2000 - 12:59:07 EST


Hello everyone,

There's quite a bit of pie-in-the-sky here, but I've had an idea that I'd
like to share.

Sitting on my /usr/src partition is a great big whalloping hunk of data -
the Linux kernel sources. Of that data, only a very small part of it is
actually used in the production of the kernel for my machine. Most of that
space is wasted.

Not only that, but as I had to get that data across the network, a great
whalloping bunch of bandwidth was wasted too, and is continually wasted
each time the kernel is updated. Downloading changes as patches instead of
the whole tree helps some, but it is entirely possible that a given upgrade
- say, 2.2.13 to 2.2.14 - might update files that are never actually used
by me. It's possible - not likely, but possible - that the only thing that
really happened during MY kernel upgrade was that the kernel version number
got bumped.

Now take all that wasted storage space and wasted bandwidth, and multiply
it by World Domination and an increasing number of supported platforms. I
think that Linux on IBM Big Iron is wonderful - but I don't want to
download it and keep it on my P233. ;)

This is certainly not a new idea - every once and a while, someone will
suggest Linux be spit up by arch, or by drivers, or by whatever - and then
Linus and a few others will re-state the problems associated with revision
control etc. splitting up the sources would cause, and the idea dies (or
perhaps "is assassinated" :)

I agree completely here. Developing the Kernel should be kept as easy and
as straightforward as possible. Anything that interferes with this design
goal (as even things like CVS would) is tampering with success and is
probably a bad idea.

So anything that is designed to streamline what a user must download
("user" in this case representing a kernel non-developer) must have
near-zero impact on the existing development process. It must be invisible
and transparent.

Here's the idea I have: create a user-space, server-based tool with a
simple API and protocol that sits on top of an uncompressed kernel source
tree. This server-based tool is talked to by a client (it could be a CGI
program, or a desktop program, or a Java applet or whatever) that presents
a little menu to the user. The user makes a few selections that reflect at
a high level what hardware he has, and the server tool builds a little
custom tarball that contains just the sources needed to compile a kernel
for that user. Ideally, it should also touch the "make fooconfig" files
just enough so that options that require sources that are not provided in
the custom tarball are greyed out or not selectable.

Even better would be an option where the user specifies what current
sources he has, and what is generated is a tarball of the sources that have
changed since his last download - and even better yet, _patches_ to sources
he already has.

This need not be an all-singing, all dancing tool right from the get-go
either. In fact, I think a first kick at this cat might have only one user
option - arch. That alone would reduce download sizes and required storage
by a factor of however many main arches there are right now - 6? 7?. Later
versions could be kept general: "I have a SCSI card" "I have a network
card" "I want sound support" as opposed to "I have an Adaptec Model
8000-K1WqA".

What I worry about though is the meta-data that the program would need to
keep; specifically, the maintainence of it. If the source file
"ataptec_8000-K1WqA.c" belongs in the "SCSI card" option tarball, then
something needs to tie that dependancy together - and creating that
dependancy should not rely on someone (like the developer) explicitly
defining it - because if it does, then there will be developers who either
won't do it or will forget to do it or won't know that they are expected to
do it - and then the process falls apart.

Ideally, the meta-data should be implicit in the kernel source tree. If I
want everything in for intel x86, then I grab linux/arch/x86. If I don't
want SCSI, then I don't grab linux/arch/x86/drivers/devices/SCSI etc.

However, I worry that the current src tree may not be organized that
rigidly, that there may be cross-directory dependancies that cannot be
inferred from a given source code file's position in the filesystem. If
that is the case, then either the program must be made aware of these
dependancies (which means kernel devlopers maintaining them - not good) or
the current source tree must be reorganized so that files in lower
directories only depend on files higher up in their filesystem tree (so
linux/arch/x86/drivers/devices/SCSI/adaptec/ataptec_8000-K1WqA.c may depend
on linux/arch/x86/drivers/devices/SCSI/scsi.c but not
linux/arch/x86/drivers/devices/IDE/disks.c)

I think that the "right way to do it" is to re-org (if necessary) the src
filesystem, as it has benefits other than just enabling a nifty user space
utillity. However, this may be a lot of work for what may be very little
visible gain. If, for example, the choice for 2.4 was to either re-org the
file tree or to get a fully robust, high-speed NFS implementation working,
it's hard to argue against choosing fixing NFS over re-orging the source
tree, even if I don't use it myself. :(

Comments?

DG

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jan 15 2000 - 21:00:22 EST