Re: Userland Kernel Download Tool

From: James Manning (jmm@raleigh.ibm.com)
Date: Fri Jan 14 2000 - 15:15:26 EST


Warning: 229 lines, but hopefully good content so please bear with me.

[ Friday, January 14, 2000 ] dg50@daimlerchrysler.com wrote:

[snip of the CGI-based kernel builder idea from my prev. email]

> That's even sexier - but there's a host of problems with it:

Definitely :) As I stated before, the config magic would be insane
most likely (unless someone sends me a Kernel-Config.pm anytime soon :)
require the sacrifice of more goats than I could even find in the entire
state of North Carolina.

> 1) The end user doesn't have the source available to play with - any of it.
> The intent behind my idea was not to isolate a kernel "consumer" from the
> build process, but only to restrict the amount of bandwidth consumed by
> downloads, and the amount of storage space consumed by the local
> /usr/src/linux filesysytem.

1) This *really* isolates things and minimizes download. :)

2) The *vast* majority (like it or not) won't care about the source
   (whether they "should" or not is another issue, but sadly even
    computer companies will call a Red Hat before digging into the source
    themselves, as they'd rather just deal with a kernel expert)

3) The source is still quite available :)

4) LXR and similar (McVoy's site, I forget) are still around for people
   to browse through the source, and with a much nicer interface than
   just a dump of files in a dir... if you only care about fs/buffer.c
   then such an interface can get you just that source, very nicely
   linked and prettified, w/o wasting time and bandwidth on the rest
   of the kernel source :)

5) People that want the source and want to build on their own will
   continue to do so, this just helps out the majority of people
   "coming in" who just needs/wants a built kernel of ver X.Y.Z
   which they won't find in a place like rawhide (esp. true for
   dev kernels, 2.3.x in this case) but would like that kind of
   easy-access binary-built access to the development tree (more
   testers is a Good Thing(tm) and the CGI can make a simple
   enough warning in the 2.{3,5,etc}.x dev kernels)

6) No more questions on lkml about "why does it say System too big?"
   as we could either a) always build bzImage, or b) zImage, catch
   the case of too big and bzImage when found (only the chaining and
   piggybacking code would need recompile AFAIK so it'd be fast)

7) This doesn't stop your user-space server idea ... choice is good! :)
   
> 2) Isolating the consumer from the build process means they don't get to
> see the little compile-time buglets that sometimes creep in to production
> kernel sources - which means they don't get an opportunity to either report
> or fix the bug themselves.

Actually, this approach can be even better since:

1) the CGI or kernel building daemon can check for these buglets simple
   enough during the build process, log them, and report them to a)
   the user b) lkml if we choose c) some general "warnings and errors
   in kernel X.Y.Z build process" page where we just log these.
   It gets the kernel a little closer to tinderbox-ish status.

2) We can *definitely* "see" these buglets whereas a person building
   the kernel on their own may just ignore them (or not see them as they
   scroll on by) as long as the kernel binary builds (we've all been
   there, where we cared more about testing a new feature than reporting
   some compiler warning :)

> As well, a given kernel developer may send a
> given user a little mini-patch to see if that fixes a reported problem.
> Downloading a custom precompiled kernel doesn't give the user the ability
> to work with developers to fix problems like they do now. Anything that
> interferes with the current development process is BAD.

Absolutely. I don't want to interfere with the process, but keep in
mind that there's a huge pool of people that don't test on 2.3.x simply
because it's beyond them (and bad builders and bad coders doesn't mean
bad testers... they can report results) and they can't find a built
kernel (rpm-based or otherwise) already in place for them to test with.
This is from experience where benchmarking people I know want to get
a 2.3.latest rpm and just install it... they don't care about building
the kernel per-se, they just want it for their purpose, perhaps to check
against RH 6.1 stock for perf. differences.

Actually, thought, this approach can *enhance* the development
process... a ton of patches on kernel.org (esp. under people/) (or
people.redhat.com, or whatever) don't get tested as much as they "should"
simply because the developer doesn't have the time to be a cheerleader
and tell a ton of people to go try the patch.

With this site, through any of a few diff. methods (being a real
kernel.org mirror, some "* * * * */10" cron job and using LWP::Simple,
etc) the patches developers have made to a kernel could be presented
to a user building a development kernel (with an additional warning,
beyond even the regular "this is a DEV kernel" one).

This means a lower bar of technical requirement to be a dev kernel tester,
and *also* means a larger potential based of testers. While we believe
"it's open source, so the entire world can test" the fact is that a lot
of potential testers find building the kernel intimidating... but they
surf web sites and click options in CGI forms all day long :)

We can even allow a file upload field on the form for patches, which
enables 2 great things:

1) A kernel developer, wishing to make not-so-clued person foo@bar.com
   a tester for his patch, can quickly fill out the form and CGI upload
   his patch, filling in the "email address to be notified" field with
   foo@bar.com (we can add a cc: field for additional notification)

2) People doing embedded or underpowered development (high schooler with
   the 486's around him) can get new kernels much faster... the make -j9
   on the 8-way machine that Penguin Computing is loaning to the project
   (I hope :) will build in around a minute and he can quickly get a
   kernel or test out some quick hack patch.

Side note: Companies (penguin, rhat, suse, lnux, etc) then have a very
  simple method for "giving back to the community", just donate CPU
  cycles to building new kernels. :)

As from before, we can cache modules with the same relevant config
   if we're getting CPU-bound and get a kernel to a person much quicker
   than any other possible method.

> 3) In many cases, you cannot specify the hardware at a fine-grained level
> up front. I have a recent example - my cable modem came with a D-Link PCI
> ethernet card. There is no D-Link Model (whatever it was) option in the
> "make fooconfig" menus. What I wound up doing was peeling a little label
> off a chip on the card, and grep-ing the ethernet card drivers for what
> appeared to be the main chip on the card - which matched a comment in
> nec_2000.c - and then I compiled that option in AND DIDN'T HAVE TO GO
> PESTERING Linux.kernel or whatever. Anything that limits people's abilities
> to help themselves is BAD.

While I can understand your concern, we're just talking about *adding*
a functionality. People like you and I are going to continue doing kernel
fetches (patches, rsync, cvs update, etc) and building the way we do.
I'm not trying to pitch this as a cure-all, but I think it can take
a large step towards helping out the "typical" case that is becoming
more typical every day with the influx of new people (*many* of them,
if not most, coming from Windows, we have to admit).

Look at it this way... we can put a *lot* of kernel education (even
a page for "how to build your own kernel from source) on this site.
It's sure to get a decent number of hits (since people are going to
like the idea of rawhide-ish pre-built kernels of any ver they want)
so we can use this popularity to make a site that *educates* people
about what the kernel is, how to build one, etc.

> 4) Building a kernel is a fairly resource-intensive activity, even for
> powerful machines. Scale that kernel building service by World Domination -
> who's going to pay for the hardware? Especially when this is already done
> on a massively distributed computing basis, by having people compile their
> own kernels?

1) We can limit queue depth, and if too deep, we can simply redirect to
   a page that details how to fetch and build the kernel (from above).
   Alternatively, warn if queue depth is over some number (20) and give a
   time estimate til completion, with a link (or just include on the same
   page) of the information to build your own kernel if they can't wait.

2) This can distribute easily as the backend compute servers could
   be any of an infinite list of machines donating cycles that we do
   a simple loadavg and ping check on before utilizing

3) The percentage of idle cycles in most machines (even powerful ones)
   is insane... ppl *look* to find ways to throw away cycles (seti,
   distributed.net :) so just think about all those cycles Linux
   zealots would be willing to donate to build kernels and help out the
   development process! :)

4) it already takes powerful machines to just be a kernel.org mirror
   (since that # of machines should be proportional to the # of ppl
   wanting/building kernels) and all those are getting space and CPU
   cycles donated... not a foreign concept :)

5) It's a CGI, sell advertising banner space to Linux companies in
   exchange for machines (done all the *time* by *lots* of places)

> It should be stated that "my" tool is not a newbie-handling tool; it's a
> bandwidth and storage space optimiser. The idea is do only download and
> store the minimum set of sources that you need to build your particular
> kernel - and nothing else.

You'd probably have
   a) more acceptance
   b) more widespread use
   c) more space savings
just using something like rsync -z (a very standard util on Linux machines
already :) since both your idea and that would require a uncompressed
kernel src dir... rsync will update drivers/arch's you don't have but
a) this *is* ascii kernel src... it'll compress well with the std zlib
from -z and b) utilizes a better algorithm (unless I misunderstood you)
and should help cut down as much as possible the delta data sent over
(aside from ignoring src files irrelevant to your build :)

I hashed over this at sublogic.com/ideas/index.html#kernel_rsync
already and while it happens in a few isolated spots, it looks like
it just isn't going to take off. Since it seems like the CPU and
data req's are very similar to this idea (both would need at least
one good machine donated at the start to have a base to work from)
that's why I never really thought this would take off, either.

> Incidently, the current patch system does not meet this requirement. The
> current patch system provides deltas between full source trees, not deltas
> between current system requirements.

Quite true, but keep in mind that the lkml FAQ has a nice little rant about
how the bulk of the non-driver code is common, so it's worth noting that
while saving the other arch's is good, it's the sparing of other drivers
that would be the bulk of the savings. (not that it's not a large enough
savings, just worth noting)

Your major concern (as I understand it, please correct me if I'm wrong)
is that this would make ppl using kernels less "development-aware" but
I feel that 1) not any more than having binary rawhide kernels already
does, and 2) we can *use* this CGI as an informative site... one that'll
actaully get read by people (unlike the lkml FAQ, for instance :)

To reiterate, I don't see this as a replacement for your user-space
server approach... on the contrary, I love that idea and simply wanted to
throw this approach out since I had hashed it over before and it seemed
somewhat related :)

James

-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jan 15 2000 - 21:00:25 EST