Re: kexec reboot code buffer

From: Eric W. Biederman (ebiederm@xmission.com)
Date: Wed Jan 29 2003 - 10:41:44 EST


"Martin J. Bligh" <mbligh@aracnet.com> writes:

> >> We talked about creating a new zone specifically for DMA32 (ie <4Gb)
> >> for other reasons, but it's not there as yet.
> >
> > Right. And because of that I don't feel bad about asking for a zone
> > that ends at 4GB, as it is a fairly general need in the kernel, even
> > if the rest of the interfaces have a little catching up to do before
> > the can use it. Although with IOMMUs I don't know how much such a
> > DMA32 zone is worth.
>
> It's probably too late for that sort of thing now in 2.5 though.

Unless there are balancing issues adding a new zone is very trivial.
However any code in that direction will be an additional patch so
because ZONE_NORMAL works for most cases. Alan Cox can have his high
memory case if he wants it.

> > I am fine with memory that is not physically contiguous. The memory
> > I really want the kernel is currently sitting on.....
>
> Oh, in that case you should have no problem getting it from ZONE_NORMAL,
> especially if you can wake up kswapd and wait for a few seconds.

Nope, kswapd will not free the kernels text segment. So in practice
I can use anything below 4GB.
  
> > The largest I have heard of is currently is 96MB. Typical is
>
> Eeek! ;-)

There is even a distribution built to be run completely out of a ramdisk.
http://warewulf-cluster.org/
 
> > somewhere between 900K and 6MB. You get some interesting
> > kernel+ramdisk combinations when people are network booting a diskless
> > system. Theoretically I can accommodate a nearly 4GB image, with the
> > current code structure.
>
> Personnally I don't have a problem setting aside that much space at boot
> time, but it's probably not a good solution for small boxes.
>
> > So the 4GB instead of 960MB limit, and not pesimizing the kernel for
> > the cases where the new image sits in ram for a while (kexec on panic)
> > is while I modified my code to use high memory.
>
> Maybe IFF you want to suppork kexec on panic, it should be statically
> reserved at boot time? You don't want to be mucking around in the panic
> path trying to swap out memory, etc. when your kernel is halfway down
> the toilet already ... and that has nothing to do with memory placement,
> it's just a space issue.

As it currently exists kexec happens in two stages sys_kexec_load that
scatters the new image through out memory, with some care so that I can
use memcpy, and some variant of memmove to place the image at it's
final location in memory. This is where all of the memory allocation
happens.

Then there is sys_reboot(LINUX_REBOOT_CMD_KEXEC), (or a kernel panic)
which triggers the work. And except for cpu state changes nothing
happens.

> > The nasty case comes with highmemory when I am allocating memory on a
> > 32GB NUMA box and am allocating memory on the wrong node. In which
> > case my code needs to allocate 28GB before it starts getting the
> > memory it wants.
>
> Oh, just do alloc_pages_node(0) (works on non NUMA as well, will just
> fall back). But I can show you a 32Gb SMP box as well ;-) ZONE_NORMAL
> is probably still easiest, and most general.

Right it is not hard to do it just takes a little more work.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jan 31 2003 - 22:00:22 EST