Re: 2.6.18-mm2 boot failure on x86-64

From: Vivek Goyal
Date: Fri Oct 06 2006 - 11:38:24 EST


On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > BIOS-provided physical RAM map:
> > BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
>
> I continued what Steve was doing this morning to see could this be
> pinned down. After placing 'CHECK;' in a few places as suggested by
> Andi's check, the problem code was identified as that following in
> mm/bootmem.c#init_bootmem_core()
>
> mapsize = get_mapsize(bdata);
> memset(bdata->node_bootmem_map, 0xff, mapsize);
>
> That explains the value in the array at least. A few more printfs around
> this point printed out the following in the boot log
>
> init_bootmem_core(0, 1909, 0, 12582912)
> init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> AAGH: afinfo corrupted at mm/bootmem.c:121
>
> where;
>
> 1909 == mapstart
> 0 == start
> 12582912 == end
> 1572864 == mapsize
>
> mapstart, start and end being the parameters being passed to
> init_bootmem_core(). This means we are calling memset for the physical
> range 0x775000 -> 0x8F5000 which is in a usable range according to the
> BIOS-e820 map it appears.
>

Hi Mel,

Where is bss placed in physical memory? I guess bss_start and bss_stop
from System.map will tell us. That will confirm that above memset step is
stomping over bss. Then we have to just find that somewhere probably
we allocated wrong physical memory area for bootmem allocator map.

Thanks
Vivek

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/