Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area

From: Jeremy Fitzhardinge
Date: Mon Jun 30 2008 - 17:08:32 EST


Eric W. Biederman wrote:
Mike Travis <travis@xxxxxxx> writes:

H. Peter Anvin wrote:
Mike Travis wrote:
FYI, I did try this out and it caused the bootloader to scramble the
loaded data. The first corruption I found was the .x86cpuvendor.init
section contained all zeroes.

Explain what you mean with "the bootloader" in this context.

-hpa
After the code was loaded (the compressed code; it seems that my GRUB
doesn't support uncompressed loading), the above section contained
zeroes. I snapshotted it fairly early, around secondary_startup_64, and
then printed it in x86_64_start_kernel.

The object file had the correct data (as displayed by objdump) so I'm
assuming that the bootloading process didn't load the section correctly.

Below was the linker script I used:

--- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h
+++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h
@@ -373,9 +373,13 @@

#ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
#define PERCPU(align) \
- . = ALIGN(align); \
+ .data.percpu.abs = .; \
percpu : { } :percpu \
- __per_cpu_load = .; \
+ .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \
+ BYTE(0) \
+ . = ALIGN(align); \
+ __per_cpu_load = .; \
+ } \
.data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
*(.data.percpu.first) \
*(.data.percpu.shared_aligned) \
@@ -383,8 +387,8 @@
*(.data.percpu.page_aligned) \
____per_cpu_size = .; \
} \
- . = __per_cpu_load + ____per_cpu_size; \
- data : { } :data
+ . = __per_cpu_load + ____per_cpu_size;
+
#else
#define PERCPU(align) \
. = ALIGN(align); \

It showed all the correct addresses in the map, and __per_cpu_load was a
relative symbol (which was the objective).

Btw, our simulator, which only loads uncompressed code, had the data correct,
so it *may* only be a result of the code being compressed.

Weird. Grub doesn't get involved in the decompression; the kernel does it
all itself, so we should be able to track where things go bad.

Last I looked, the compressed code was formed by essentially:
objcopy vmlinux -O binary vmlinux.bin
gzip vmlinux.bin
And then we tack a magic header onto the gzip-compressed file.
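
As a rough illustration of the flatten-then-gzip step described above, here is a toy model in Python. It is only a sketch, not the kernel build: the flatten() helper, the section contents and the addresses are all made up. The one behaviour it assumes is that a raw binary dump places each section by its load address, starting at the lowest one, and pads the gaps with zero bytes.

```python
import gzip


def flatten(sections, gap_fill=0x00):
    """Lay out (load_address, data) pairs as one flat image.

    Hypothetical helper: the image starts at the lowest load address
    and any gap between sections is padded with gap_fill.
    """
    sections = sorted((lma, data) for lma, data in sections if data)
    if not sections:
        return b""
    base = sections[0][0]
    end = max(lma + len(data) for lma, data in sections)
    image = bytearray([gap_fill]) * (end - base)
    for lma, data in sections:
        start = lma - base
        image[start:start + len(data)] = data
    return bytes(image)


# Made-up sections: only the load address decides where the bytes land.
image = flatten([(0x1008, b"PCPU"), (0x1000, b"TEXT")])
compressed = gzip.compress(image)
restored = gzip.decompress(compressed)
```

In this model a section's position in the image depends only on its load address, so a section linked at vaddr 0 still ends up wherever its AT() address puts it, and a wrong load address would show up as data in the wrong place with zero padding where it was expected.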

Are things only bad with the change above?

No, the original crash being discussed was a GP fault in head_64.S as it tries to initialize the kernel segments. The cause was that the prototype GDT is all zero, even though it's an initialized variable and inspection of vmlinux shows that it has the right contents. So somehow it's either 1) being zeroed on load, or 2) being loaded to the wrong place.
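
One way to tell those two cases apart, given a memory dump and the bytes vmlinux says should be there, is sketched below. The classify() helper and the sample data are hypothetical and only illustrate the idea; this is not an existing tool.

```python
def classify(image, expected, offset):
    """Compare a dumped image against the bytes expected at offset.

    Hypothetical helper. Returns "intact", "misplaced at 0x..." if the
    expected bytes turn up elsewhere in the image, "zeroed" if the
    region is all zero bytes, or "corrupted" otherwise.
    """
    region = image[offset:offset + len(expected)]
    if region == expected:
        return "intact"
    found = image.find(expected)
    if found != -1:
        return f"misplaced at {found:#x}"
    if not any(region):
        return "zeroed"
    return "corrupted"


# Made-up data standing in for the prototype GDT and a memory dump.
gdt = b"\x9b\xaf\x00\xff"
good = bytes(8) + gdt + bytes(4)
moved = gdt + bytes(12)
wiped = bytes(16)
```

Here classify(good, gdt, 8) reports the data as intact, classify(moved, gdt, 8) reports it as misplaced, and classify(wiped, gdt, 8) reports it as zeroed.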

The zero-based PDA mechanism requires the introduction of a new ELF segment based at vaddr 0, which is sufficiently unusual that it wouldn't surprise me if it's triggering some toolchain bug.

Mike: what would happen if the PDA were based at 4k rather than 0? The stack canary would still be at its small offset (0x20?), but it doesn't need to be initialized. I'm not sure if doing so would fix anything, however.

J