Re: [PATCH v7u1 22/31] x86, boot: add fields to support load bzImageand ramdisk above 4G

From: Yinghai Lu
Date: Mon Jan 14 2013 - 00:36:58 EST


On Sun, Jan 13, 2013 at 1:41 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Thu, Jan 03, 2013 at 04:48:42PM -0800, Yinghai Lu wrote:
>> ext_ramdisk_image/size will record high 32bits for ramdisk info.
>>
>> xloadflags bit0 will be set if relocatable with 64bit.
>
> Let's describe that in more detail:
>
> "Bit 0 of xloadflags is set if we are both a 64-bit and a relocatable
> kernel. In that case, it denotes that ramdisk can be loaded above 4Gb."
>
>> Let get_ramdisk_image/size to use ext_ramdisk_image/size to get
>> right positon for ramdisk.
>>
>> bootloader will fill value to ext_ramdisk_image/size when it load
>> ramdisk above 4G.
>>
>> Also bootloader will check if xloadflags bit0 is set to decicde if
>
> decide
>
>> it could load ramdisk high above 4G.
>>
>> sentinel is used to make sure kernel have ext_* valid values set
>
> The explanation of the sentinel field from "-v6" below should be
> actually up here. We absolutely want to have it in the commit message
> *and* in the code so that it is well documented why we've added it.
>
>> Update header version to 2.12.
>>
>> -v2: add ext_cmd_line_ptr for above 4G support.
>> -v3: update to xloadflags from HPA.
>> -v4: use fields from bootparam instead setup_header according to HPA.
>> -v5: add checking for USE_EXT_BOOT_PARAMS
>> -v6: use sentinel to check if ext_* are valid suggested by HPA.
>> HPA said:
>> 1. add a field in the uninitialized portion, call it "sentinel";
>> 2. make sure the byte position corresponding to the "sentinel" field is
>> nonzero in the bzImage file;
>> 3. if the kernel boots up and sentinel is nonzero, erase those fields
>> that you identified as uninitialized;
>
> Question: if the bootloader sets ext_* properly, is it going to set
> sentinel to 0 so that it can signal to the code further on that ext_*
> are valid?

old bootloaders have no idea of sentinel, but if they initialize boot_param
properly that new sentinel will be 0 and new kernel will know.

>
> This is kinda missing from the mechanism of the sentinel and it should
> be documented too.

No, we should have too much duplicated info.

>
>> -v7: change to 0x1ef instead of 0x1f0, HPA said:
>> it is quite plausible that someone may (fairly sanely) start the
>> copy range at 0x1f0 instead of 0x1f1
>
> Right, all those -vX notes are all important and should *definitely* be
> at least in the commit message.

No, I want to keep them in order to track the reviewing progress.

>
>> Signed-off-by: Yinghai Lu <yinghai@xxxxxxxxxx>
>> Cc: Rob Landley <rob@xxxxxxxxxxx>
>> Cc: Matt Fleming <matt.fleming@xxxxxxxxx>
>> Cc: Gokul Caushik <caushik1@xxxxxxxxx>
>> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
>> Cc: Joe Millenbach <jmillenbach@xxxxxxxxx>
>> ---
>> Documentation/x86/boot.txt | 15 ++++++++++++++-
>> Documentation/x86/zero-page.txt | 4 ++++
>> arch/x86/boot/compressed/cmdline.c | 2 ++
>> arch/x86/boot/compressed/misc.c | 12 ++++++++++++
>> arch/x86/boot/header.S | 12 ++++++++++--
>> arch/x86/boot/setup.ld | 7 +++++++
>> arch/x86/include/uapi/asm/bootparam.h | 13 ++++++++++---
>> arch/x86/kernel/head64.c | 2 ++
>> arch/x86/kernel/setup.c | 4 ++++
>> 9 files changed, 65 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
>> index 406d82d..18ca9fb 100644
>> --- a/Documentation/x86/boot.txt
>> +++ b/Documentation/x86/boot.txt
>> @@ -57,6 +57,9 @@ Protocol 2.10: (Kernel 2.6.31) Added a protocol for relaxed alignment
>> Protocol 2.11: (Kernel 3.6) Added a field for offset of EFI handover
>> protocol entry point.
>>
>> +Protocol 2.12: (Kernel 3.9) Added three fields for loading bzImage and
>> + ramdisk above 4G with 64bit in bootparam.
>
> change to:
>
> "Added three additional fields to bootparam used for loading bzImage and
> ramdisk above 4Gb in 64-bit."

ok

>
>> +
>> **** MEMORY LAYOUT
>>
>> The traditional memory map for the kernel loader, used for Image or
>> @@ -182,7 +185,7 @@ Offset Proto Name Meaning
>> 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
>> 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
>> 0235/1 2.10+ min_alignment Minimum alignment, as a power of two
>> -0236/2 N/A pad3 Unused
>> +0236/2 2.12+ xloadflags Boot protocol option flags
>> 0238/4 2.06+ cmdline_size Maximum size of the kernel command line
>> 023C/4 2.07+ hardware_subarch Hardware subarchitecture
>> 0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
>> @@ -582,6 +585,16 @@ Protocol: 2.10+
>> misaligned kernel. Therefore, a loader should typically try each
>> power-of-two alignment from kernel_alignment down to this alignment.
>>
>> +Field name: xloadflags
>> +Type: modify (obligatory)
>> +Offset/size: 0x236/2
>> +Protocol: 2.12+
>> +
>> + This field is a bitmask.
>> +
>> + Bit 0 (read): CAN_BE_LOADED_ABOVE_4G
>> + - If 1, kernel/boot_params/cmdline/ramdisk can be above 4g,
>
> fullstop at the
> end.

ok.

>
>> +
>> Field name: cmdline_size
>> Type: read
>> Offset/size: 0x238/4
>> diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
>> index cf5437d..1140e59 100644
>> --- a/Documentation/x86/zero-page.txt
>> +++ b/Documentation/x86/zero-page.txt
>> @@ -19,6 +19,9 @@ Offset Proto Name Meaning
>> 090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
>> 0A0/010 ALL sys_desc_table System description table (struct sys_desc_table)
>> 0B0/010 ALL olpc_ofw_header OLPC's OpenFirmware CIF and friends
>> +0C0/004 ALL ext_ramdisk_image ramdisk_image high 32bits
>> +0C4/004 ALL ext_ramdisk_size ramdisk_size high 32bits
>> +0C8/004 ALL ext_cmd_line_ptr cmd_line_ptr high 32bits
>> 140/080 ALL edid_info Video mode setup (struct edid_info)
>> 1C0/020 ALL efi_info EFI 32 information (struct efi_info)
>> 1E0/004 ALL alk_mem_k Alternative mem check, in KB
>> @@ -27,6 +30,7 @@ Offset Proto Name Meaning
>> 1E9/001 ALL eddbuf_entries Number of entries in eddbuf (below)
>> 1EA/001 ALL edd_mbr_sig_buf_entries Number of entries in edd_mbr_sig_buffer
>> (below)
>> +1EF/001 ALL sentinel 0: states _ext_* fields are valid
>> 290/040 ALL edd_mbr_sig_buffer EDD MBR signatures
>> 2D0/A00 ALL e820_map E820 memory map table
>> (array of struct e820entry)
>> diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
>> index b4c913c..bffd73b 100644
>> --- a/arch/x86/boot/compressed/cmdline.c
>> +++ b/arch/x86/boot/compressed/cmdline.c
>> @@ -17,6 +17,8 @@ static unsigned long get_cmd_line_ptr(void)
>> {
>> unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
>>
>> + cmd_line_ptr |= (u64)real_mode->ext_cmd_line_ptr << 32;
>> +
>> return cmd_line_ptr;
>> }
>
> On 32-bit, this unsigned long cmd_line_ptr is 4 bytes and the OR doesn't
> have any effect on the final result. You probably want to do:

yes, that is what we want to keep 32bit and 64bit unified.

>
> #ifdef CONFIG_64BIT
> cmd_line_ptr |= (u64)real_mode->ext_cmd_line_ptr << 32;
> #endif
>
> right?
>
> Or instead look at ->sentinel to know whether the ext_* fields are valid
> or not, and save yourself the OR if not.

no.

that is whole point of sentinel, we don't need to check sentinel everywhere
because ext_* are valid.

>
>> int cmdline_find_option(const char *option, char *buffer, int bufsize)
>> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
>> index 88f7ff6..f714576 100644
>> --- a/arch/x86/boot/compressed/misc.c
>> +++ b/arch/x86/boot/compressed/misc.c
>> @@ -318,6 +318,16 @@ static void parse_elf(void *output)
>> free(phdrs);
>> }
>>
>> +static void sanitize_real_mode(struct boot_params *real_mode)
>> +{
>> + if (real_mode->sentinel) {
>> + /* ext_* fields in boot_params are not valid, clear them */
>> + real_mode->ext_ramdisk_image = 0;
>> + real_mode->ext_ramdisk_size = 0;
>> + real_mode->ext_cmd_line_ptr = 0;
>> + }
>> +}
>> +
>> asmlinkage void decompress_kernel(void *rmode, memptr heap,
>> unsigned char *input_data,
>> unsigned long input_len,
>> @@ -325,6 +335,8 @@ asmlinkage void decompress_kernel(void *rmode, memptr heap,
>> {
>> real_mode = rmode;
>>
>> + sanitize_real_mode(real_mode);
>> +
>> if (real_mode->screen_info.orig_video_mode == 7) {
>> vidmem = (char *) 0xb0000;
>> vidport = 0x3b4;
>> diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
>> index 8c132a6..0d5790f 100644
>> --- a/arch/x86/boot/header.S
>> +++ b/arch/x86/boot/header.S
>> @@ -279,7 +279,7 @@ _start:
>> # Part 2 of the header, from the old setup.S
>>
>> .ascii "HdrS" # header signature
>> - .word 0x020b # header version number (>= 0x0105)
>> + .word 0x020c # header version number (>= 0x0105)
>> # or else old loadlin-1.5 will fail)
>> .globl realmode_swtch
>> realmode_swtch: .word 0, 0 # default_switch, SETUPSEG
>> @@ -369,7 +369,15 @@ relocatable_kernel: .byte 1
>> relocatable_kernel: .byte 0
>> #endif
>> min_alignment: .byte MIN_KERNEL_ALIGN_LG2 # minimum alignment
>> -pad3: .word 0
>> +
>> +xloadflags:
>> +CAN_BE_LOADED_ABOVE_4G = 1 # If set, the kernel/boot_param/
>> + # ramdisk could be loaded above 4g
>> +#if defined(CONFIG_X86_64) && defined(CONFIG_RELOCATABLE)
>> + .word CAN_BE_LOADED_ABOVE_4G
>> +#else
>> + .word 0
>> +#endif
>>
>> cmdline_size: .long COMMAND_LINE_SIZE-1 #length of the command line,
>> #added with boot protocol
>> diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
>> index 03c0683..9333d37 100644
>> --- a/arch/x86/boot/setup.ld
>> +++ b/arch/x86/boot/setup.ld
>> @@ -13,6 +13,13 @@ SECTIONS
>> .bstext : { *(.bstext) }
>> .bsdata : { *(.bsdata) }
>>
>> + /* sentinel: make sure if boot_params from bootloader is right */
>
> This should say:
>
> /*
> * The bootloader signals the validity of the three ext_* boot params with this.
> */

no, bootloader does not signal that.
old bootloaders have no idea of sentinel, but if they initialize boot_param
properly that new sentinel will be 0 and new kernel will know.

>
>> + . = 495;
>> + .sentinel : {
>> + sentinel = .;
>> + BYTE(0xff);
>> + }
>> +
>> . = 497;
>> .header : { *(.header) }
>> .entrytext : { *(.entrytext) }
>> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
>> index 92862cd..3d8ed8f 100644
>> --- a/arch/x86/include/uapi/asm/bootparam.h
>> +++ b/arch/x86/include/uapi/asm/bootparam.h
>> @@ -58,7 +58,9 @@ struct setup_header {
>> __u32 initrd_addr_max;
>> __u32 kernel_alignment;
>> __u8 relocatable_kernel;
>> - __u8 _pad2[3];
>> + __u8 min_alignment;
>
> Hehe, this is actually a minor bugfix because min_alignment was only
> defined in header.S but was _pad2[0] in the struct definition.
>
>> + __u16 xloadflags;
>> +#define CAN_BE_LOADED_ABOVE_4G (1<<0)
>> __u32 cmdline_size;
>> __u32 hardware_subarch;
>> __u64 hardware_subarch_data;
>> @@ -106,7 +108,10 @@ struct boot_params {
>> __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
>> struct sys_desc_table sys_desc_table; /* 0x0a0 */
>> struct olpc_ofw_header olpc_ofw_header; /* 0x0b0 */
>> - __u8 _pad4[128]; /* 0x0c0 */
>> + __u32 ext_ramdisk_image; /* 0x0c0 */
>> + __u32 ext_ramdisk_size; /* 0x0c4 */
>> + __u32 ext_cmd_line_ptr; /* 0x0c8 */
>> + __u8 _pad4[116]; /* 0x0cc */
>> struct edid_info edid_info; /* 0x140 */
>> struct efi_info efi_info; /* 0x1c0 */
>> __u32 alt_mem_k; /* 0x1e0 */
>> @@ -115,7 +120,9 @@ struct boot_params {
>> __u8 eddbuf_entries; /* 0x1e9 */
>> __u8 edd_mbr_sig_buf_entries; /* 0x1ea */
>> __u8 kbd_status; /* 0x1eb */
>> - __u8 _pad6[5]; /* 0x1ec */
>> + __u8 _pad5[3]; /* 0x1ec */
>> + __u8 sentinel; /* 0x1ef */
>> + __u8 _pad6[1]; /* 0x1f0 */
>
> This needs the -v7 explanation from above as a comment here or somewhere
> around here, for why we've chosen 0x1ef offset.

no, there is no such comment for other fields there.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/