Re: [PATCH V2] x86: Allow built-in command line to work in early kernel init

From: Matthew Garrett
Date: Wed May 27 2015 - 19:33:12 EST


On Wed, May 20, 2015 at 12:42 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> * Matthew Garrett <mjg59@xxxxxxxxxx> wrote:
> Overall structure looks better now.
>
> One problem is that it's such a huge patch: is there a way to split
> this up into smaller, more obvious, easier to bisect to patches?

Ought to be possible.

> So could we please move this into a separate arch/x86/boot/cmdline.h
> include file, so that this stuff stays separate from the various other
> helpers we nurse in boot.h?

I'll try - there's some awkwardness around how things are #included in
here, and it'll probably have to duplicate a bunch of the boilerplate
from boot.h.

>> +int __cmdline_init(unsigned long cmdline_ptr, struct boot_params *params)
>
> Please add comments describing the function. Why does it have double
> underscores, who is supposed to use it and how, etc. etc?

It seems a little odd to add that when the rest of the file follows
the same style without any documentation.

>> + while (c)
>> + c = rdfs8(cptr+len++);
>> +
>
> So this needs a comment - what do we do here? Skip over to the end of
> the 'cptr' string, using 16-bit ops? Would be nice to have it in a
> suitably named helper inline.

Haha. No, it's far worse than that. I mean, yes, when it's built into
the 16-bit entry point it does that, but when it's built into the
32-bit entry point these are stubbed into nothingness and ugh all of
this is terriblee. But yeah.

>> + len++; /* Trailing space */
>
> that's a trailing zero delimiter byte, not space, right?

No, a space - we need to separate the built-in fragment from
whatever's supplied on the command line.

>> + while (builtin_cmdline[i]) {
>> + len++;
>> + i++;
>> + }
>
> Ditto - is this:
>
> len += strlen(builtin_cmdline+i)
>
> in disguise?

Yes, we don't have a strlen implementation we can use at this point. I
can comment that.

>> + while (c) {
>> + set_fs(new_cmdline_ptr >> 4);
>> + wrfs8(c, nptr++);
>> + set_fs(cmdline_ptr >> 4);
>> + c = rdfs8(cptr++);
>> + }
>> + set_fs(new_cmdline_ptr >> 4);
>> + wrfs8(' ', nptr++);
>> + }
>
> So here we copy from 'cptr' to 'nptr'? Not very well named variables,
> at minimum.

I have the choice between preserving existing style or nice names. I'm
happy to pick either.

>> +
>> + set_fs(new_cmdline_ptr >> 4);
>> +
>> + for (i = 0; builtin_cmdline[i]; i++)
>> + wrfs8(builtin_cmdline[i], nptr++);
>> +
>> + wrfs8('\0', nptr);
>> +
>> + params->hdr.cmd_line_ptr = new_cmdline_ptr;
>> + params->setup_flags |= SETUP_CMDLINE_APPENDED;
>> +
>> + return 0;
>
> So I'd really suggest splitting off the string manipulation parts into
> suitably named helper functions, use better variable named and comment
> it better. That will go a long way to document this non-obvious piece
> of 16-bit magic.

Like I said, this also ends up in the 32-bit code through #include
magic, so there are some awkward corner cases. I'll see what I can do.

>> +}
>> diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
>> index d7b1f65..9765a4a 100644
>> --- a/arch/x86/boot/compressed/aslr.c
>> +++ b/arch/x86/boot/compressed/aslr.c
>> @@ -170,7 +170,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>> mem_avoid[2].size = cmd_line_size;
>>
>> /* Avoid heap memory. */
>> - mem_avoid[3].start = (unsigned long)free_mem_ptr;
>> + mem_avoid[3].start = (unsigned long)orig_free_mem_ptr;
>> mem_avoid[3].size = BOOT_HEAP_SIZE;
>
> it's totally non-obvious what's happening here. Why is it named
> 'orig'?

It's the original base of the heap, before we moved it up so the new
command line had somewhere to live.

>>
>> /* Avoid stack memory. */
>> diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
>> index b68e303..e7a4612 100644
>> --- a/arch/x86/boot/compressed/cmdline.c
>> +++ b/arch/x86/boot/compressed/cmdline.c
>> @@ -1,6 +1,6 @@
>> #include "misc.h"
>>
>> -#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE
>> +#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE || CONFIG_CMDLINE_BOOL
>
> So what's common in all these Kconfig options? That they append
> something to the bootloader provided command line? If yes then please
> create an intermediary helper Kconfig switch that the above options
> select, and only use that helper config here.

No, that they interact with the command line in some way. The first
two parse it.

> So 'cmdline_init()' seems like a poorly chosen interface name. The
> cmdline is already initialized and set up for us by the bootloader.
> What we do here is that in certain configs we append to the cmdline.
> So cmdline_append() might be a better name.

Ok.

> Also, there's very little reason to inline the main API call itself,
> just do the SETUP_CMDLINE_APPENDED from within __cmdline_init and get
> rid of the underscores.

I don't think that works for the 16-bit codepath.

>> /* Fill in upper bits of command line address, NOP on 32 bit */
>> boot_params->ext_cmd_line_ptr = (u64)(unsigned long)cmdline_ptr >> 32;
>>
>> + if (append_cmdline)
>> + boot_params->setup_flags |= SETUP_CMDLINE_APPENDED;
>
> So why not have a boot_params__set_cmdline_appended() helper:
>
> boot_params__set_cmdline_appended(boot_params);
>
> instead of this weird flaggery in the include file:
>
> +#ifdef CONFIG_CMDLINE_OVERRIDE
> +static char append_cmdline;
> +#else
> +static char append_cmdline = 1;
> +#endif

Ok, that's cleaner.

>> + orig_free_mem_ptr = free_mem_ptr = heap; /* Heap */
>> + free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
>> +
>> + cmdline_init();
>> +
>> console_init();
>> debug_putstr("early console in decompress_kernel\n");
>>
>> - free_mem_ptr = heap; /* Heap */
>> - free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
>
> So I don't understand the old heap logic in
> arch/x86/boot/compressed/misc.c, let alone the new one.
>
> So low level boot assembly code sets up the boot_heap, and passes it
> to decompress_kernel() as a 'heap' C function argument. Which then
> stores it in free_mem_ptr and free_mem_end_ptr but does not otherwise
> use it other than some sanity checks.
>
> As per the comments in the code decompression is supposed to use this
> heap, but where does it figure out the address from? It's not passed
> to it in any greppable fashion that I can see.

malloc() in include/linux/decompress/mm.h.

> Confusingly enough, there appears to be another, real mode heap
> mechanism in realmode/rm/stack.S and boot/boot.h, around rm_heap at
> all.

Yup.

> Plus there's both boot/boot.h and include/asm/x86/boot.h, which
> confused me some more.
>
> So this code is highly confusing. It first needs to be unconfused.

I'm not even going there - I don't understand that code well enough to
reimplement it. If that's a dealbreaker then we'll just have to drop
this patch.

>> + /* End of heap check */
>> + init_heap();
>
> So init_heap() could need some extra comments as well.

That's existing code that I'm just moving so the heap exists prior to
us handling the command line. Interfering with it would seem to result
in the patch being more confusing?

>> - __u8 _pad2[4]; /* 0x054 */
>> + __u32 setup_flags; /* 0x054 */
>> __u64 tboot_addr; /* 0x058 */
>> struct ist_info ist_info; /* 0x060 */
>> __u8 _pad3[16]; /* 0x070 */
>
> So data structures should be clearly explained. Who sets this flag,
> when, for what purpose, etc.

That's documented with the rest of the boot protocol.

> Using 'setup' here is highly ambiguous: the whole boot process is an
> act of 'setting up stuff'. So it's very non-obvious what this does, at
> first and second glance as well.

They're flags used during setup - there's only one at the moment, but
any others might end up scattered anywhere over the x86 init process,
so they really do potentially apply to the entire setup process.

> Why is all this in a high level function named setup_arch()? At
> minimum it should be factored out into a suitably named helper
> function.
>
> With that function we can also do an early:
>
> if (!builtin_cmdline)
> return;
>
> which gives us an extra identation level which will fix all those ugly
> line breaks later on. (and ignore checkpatch if it's slightly above
> col80)

Ok.

> So this is the third time I've seen this repeating pattern in this
> patch, with only slight variations.
>
> Could we create a generic helper module that does most of this, with
> small headers defining the various boot environment specialities,
> linked into the various loader functionalities separately?
>
> That would have various advantages, beyond being easier to read: such
> as adding new functionality to our command line string handling could
> be done in a single place.
>
> So something like:
>
> arch/x86/boot/cmdline/core.c
> arch/x86/boot/cmdline/16-bit.h
> arch/x86/boot/cmdline/32-bit.h
> arch/x86/boot/cmdline/efi.h
>
> ... or so, glued into their respective boot stages.

Hrm. Possibly? The way the 16-bit and 32-bit code incorporate each
other make this rather more difficult, but there's some hope.

> Looks like there's very little 'EFI' about this strlcat
> implementation, right? So why don't we use the real one, is it not
> available yet?

Right - we're still in the firmware at this point, not the kernel.

>> +
>> /*
>> * Convert the unicode UEFI command line to ASCII to pass to kernel.
>> * Size of memory allocated return in *cmd_line_len.
>
> Couldn't help but notice the various spelling errors in this comment
> block...

Yeah.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/