Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

From: Borislav Petkov
Date: Mon Jun 17 2019 - 06:52:43 EST


On Fri, Jun 14, 2019 at 09:15:18PM +0000, Lendacky, Thomas wrote:
> The memory occupied by the kernel is reserved using memblock_reserve()
> in setup_arch(). Currently, the area is from symbols _text to __bss_stop.
> Everything after __bss_stop must be specifically reserved otherwise it
> is discarded. This is not clearly documented.

Hmm, so I see this in arch/x86/kernel/vmlinux.lds.S after _end:

	_end = .;

	STABS_DEBUG
	DWARF_DEBUG

	/* Sections to be discarded */
	DISCARDS
	/DISCARD/ : {
		*(.eh_frame)
	}

and over DISCARDS:

/*
 * Default discarded sections.
 *
 * Some archs want to discard exit text/data at runtime rather than
 * link time due to cross-section references such as alt instructions,
 * bug table, eh_frame, etc. DISCARDS must be the last of output
 * section definitions so that such archs put those in earlier section
 * definitions.
 */
#define DISCARDS

That sounds like it is documented to me, or do you mean something else?

> Add a new symbol, __end_of_kernel_reserve, that more readily identifies
> what is reserved, along with comments that indicate what is reserved,
> what is discarded and what needs to be done to prevent a section from
> being discarded.
>
> Cc: Baoquan He <bhe@xxxxxxxxxx>
> Cc: Lianbo Jiang <lijiang@xxxxxxxxxx>
> Signed-off-by: Tom Lendacky <thomas.lendacky@xxxxxxx>
> ---
> arch/x86/include/asm/sections.h | 2 ++
> arch/x86/kernel/setup.c | 8 +++++++-
> arch/x86/kernel/vmlinux.lds.S | 9 ++++++++-
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
> index 8ea1cfdbeabc..71b32f2570ab 100644
> --- a/arch/x86/include/asm/sections.h
> +++ b/arch/x86/include/asm/sections.h
> @@ -13,4 +13,6 @@ extern char __end_rodata_aligned[];
> extern char __end_rodata_hpage_align[];
> #endif
>
> +extern char __end_of_kernel_reserve[];
> +
> #endif /* _ASM_X86_SECTIONS_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 08a5f4a131f5..32eb70625b3b 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -827,8 +827,14 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
>
> void __init setup_arch(char **cmdline_p)
> {
> + /*
> + * Reserve the memory occupied by the kernel between _text and
> + * __end_of_kernel_reserve symbols. Any kernel sections after the
> + * __end_of_kernel_reserve symbol must be explicity reserved with a
> + * separate memblock_reserve() or it will be discarded.

s/it/they/

> + */
> memblock_reserve(__pa_symbol(_text),
> - (unsigned long)__bss_stop - (unsigned long)_text);
> + (unsigned long)__end_of_kernel_reserve - (unsigned long)_text);
>
> /*
> * Make sure page 0 is always reserved because on systems with
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 0850b5149345..ca2252ca6ad7 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -368,6 +368,14 @@ SECTIONS
> __bss_stop = .;
> }
>
> + /*
> + * The memory occupied from _text to here, __end_of_kernel_reserve, is
> + * automatically reserved in setup_arch(). Anything after here must be
> + * explicitly reserved using memblock_reserve() or it will be discarded
> + * and treated as available memory.
> + */
> + __end_of_kernel_reserve = .;
> +
> . = ALIGN(PAGE_SIZE);
> .brk : AT(ADDR(.brk) - LOAD_OFFSET) {
> __brk_base = .;
> @@ -382,7 +390,6 @@ SECTIONS
> STABS_DEBUG
> DWARF_DEBUG
>
> - /* Sections to be discarded */

Huh?

They're called DISCARD* ...

> DISCARDS
> /DISCARD/ : {
> *(.eh_frame)

--
Regards/Gruss,
Boris.
