[tip:x86/urgent] x86/mm: Add support for gbpages to kernel_ident_mapping_init()

From: tip-bot for Xunlei Pang
Date: Mon May 08 2017 - 04:08:59 EST


Commit-ID: 66aad4fdf2bf0af29c7decb4433dc5ec6c7c5451
Gitweb: http://git.kernel.org/tip/66aad4fdf2bf0af29c7decb4433dc5ec6c7c5451
Author: Xunlei Pang <xlpang@xxxxxxxxxx>
AuthorDate: Thu, 4 May 2017 09:42:50 +0800
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 8 May 2017 08:28:40 +0200

x86/mm: Add support for gbpages to kernel_ident_mapping_init()

Kernel identity mappings on x86-64 kernels are created in two
ways: by the early x86 boot code, or by kernel_ident_mapping_init().

Native kernels (the dominant use case) use the former, but the
kexec and hibernation code use kernel_ident_mapping_init().

There's a subtle difference between these two ways of creating
identity mappings: the current kernel_ident_mapping_init() code
always creates identity mappings using 2MB pages (PMD level), while
the native kernel boot path also utilizes gbpages where available.

This difference is suboptimal both for performance and for memory
usage: kernel_ident_mapping_init() needs to allocate pages for the
page tables when creating the new identity mappings.

This patch adds 1GB page (PUD level) support to
kernel_ident_mapping_init() to address these concerns.

The primary advantage would be better TLB coverage/performance,
because we'd utilize 1GB TLB entries instead of 2MB ones.

It is also useful for machines with large amounts of memory, because
it saves paging structure allocations when setting up identity
mappings for all of memory: around 4MB/TB with 2MB pages, but only
8KB/TB with 1GB pages.

( Note that this change alone does not activate gbpages in kexec;
we are doing that in a separate patch. )

Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Dave Young <dyoung@xxxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
Cc: akpm@xxxxxxxxxxxxxxxxxxxx
Cc: kexec@xxxxxxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/1493862171-8799-1-git-send-email-xlpang@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/boot/compressed/pagetable.c | 2 +-
arch/x86/include/asm/init.h | 3 ++-
arch/x86/kernel/machine_kexec_64.c | 2 +-
arch/x86/mm/ident_map.c | 14 +++++++++++++-
arch/x86/power/hibernate_64.c | 2 +-
5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index 56589d0..1d78f17 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -70,7 +70,7 @@ static unsigned long level4p;
* Due to relocation, pointers must be assigned at run time not build time.
*/
static struct x86_mapping_info mapping_info = {
- .pmd_flag = __PAGE_KERNEL_LARGE_EXEC,
+ .page_flag = __PAGE_KERNEL_LARGE_EXEC,
};

/* Locates and clears a region for a new top level page table. */
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 737da62..474eb8c 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -4,8 +4,9 @@
struct x86_mapping_info {
void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
void *context; /* context for alloc_pgt_page */
- unsigned long pmd_flag; /* page flag for PMD entry */
+ unsigned long page_flag; /* page flag for PMD or PUD entry */
unsigned long offset; /* ident mapping offset */
+ bool direct_gbpages; /* PUD level 1GB page support */
};

int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 085c3b3..1d4f2b0 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -113,7 +113,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
struct x86_mapping_info info = {
.alloc_pgt_page = alloc_pgt_page,
.context = image,
- .pmd_flag = __PAGE_KERNEL_LARGE_EXEC,
+ .page_flag = __PAGE_KERNEL_LARGE_EXEC,
};
unsigned long mstart, mend;
pgd_t *level4p;
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 04210a2..adab159 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -13,7 +13,7 @@ static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page,
if (pmd_present(*pmd))
continue;

- set_pmd(pmd, __pmd((addr - info->offset) | info->pmd_flag));
+ set_pmd(pmd, __pmd((addr - info->offset) | info->page_flag));
}
}

@@ -30,6 +30,18 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
if (next > end)
next = end;

+ if (info->direct_gbpages) {
+ pud_t pudval;
+
+ if (pud_present(*pud))
+ continue;
+
+ addr &= PUD_MASK;
+ pudval = __pud((addr - info->offset) | info->page_flag);
+ set_pud(pud, pudval);
+ continue;
+ }
+
if (pud_present(*pud)) {
pmd = pmd_offset(pud, 0);
ident_pmd_init(info, pmd, addr, next);
diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index 6a61194..a6e21fe 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -104,7 +104,7 @@ static int set_up_temporary_mappings(void)
{
struct x86_mapping_info info = {
.alloc_pgt_page = alloc_pgt_page,
- .pmd_flag = __PAGE_KERNEL_LARGE_EXEC,
+ .page_flag = __PAGE_KERNEL_LARGE_EXEC,
.offset = __PAGE_OFFSET,
};
unsigned long mstart, mend;